GPT Codex · Intermediate · 4 min read

Codex Task Design — Write Prompts Like Issues, Not Wishes

Size Codex tasks correctly, reduce rework, and get cleaner diffs by using issue-shaped prompts with clear constraints and done conditions.

Tags: prompting · planning · decomposition · workflow

Official References: Best Practices · How OpenAI uses Codex · Subagents

Curriculum path

  1. Codex Getting Started — basic success loop
  2. Codex Instructions — persistent repo rules
  3. Codex Sandboxing — execution boundaries
  4. Codex Task Design — scope, prompts, and done conditions ← You are here
  5. Codex Skills — package repeated workflows
  6. Codex MCP — external context and tooling
  7. Codex Reviews and Automations — repeatable engineering loops


The core rule: remove guessing

Codex is strong at execution. It gets weaker when you force it to infer what success means.

A good prompt is not poetic. It is specific enough to leave the model little room to invent the task.

The simplest useful structure

Codex best practices recommend issue-shaped prompts.

Goal:
What should change?
 
Context:
Which files, errors, or existing patterns matter?
 
Constraints:
What must not break or change?
 
Done when:
What proves the task is finished?

This is deceptively simple. Without Goal, there is no target. Without Done when, there is no finish line.
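One lightweight way to keep the structure honest is to assemble the prompt from its four parts, so a missing section fails loudly before the task is ever sent. A minimal sketch — the helper name and shape are illustrative, not part of any Codex API:

```python
def build_task_prompt(goal: str, context: list[str],
                      constraints: list[str], done_when: list[str]) -> str:
    """Assemble an issue-shaped prompt; refuse to build one with a missing section."""
    sections = {"Goal": [goal], "Context": context,
                "Constraints": constraints, "Done when": done_when}
    for name, items in sections.items():
        if not items or not all(item.strip() for item in items):
            raise ValueError(f"Missing '{name}' section: the model would have to guess it.")
    parts = []
    for name, items in sections.items():
        # Goal is a single sentence; the other sections render as bullet lists.
        body = goal if name == "Goal" else "\n".join(f"- {item}" for item in items)
        parts.append(f"{name}:\n{body}")
    return "\n\n".join(parts)
```

The point is not the code itself but the forcing function: an empty Goal or Done when is rejected instead of silently shipped.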

Bad prompt vs better prompt

Too vague:

improve the cart flow

Much better:

Goal:
Add optimistic updates to the cart drawer quantity control.
 
Context:
- UI lives in src/components/cart/
- server action lives in src/app/actions/cart.ts
- follow the loading pattern already used in wishlist
 
Constraints:
- do not change the public API shape
- do not add a new state library
 
Done when:
- quantity updates feel immediate
- failures roll back cleanly
- lint, build, and relevant tests pass

Size tasks around one verification loop

A practical sizing rule is not “one feature” but one reviewable verification loop.

Good tasks:

  • a refactor inside one bounded folder
  • a reproducible bug fix
  • docs updates tied to a specific API change

Bad tasks:

  • redesign auth, onboarding, and billing together
  • modernize the whole app
  • clean all technical debt

If one lint/build/test loop cannot tell you whether the task is healthy, the task is probably too broad.
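The sizing rule above can be made concrete as a single stop-on-first-failure loop. A sketch, assuming a Node-style project; the npm commands in the comment are placeholders for whatever your repo actually runs:

```python
import subprocess

def verification_loop(commands: list[list[str]]) -> tuple[bool, str]:
    """Run lint/build/test in order; stop at the first failing step.

    Returns (healthy, failed_command_or_empty).
    """
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            return False, " ".join(cmd)
    return True, ""

# Hypothetical commands for a Node project -- substitute your own:
# verification_loop([["npm", "run", "lint"], ["npm", "run", "build"], ["npm", "test"]])
```

If this one loop cannot give you a clear healthy/unhealthy verdict for the change, that is itself the signal the task should be split.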

Some work should start with a plan only

Ask for a plan first when:

  • the change crosses multiple domains
  • architecture decisions matter more than coding speed
  • rollback would be expensive
  • there are multiple reasonable implementation shapes

A useful opener is:

Do not implement yet. Propose a plan first.
Include risks, touched files, and verification steps.

Local examples beat abstract advice

Codex responds much better to “follow this file” than to “use our usual architecture.”

Better anchors:

  • “follow the pattern in src/features/profile/actions.ts”
  • “match the format already used in docs/releases/”
  • “reuse the Button variant naming pattern”

The more concrete the reference point, the less Codex has to guess.

Constraints are a quality tool

Constraints are not friction. They are direction.

High-value constraints include:

  • do not add dependencies
  • preserve the public API shape
  • stay inside one directory
  • do not touch generated files
  • keep the diff small

When to move beyond a plain prompt

If the same kind of task repeats, a better surface may exist:

  • repeated workflow → Skills
  • outside context needed → MCP
  • parallel exploration or implementation helps → ask for subagents explicitly

The Subagents docs note that Codex only spawns subagents when you ask it to, and they inherit the current sandbox policy. That means parallelism is powerful, but not free.

Use Best-of-N for uncertain problems

OpenAI highlights Best-of-N for harder work: run the same task several times and pick the strongest result. It helps when:

  • naming or structure matters a lot
  • several solutions look reasonable
  • you want to compare options before merging

Fast preflight checklist

Before you send a non-trivial task, check these:

  • Is the goal singular?
  • Did you name relevant files or folders?
  • Did you list the important constraints?
  • Did you define success in observable terms?
  • Is the change small enough for one review loop?
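The checklist can even be run mechanically over a drafted prompt before you send it. A rough sketch — the heuristics (section markers, a path-like token as a proxy for named files) are illustrative, not an official check:

```python
def preflight(prompt: str) -> list[str]:
    """Return the checklist items a drafted task prompt appears to miss."""
    problems = []
    if "Goal:" not in prompt:
        problems.append("no singular goal stated")
    if "/" not in prompt:  # crude proxy for a named file or folder
        problems.append("no relevant files or folders named")
    if "Constraints:" not in prompt:
        problems.append("no constraints listed")
    if "Done when:" not in prompt:
        problems.append("success not defined in observable terms")
    return problems
```

A vague prompt like "improve the cart flow" fails every check; the issue-shaped example earlier in this guide passes them all.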

Great Codex task design is not a hidden prompt trick. It is the discipline of making the task clear even for another engineer.
