Plan-and-Solve: Two-Stage Decomposition

Plan first, then execute. Address CoT's missing-step and calculation errors with a structured two-phase approach that outperforms zero-shot chain-of-thought across 10 datasets.

June 11, 2026
plan-and-solve, reasoning, chain-of-thought, zero-shot, decomposition, prompt-engineering

The Core Idea

Plan-and-Solve (Wang et al. 2023) fixes a fundamental weakness of zero-shot CoT: the model skips steps and makes calculation errors because it thinks and executes simultaneously. The fix is separation — plan the whole solution first, then execute each step with the plan as a guide.

Zero-Shot CoT: "Step 1, Step 2, Step 3..."
               → Missing steps, calculation errors, semantic confusion

Plan-and-Solve: Phase 1: "Here's my plan: ..."
                Phase 2: "Step 1: ... Step 2: ... (following the plan)"
               → Fewer missing steps, better organization

The Three CoT Failure Modes

Plan-and-Solve starts from three failure patterns observed in zero-shot CoT:

| Failure Mode | What Happens | Example |
|---|---|---|
| Missing-step errors | Model jumps from problem to answer, skipping intermediate reasoning | "A train leaves at 3pm traveling 60mph. It's 4pm now. Distance = 60 miles." (the elapsed-time step, 4pm - 3pm = 1 hour, is never stated, so the same shortcut fails on harder inputs) |
| Calculation errors | Model does arithmetic wrong during free-form reasoning | "48 × 37 = 1,764" (the correct product is 1,776, so this is off by 12) |
| Semantic misunderstanding | Model misinterprets the problem structure | Treating a comparison problem as a counting problem |

PS targets missing-step errors. PS+ adds more detailed instructions aimed at calculation errors and at the overall quality of the reasoning steps. Neither variant targets semantic misunderstanding as directly; the variable-extraction and commonsense reminders in PS+ help only indirectly.

Prompt Templates

PS Prompt (Plan-and-Solve)

Q: {problem}

Let's first understand the problem and devise a plan to solve the problem.
Then, let's carry out the plan to solve the problem step by step.

PS+ Prompt (Plan-and-Solve with Extra Guidance)

Q: {problem}

Let's first understand the problem, extract relevant variables
and their corresponding numerals, and devise a plan.
Then, let's carry out the plan, calculate intermediate variables
(pay attention to correct numeral calculation and commonsense),
solve the problem step by step, and show the answer.

The PS+ variant adds three guardrails:

  • Extract variables explicitly — prevents overlooking numbers in the problem
  • Calculate intermediate variables — forces showing work, reducing mental arithmetic errors
  • Pay attention to commonsense — reminds the model to sanity-check results
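The two templates above differ only in the trigger text that follows the question, so a small helper can build either one. This is a minimal sketch: the names `PS_TRIGGER`, `PS_PLUS_TRIGGER`, and `build_prompt` are hypothetical, and the trigger wording is copied from the templates in this section.

```python
# Trigger sentences copied from the PS and PS+ templates in this article.
PS_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve the problem.\n"
    "Then, let's carry out the plan to solve the problem step by step."
)

PS_PLUS_TRIGGER = (
    "Let's first understand the problem, extract relevant variables\n"
    "and their corresponding numerals, and devise a plan.\n"
    "Then, let's carry out the plan, calculate intermediate variables\n"
    "(pay attention to correct numeral calculation and commonsense),\n"
    "solve the problem step by step, and show the answer."
)


def build_prompt(problem: str, plus: bool = False) -> str:
    """Return a single-call PS (or PS+ when plus=True) prompt for the problem."""
    trigger = PS_PLUS_TRIGGER if plus else PS_TRIGGER
    return f"Q: {problem.strip()}\n\n{trigger}"


if __name__ == "__main__":
    print(build_prompt("John has 5 boxes with 12 apples each. How many apples?", plus=True))
```

Keeping the triggers as module-level constants makes it easy to A/B test alternative wordings without touching the calling code.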

PS vs PS+ vs CoT: An Example

Problem: John has 5 boxes. Each box contains 12 apples.
         He gives away 3 boxes. How many apples does he have left?

┌─────────────────────────────────────────────────────────────┐
│ Zero-Shot CoT                                                │
│ "Let's think step by step. John had 5 boxes × 12 = 60       │
│  apples. He gave away 3 boxes, so he has 2 × 12 = 24."      │
│ → Answer: 24 ✓ (but 5 - 3 = 2 boxes is never stated)         │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ Plan-and-Solve (PS)                                          │
│ "Let's first understand the problem and devise a plan...     │
│                                                              │
│ Plan:                                                        │
│ 1. Calculate total apples John starts with                   │
│ 2. Calculate boxes remaining after giving away               │
│ 3. Calculate apples in remaining boxes                       │
│                                                              │
│ Step 1: 5 boxes × 12 apples per box = 60 total apples       │
│ Step 2: 5 boxes - 3 boxes = 2 boxes remaining               │
│ Step 3: 2 boxes × 12 apples = 24 apples                     │
│ Answer: 24 ✓"                                                │
└─────────────────────────────────────────────────────────────┘

The plan makes the reasoning auditable. You can verify each step independently.
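Because each executed step states its own arithmetic, the checking can be partly automated. The sketch below assumes steps are written one per line as simple binary expressions such as "5 boxes × 12 apples = 60"; the regular expression and the `check_steps` helper are illustrative assumptions, not part of Plan-and-Solve itself, and they will miss anything more complex than one operator per line.

```python
import re
from typing import List, Tuple

# A number with optional thousands separators and decimals, e.g. 1,764 or 3.5.
NUM = r"\d[\d,]*(?:\.\d+)?"

# left operand, any unit words, one operator, right operand, unit words, "=", result.
STEP_RE = re.compile(
    rf"({NUM})[^\d+\-×*/=]*([+\-×*/])\s*({NUM})[^=\d]*=\s*({NUM})"
)

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "×": lambda a, b: a * b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def _to_number(text: str) -> float:
    return float(text.replace(",", ""))


def check_steps(solution: str) -> List[Tuple[str, bool]]:
    """Return (line, is_correct) for every line containing a checkable step."""
    results = []
    for line in solution.splitlines():
        # Drop a leading "Step N:" label so N is not mistaken for an operand.
        body = re.sub(r"^\s*Step\s+\d+\s*:\s*", "", line)
        match = STEP_RE.search(body)
        if not match:
            continue
        left, op, right, claimed = match.groups()
        if op == "/" and _to_number(right) == 0:
            continue  # cannot verify a division by zero
        actual = OPS[op](_to_number(left), _to_number(right))
        results.append((line.strip(), abs(actual - _to_number(claimed)) < 1e-9))
    return results
```

Run on the three PS steps in the example, every step is reported as correct; run on "48 × 37 = 1,764", the step is flagged as wrong.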

Implementation

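def plan_and_solve(llm, problem: str, use_ps_plus: bool = False) -> dict:
    """Two-stage: plan first, then execute.

    `llm` is any object exposing generate(prompt: str) -> str.
    """

    # Phase 1: Plan
    if use_ps_plus:
        plan_prompt = f"""Q: {problem}

Let's first understand the problem, extract relevant variables
and their corresponding numerals, and devise a plan."""
        # PS+ keeps its extra guardrails in the execution phase too.
        execute_instruction = """Then, let's carry out the plan, calculate intermediate variables
(pay attention to correct numeral calculation and commonsense),
solve the problem step by step, and show the answer."""
    else:
        plan_prompt = f"""Q: {problem}

Let's first understand the problem and devise a plan
to solve the problem."""
        execute_instruction = """Then, let's carry out the plan, solve the problem step by step,
and show the answer."""

    plan = llm.generate(plan_prompt)

    # Phase 2: Execute the plan, with the plan included as context
    solve_prompt = f"""Q: {problem}

{plan}

{execute_instruction}"""
    solution = llm.generate(solve_prompt)

    return {
        "plan": plan,
        "solution": solution,
        "variant": "PS+" if use_ps_plus else "PS"
    }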
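The `solution` field is free text, so scoring it against a reference requires pulling out the final number. The sketch below is one heuristic way to do that and is an assumption of this write-up: the hypothetical `extract_answer` prefers a number that follows the word "answer" and otherwise falls back to the last number in the text.

```python
import re
from typing import Optional

# A number with optional sign, thousands separators, and decimals.
NUMBER = r"-?\d[\d,]*(?:\.\d+)?"
NUMBER_RE = re.compile(NUMBER)
# "Answer: 24", "the answer is 24", "answer $24" and similar.
MARKED_RE = re.compile(rf"answer\s*(?:is|:)?\s*\$?\s*({NUMBER})", re.IGNORECASE)


def extract_answer(solution: str) -> Optional[float]:
    """Return the final numeric answer in a solution, or None if there is none."""
    marked = list(MARKED_RE.finditer(solution))
    if marked:
        raw = marked[-1].group(1)  # last explicitly marked answer wins
    else:
        numbers = NUMBER_RE.findall(solution)
        if not numbers:
            return None
        raw = numbers[-1]  # fallback: last number mentioned
    return float(raw.replace(",", ""))
```

The fallback is deliberately crude. If your model can be instructed to end with a fixed "Answer: N" line, the marked branch alone is more reliable.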

Handling Plan Failures

Plans aren't perfect. When the generated plan is wrong, you need a fallback.

Replanning Trigger

def plan_with_fallback(llm, problem: str):
    """Plan, execute, and replan once if execution reveals plan flaws."""
    # plan_and_solve returns a dict, so read the fields by key.
    result = plan_and_solve(llm, problem)
    plan, solution = result["plan"], result["solution"]

    # Heuristic trigger: the model explicitly rejects the plan, or it
    # backtracks ("actually") more than once during execution.
    text = solution.lower()
    if "i realize the plan is wrong" in text or text.count("actually") > 1:
        replan_prompt = f"""The original plan for this problem had issues.
Problem: {problem}
Original plan: {plan}
Issue found: The plan doesn't account for all constraints.

Create a corrected plan and solve again:"""
        result = plan_and_solve(llm, replan_prompt)
        plan, solution = result["plan"], result["solution"]

    return plan, solution

Common Plan Failures

| Failure | Symptom | Fix |
|---|---|---|
| Missing constraint | Plan has N steps but problem has N+1 requirements | PS+ with explicit variable extraction |
| Wrong order | Plan puts dependent steps in wrong sequence | Ask "Does step K depend on step J? If so, reorder." |
| Overly vague | "Solve the problem" as a plan step | Request "specific, numbered steps with sub-goals" |
| Circular plan | Plan references outputs that haven't been computed | Add "Verify each step's prerequisites are met" |
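Some of these failures can be caught with cheap string checks before spending a second call on execution. The following is a rough sketch under stated assumptions: plans are numbered lists ("1. ...", "Step 2: ..."), the hypothetical `lint_plan` and its `VAGUE_PATTERNS` list are illustrative only, and missing constraints or subtle ordering problems still need a model or a human to spot.

```python
import re
from typing import List

# Steps that merely restate the task (illustrative patterns, extend as needed).
VAGUE_PATTERNS = (
    r"^solve (the|this) problem\.?$",
    r"^find the answer\.?$",
    r"^do the (math|calculations?)\.?$",
)

# Matches "1. text", "2) text", or "Step 3: text" at the start of a line.
STEP_LINE_RE = re.compile(
    r"^[ \t]*(?:step[ \t]*)?(\d+)[.):][ \t]*(.+)$",
    flags=re.IGNORECASE | re.MULTILINE,
)


def lint_plan(plan: str, min_steps: int = 2) -> List[str]:
    """Return a list of human-readable issues found in a numbered plan."""
    steps = STEP_LINE_RE.findall(plan)
    issues = []
    if len(steps) < min_steps:
        issues.append(f"plan has {len(steps)} numbered step(s), expected at least {min_steps}")
    for number, text in steps:
        if any(re.match(p, text.strip(), re.IGNORECASE) for p in VAGUE_PATTERNS):
            issues.append(f"step {number} is too vague: {text.strip()!r}")
        # A step that mentions itself or a later step depends on output not yet computed.
        for ref in re.findall(r"\bstep\s+(\d+)\b", text, re.IGNORECASE):
            if int(ref) >= int(number):
                issues.append(f"step {number} refers to step {ref}, which has not run yet")
    return issues
```

An empty list means only that none of these specific checks fired, not that the plan is correct.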

When Plan-and-Solve Wins

Strongest on:

  • Multi-step math word problems (GSM8K, SVAMP, MultiArith)
  • Symbolic reasoning with many sequential operations
  • Long-form generation where structure prevents rambling
  • Tasks where CoT consistently misses intermediate steps

No advantage on:

  • Single-step problems (classification, factual lookup)
  • Tasks where the reasoning is trivial and decomposition adds overhead
  • Creative generation where rigid planning kills fluidity

Plan-and-Solve vs. Other Techniques

| Technique | Structure | Zero-Shot? | Key Strength | Key Weakness |
|---|---|---|---|---|
| Standard CoT | Linear chain | Yes (zero-shot) / No (few-shot) | Simple, universal | Missing steps, calc errors |
| Plan-and-Solve | Plan → Execute | Yes | Structured, auditable | Plan rigidity |
| Least-to-Most | Decompose → Solve sequentially | No (needs exemplars) | Harder-than-exemplar generalization | Decomposition can fail |
| Tree-of-Thought | Branch → Evaluate → Prune | No (needs evaluator) | Explores alternatives | High cost, needs good scorer |

Plan-and-Solve vs. Least-to-Most

Both decompose problems, but the decomposition strategy differs:

| Aspect | Plan-and-Solve | Least-to-Most |
|---|---|---|
| Decomposition style | Top-down plan, then execute | Bottom-up: easiest subproblem first |
| Exemplar dependency | Zero-shot works | Few-shot (needs decomposition examples) |
| Problem scope | Fixed-complexity problems | Problems harder than the prompt exemplars |
| Cost | 2 LLM calls per problem (as implemented here) | 1 decomposition call plus one call per subproblem |
| Best for | Math reasoning, structured tasks | SCAN, compositional generalization |

Production Integration

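LangChain's Plan-and-Execute agents draw on Plan-and-Solve. In practice, you can use the function above as a near drop-in replacement for zero-shot CoT; the only difference is that it returns a dict, so read the text from the "solution" key:

# Replace this:
response = llm.generate(f"{problem}\nLet's think step by step.")

# With this (plan_and_solve returns a dict; "solution" holds the final text):
response = plan_and_solve(llm, problem, use_ps_plus=True)["solution"]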

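The tradeoff: this two-call version uses roughly 2x the tokens of CoT, because the plan is generated in one call and then sent back, together with the problem, in the execution call. On multi-step problems the accuracy gain typically justifies the cost; on single-step problems it does not.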
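To check the overhead on your own workload rather than rely on a rule of thumb, you can wrap the model in a counter. This is a sketch only: `CountingLLM` and `two_stage_overhead` are hypothetical names, whitespace-separated words stand in for real tokenizer counts, and `generate_fn` is assumed to be any function that maps a prompt string to a completion string.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CountingLLM:
    """Wraps a prompt -> completion function and tallies calls and rough token use."""
    generate_fn: Callable[[str], str]
    calls: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0

    def generate(self, prompt: str) -> str:
        completion = self.generate_fn(prompt)
        self.calls += 1
        # Whitespace splitting is a crude proxy for a real tokenizer.
        self.prompt_tokens += len(prompt.split())
        self.completion_tokens += len(completion.split())
        return completion

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


def two_stage_overhead(problem: str, generate_fn: Callable[[str], str]) -> float:
    """Return (two-stage PS tokens) / (zero-shot CoT tokens) for one problem."""
    cot = CountingLLM(generate_fn)
    cot.generate(f"Q: {problem}\nLet's think step by step.")

    ps = CountingLLM(generate_fn)
    plan = ps.generate(
        f"Q: {problem}\nLet's first understand the problem and devise a plan."
    )
    # The execution call pays for the problem and the plan a second time.
    ps.generate(f"Q: {problem}\n{plan}\nThen, let's carry out the plan step by step.")

    return ps.total_tokens / cot.total_tokens
```

Averaging this ratio over a sample of real problems gives a more honest cost estimate than a fixed multiplier, since plan length varies widely with problem difficulty.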