Structured Outputs

Get consistent, parseable responses from AI models. Learn JSON mode, schema validation, and techniques for reliable structured data extraction.

November 24, 2025
Tags: structured-outputs, json, schema, data-extraction, prompt-engineering

Structured outputs are consistent, machine-parseable responses from AI models, as opposed to free-form text.

Why Structured Outputs Matter

Free-form text is natural for human consumption but unreliable for automation. Structured outputs address the following problems:

| Problem | Free-Form Text | Structured Output |
| --- | --- | --- |
| Parsing | Regex, NLP, or manual extraction needed | Direct JSON parsing |
| Consistency | Format varies between generations | Schema-enforced structure |
| Validation | Ad-hoc checks for completeness | Schema validation at the field level |
| Automation | Requires human review at each step | Direct integration into pipelines |
| Typing | All values are strings | Native data types (number, boolean, array) |

Approaches to Structured Outputs

| Approach | Mechanism | Reliability |
| --- | --- | --- |
| JSON Mode | Model instructed to output valid JSON | Good (relies on prompting) |
| Schema-Constrained | Model constrained to a grammar/schema | Very high (enforced at generation level) |
| Tool/Function Calling | Output formatted as a function call | High (built into model training) |
| Post-Processing | Parse and validate after generation | Variable (depends on parser) |
| Two-Step | Generate text, then extract structure | Moderate (two chances for error) |
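
As an illustration of the post-processing approach, the sketch below pulls the first JSON object out of a reply that also contains prose. It uses only the Python standard library; the function name `extract_first_json_object` and the sample reply are made up for this example, and a real pipeline would still validate the result against a schema.

```python
import json
from typing import Any, Optional


def extract_first_json_object(text: str) -> Optional[Any]:
    """Return the first JSON object embedded in `text`, or None if none parses."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores any trailing text after that value.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # This brace did not start valid JSON; try the next one.
            start = text.find("{", start + 1)
    return None


reply = 'Sure! Here is the result: {"intent": "refund_request", "priority": "high"} Let me know.'
print(extract_first_json_object(reply))
```

How well this works depends on how messy the surrounding text is, which is why the table rates this approach as variable.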

Schema Design Principles

Be Specific: Define exact field names, types, and constraints. A precise schema produces more reliable outputs than a loose one.

Keep It Flat: Deeply nested schemas increase error rates. Flatten where possible.

Use Optional Fields: Mark fields as optional when the data may not be available, rather than relying on null sentinels.

Provide Examples: Include a sample valid output in the prompt to show the exact format expected.
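
The four principles can be sketched together as a small JSON Schema written as a Python dictionary, plus a prompt that embeds one valid example. The field names, enum values, and the `build_prompt` helper are assumptions chosen for illustration, not a required layout.

```python
import json

# Hypothetical schema for a support-ticket classifier (illustrative values).
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        # Be specific: exact names, types, and allowed values.
        "intent": {"type": "string", "enum": ["refund_request", "order_status", "other"]},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        "issue_summary": {"type": "string", "maxLength": 200},
        # Optional: omitted from "required" because it may not be available.
        "customer_id": {"type": "string"},
    },
    # Keep it flat: every field sits at the top level.
    "required": ["intent", "priority", "issue_summary"],
    "additionalProperties": False,
}

# Provide examples: one valid output shown to the model.
EXAMPLE_OUTPUT = {
    "intent": "refund_request",
    "priority": "high",
    "issue_summary": "Product arrived damaged",
}


def build_prompt(message: str) -> str:
    """Assemble a prompt containing the schema and a sample valid output."""
    return (
        "Classify the support message. Respond with JSON only.\n"
        f"JSON Schema:\n{json.dumps(TICKET_SCHEMA, indent=2)}\n"
        f"Example output:\n{json.dumps(EXAMPLE_OUTPUT)}\n"
        f"Message:\n{message}"
    )


print(build_prompt("My order arrived broken and I want my money back."))
```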

Note:

Validate everything. Always validate structured outputs against your schema before using them. Even with JSON mode or constrained generation, models can produce unexpected values.
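
A minimal sketch of such a check, using only the standard library. The `RULES` table and its field names are hypothetical; in practice a JSON Schema validator library would likely replace the hand-written loop.

```python
import json

# Hypothetical rules: field name -> (expected type, required?, allowed values or None)
RULES = {
    "intent": (str, True, {"refund_request", "order_status", "other"}),
    "priority": (str, True, {"low", "medium", "high"}),
    "customer_id": (str, False, None),
}


def validate(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the output is usable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]

    errors = []
    for name, (expected_type, required, allowed) in RULES.items():
        if name not in data:
            if required:
                errors.append(f"missing required field: {name}")
            continue
        value = data[name]
        if not isinstance(value, expected_type):
            errors.append(f"{name}: expected {expected_type.__name__}")
        elif allowed is not None and value not in allowed:
            errors.append(f"{name}: {value!r} is not an allowed value")
    # Reject fields the schema does not know about.
    for name in sorted(data.keys() - RULES.keys()):
        errors.append(f"unexpected field: {name}")
    return errors


print(validate('{"intent": "refund_request", "priority": "urgent"}'))
```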

Common Patterns

Retry with Schema: When validation fails, send the validation error back to the model and ask it to correct the output. Models can often fix their own output when shown the specific error.
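
The retry pattern might look like the sketch below. `call_model` stands in for whatever function sends a prompt to your model and returns its text, and `validate` for your own schema check; both are placeholders, as is the wording of the correction prompt.

```python
import json
from typing import Callable


def generate_with_retry(
    call_model: Callable[[str], str],
    prompt: str,
    validate: Callable[[dict], list],
    max_attempts: int = 3,
) -> dict:
    """Call the model until its output parses and passes `validate`."""
    current_prompt = prompt
    errors: list = []
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            data, errors = None, [f"invalid JSON: {exc.msg}"]
        else:
            if isinstance(data, dict):
                errors = validate(data)
            else:
                errors = ["top-level value must be a JSON object"]
        if not errors:
            return data
        # Feed the failed output and the specific errors back to the model.
        error_list = "\n".join(f"- {e}" for e in errors)
        current_prompt = (
            f"{prompt}\n\n"
            f"Your previous output was:\n{raw}\n\n"
            f"It failed validation:\n{error_list}\n\n"
            "Return only the corrected JSON."
        )
    raise ValueError(f"no valid output after {max_attempts} attempts: {errors}")
```

Capping the number of attempts keeps a persistently failing request from looping forever; the right limit depends on latency and cost constraints.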

Streaming Parsers: For real-time applications, use a streaming JSON parser that can handle partial outputs, so you can begin processing before the full response arrives.
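
One simple way to handle partial output, sketched here with the standard library only, is to close whatever strings and brackets are still open and try to parse the result. The helper name `parse_partial_json` and the sample chunks are illustrative; dedicated streaming parsers are more thorough, and values read from an unfinished buffer (such as a truncated string) should be treated as provisional.

```python
import json
from typing import Any, Optional


def parse_partial_json(buffer: str) -> Optional[Any]:
    """Best-effort parse of a JSON prefix by closing whatever is still open."""
    closers = []        # closing brackets still owed, innermost last
    in_string = False
    escaped = False
    for ch in buffer:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()

    candidate = buffer
    if in_string:
        if escaped:
            candidate = candidate[:-1]  # drop a dangling backslash
        candidate += '"'
    candidate += "".join(reversed(closers))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        # The prefix ends mid-token, e.g. right after a key or a comma.
        return None


# Simulated stream: keep the most recent snapshot that parsed.
chunks = ['{"intent": "refund', '_request", "prio', 'rity": "high"}']
buffer = ""
latest = None
for chunk in chunks:
    buffer += chunk
    parsed = parse_partial_json(buffer)
    if parsed is not None:
        latest = parsed
print(latest)
```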

Hybrid Approach: Generate structured data for automation while keeping a free-text field for human-readable explanations alongside the structured fields.
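
A small sketch of the hybrid approach: the structured fields go to automation and the free-text field goes to a human-facing log. The field name `explanation` and the sample payload are assumptions for this example.

```python
import json

raw = (
    '{"priority": "high", "suggested_action": "initiate_refund", '
    '"explanation": "The customer reports damage on arrival and asks for a refund."}'
)


def split_hybrid(raw_json: str, text_field: str = "explanation") -> tuple[dict, str]:
    """Separate the machine-readable fields from the free-text field."""
    data = json.loads(raw_json)
    explanation = data.pop(text_field, "")
    return data, explanation


structured, note = split_hybrid(raw)
print(structured)  # passed on to the pipeline
print(note)        # shown to a human reviewer
```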

Real-World Examples

Customer Support Classification

{
  "intent": "refund_request",
  "sentiment": "frustrated",
  "priority": "high",
  "customer_id": "CUST-12345",
  "issue_summary": "Product arrived damaged",
  "suggested_action": "initiate_refund",
  "response_tone": "empathetic"
}

Schema: intent (enum), sentiment (enum), priority (enum), customer_id (string), issue_summary (string, max 200 chars), suggested_action (enum), response_tone (enum)
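
This schema could be enforced in application code roughly as follows. The allowed values in `ALLOWED` are invented for illustration (the example above shows only one value per field), and the class and function names are hypothetical.

```python
import json
from dataclasses import dataclass

# Assumed enum values; only the first value in each set appears in the example.
ALLOWED = {
    "intent": {"refund_request", "order_status", "complaint"},
    "sentiment": {"frustrated", "neutral", "positive"},
    "priority": {"high", "medium", "low"},
    "suggested_action": {"initiate_refund", "escalate", "reply"},
    "response_tone": {"empathetic", "neutral", "formal"},
}


@dataclass(frozen=True)
class TicketClassification:
    intent: str
    sentiment: str
    priority: str
    customer_id: str
    issue_summary: str
    suggested_action: str
    response_tone: str

    def __post_init__(self) -> None:
        # Enum fields must be strings drawn from the allowed sets.
        for name, allowed in ALLOWED.items():
            value = getattr(self, name)
            if not isinstance(value, str) or value not in allowed:
                raise ValueError(f"{name}: {value!r} is not an allowed value")
        if not isinstance(self.customer_id, str):
            raise ValueError("customer_id must be a string")
        if not isinstance(self.issue_summary, str) or len(self.issue_summary) > 200:
            raise ValueError("issue_summary must be a string of at most 200 characters")


def parse_ticket(raw: str) -> TicketClassification:
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")
    try:
        return TicketClassification(**data)
    except TypeError as exc:
        # Raised by the dataclass constructor for missing or unexpected fields.
        raise ValueError(f"missing or unexpected field: {exc}") from exc
```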

Data Extraction from Documents

{
  "invoice_number": "INV-2024-0789",
  "vendor": "Acme Corp",
  "date": "2024-03-15",
  "line_items": [
    {"description": "Widget A", "quantity": 10, "unit_price": 25.00, "total": 250.00}
  ],
  "subtotal": 250.00,
  "tax": 20.00,
  "total": 270.00,
  "currency": "USD"
}

Schema (key fields): invoice_number (string), line_items (array of objects with type-constrained fields), subtotal, tax, and total (numbers with a defined precision)
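
For monetary fields, one option is to parse JSON numbers as `Decimal` rather than binary floats, so that precision can be checked exactly. The sketch below assumes two decimal places, which the example values suggest but the schema does not state; `has_cent_precision` is a hypothetical helper.

```python
import json
from decimal import Decimal

raw = """{
  "invoice_number": "INV-2024-0789",
  "line_items": [
    {"description": "Widget A", "quantity": 10, "unit_price": 25.00, "total": 250.00}
  ],
  "subtotal": 250.00,
  "tax": 20.00,
  "total": 270.00
}"""

# parse_float makes every JSON number with a decimal point a Decimal.
invoice = json.loads(raw, parse_float=Decimal)

CENT = Decimal("0.01")  # assumed precision: two decimal places


def has_cent_precision(value) -> bool:
    """True if value is a number with at most two decimal places."""
    if isinstance(value, bool) or not isinstance(value, (int, Decimal)):
        return False
    amount = Decimal(value)
    return amount == amount.quantize(CENT)


for name in ("subtotal", "tax", "total"):
    print(name, invoice[name], has_cent_precision(invoice[name]))

# Each line total should equal quantity times unit price, exactly.
for item in invoice["line_items"]:
    print(item["description"], item["quantity"] * item["unit_price"] == item["total"])
```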

Validation Strategies

| Strategy | When to Use | Example |
| --- | --- | --- |
| Schema Validation | Always | Validate against JSON Schema before using data |
| Type Checking | After parsing | Ensure number fields are actually numbers, not strings |
| Range Validation | Numeric fields | quantity between 0 and 10000, price greater than 0 |
| Enum Checking | Categorical fields | Verify value is in the allowed set |
| Cross-Field Validation | Related fields | subtotal + tax == total |
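
The strategies in the table can be combined into a single pass over a parsed invoice, sketched below. The currency set and the rounding tolerance are assumptions; the quantity and price bounds come from the table.

```python
import math

ALLOWED_CURRENCIES = {"USD", "EUR", "GBP"}  # assumed set, for illustration


def is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_invoice(inv: dict) -> list[str]:
    errors = []

    # Type checking: totals must be numbers, not strings.
    totals_ok = True
    for name in ("subtotal", "tax", "total"):
        if not is_number(inv.get(name)):
            errors.append(f"{name}: expected a number")
            totals_ok = False

    # Enum checking: currency must be in the allowed set.
    currency = inv.get("currency")
    if not isinstance(currency, str) or currency not in ALLOWED_CURRENCIES:
        errors.append(f"currency: {currency!r} is not allowed")

    # Range validation on each line item.
    items = inv.get("line_items")
    for i, item in enumerate(items if isinstance(items, list) else []):
        qty = item.get("quantity") if isinstance(item, dict) else None
        price = item.get("unit_price") if isinstance(item, dict) else None
        if not is_number(qty) or not 0 <= qty <= 10000:
            errors.append(f"line_items[{i}].quantity: must be between 0 and 10000")
        if not is_number(price) or price <= 0:
            errors.append(f"line_items[{i}].unit_price: must be greater than 0")

    # Cross-field validation, with a small tolerance for float rounding.
    if totals_ok and not math.isclose(
        inv["subtotal"] + inv["tax"], inv["total"], abs_tol=0.005
    ):
        errors.append("subtotal + tax does not equal total")
    return errors
```

Returning every error at once, rather than stopping at the first, gives a complete picture when the result is logged or sent back to the model.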

Topics in This Section

  • JSON Mode - Using JSON mode for reliable, parseable AI outputs with schema definition, validation, and error handling patterns