Module 5 · 30 min read · Building with AI APIs
Structured Outputs
Free-form text output breaks production pipelines. When your application needs to parse the model's response — extract fields, populate a database, drive UI components, or chain into another system — you need structured, reliable output. This module covers how to guarantee the model returns data in the exact shape your code expects.
Why unstructured output breaks production apps
Asking a model to "return JSON" with a plain text instruction is fragile. The model might prepend explanation text ("Here is the JSON you requested:"), wrap it in a markdown code fence, include trailing commentary, use slightly different field names than specified, or sometimes just refuse and return prose. Any of these breaks a json.loads() call.
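To make these failure modes concrete, the sketch below feeds a few typical response shapes to json.loads(). The sample strings are invented for illustration; they are not captured model output.

```python
import json

FENCE = "`" * 3  # built at runtime so the sample can contain a markdown fence

# Hypothetical response texts, one per failure mode described above.
SAMPLES = {
    "clean": '{"company_name": "Apple Inc."}',
    "preamble": 'Here is the JSON you requested:\n{"company_name": "Apple Inc."}',
    "code_fence": FENCE + 'json\n{"company_name": "Apple Inc."}\n' + FENCE,
    "trailing_text": '{"company_name": "Apple Inc."}\nHope that helps!',
}


def parses_cleanly(text: str) -> bool:
    """Return True if json.loads() accepts the text as-is."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


results = {name: parses_cleanly(text) for name, text in SAMPLES.items()}
print(results)
```

Only the "clean" sample parses. The other three contain a perfectly good JSON object, but the surrounding text makes json.loads() raise.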
The failure modes are insidious because they often work 95% of the time in testing and break on edge cases or slightly different phrasings in production. A production-grade structured output system needs to be close to 100% reliable — not "usually works."
The naive approach fails silently
The pattern of "tell the model to return JSON in the prompt and hope it does" produces code that works in demos and breaks in production. A single malformed response can crash an automated pipeline, corrupt a database, or silently return null values that propagate downstream.
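The silent variant is worth seeing once. In this minimal sketch the response is valid JSON, but the (hypothetical) model chose slightly different field names, and lenient dictionary access turns that into None values instead of an error. The require() helper is an illustrative name, not part of any library.

```python
import json

# Hypothetical response: valid JSON, but "company"/"revenue" instead of
# the requested "company_name"/"revenue_usd".
raw = '{"company": "Apple Inc.", "revenue": 95400000000}'
record = json.loads(raw)

# .get() hides the mismatch: no exception, just None flowing downstream.
row = {
    "company_name": record.get("company_name"),
    "revenue_usd": record.get("revenue_usd"),
}
print(row)  # {'company_name': None, 'revenue_usd': None}


def require(record: dict, key: str):
    """Fail loudly when an expected field is absent."""
    if key not in record:
        raise KeyError(f"missing field {key!r}; got {sorted(record)}")
    return record[key]
```

Calling require(record, "revenue_usd") raises immediately, which is usually preferable to writing a row of nulls.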
JSON mode: a safety net
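The simplest structured output technique is JSON mode. Pass response_format={"type": "json_object"} to OpenAI's Chat Completions API and the model is constrained to emit syntactically valid JSON. You still need to say in the prompt which fields you want, and the API requires the word "JSON" to appear somewhere in the messages. The parse succeeds as long as generation is not cut off: a response truncated by the token limit (finish_reason of "length") can end mid-object.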
from openai import OpenAI
import json

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": (
                "Extract the requested information and return ONLY a JSON object. "
                "Do not include any other text."
            ),
        },
        {
            "role": "user",
            "content": (
                "Extract: company_name, revenue_usd (number), fiscal_quarter (string), "
                "year (integer) from this text:\n\n"
                "Apple Inc. reported Q2 2025 revenue of $95.4 billion."
            ),
        },
    ],
    response_format={"type": "json_object"},  # JSON mode: output is constrained to valid JSON
    temperature=0,
)

# JSON mode output parses unless generation was cut off by the token limit
choice = response.choices[0]
if choice.finish_reason == "length":
    raise RuntimeError("Response was truncated; the JSON may be incomplete")
data = json.loads(choice.message.content)
print(data)
# Example result (shown as JSON):
# {"company_name": "Apple Inc.", "revenue_usd": 95400000000, "fiscal_quarter": "Q2", "year": 2025}
JSON mode guarantees syntactically valid JSON but not semantic correctness — the model can still produce wrong field names, wrong types, or hallucinate values. For full schema enforcement, use structured outputs with a schema.
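With JSON mode alone, the field-level checks are your job. A minimal sketch of such a check, assuming the four fields requested in the prompt above; check_fields and EXPECTED are illustrative names, and a real pipeline would more likely use a validation library.

```python
import json

# Field name -> accepted Python type(s), mirroring the fields in the prompt.
EXPECTED = {
    "company_name": str,
    "revenue_usd": (int, float),
    "fiscal_quarter": str,
    "year": int,
}


def check_fields(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the payload looks right."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = []
    for name, expected_type in EXPECTED.items():
        if name not in data:
            problems.append(f"missing: {name}")
        # bool is a subclass of int in Python, so reject it explicitly
        elif isinstance(data[name], bool) or not isinstance(data[name], expected_type):
            problems.append(f"wrong type: {name}")
    return problems


good = '{"company_name": "Apple Inc.", "revenue_usd": 95400000000, "fiscal_quarter": "Q2", "year": 2025}'
bad = '{"company": "Apple Inc.", "revenue_usd": "95.4 billion", "fiscal_quarter": "Q2", "year": 2025}'
print(check_fields(good))  # []
print(check_fields(bad))   # ['missing: company_name', 'wrong type: revenue_usd']
```

Both payloads are valid JSON, so JSON mode would accept either. Only the explicit check separates them.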
Structured outputs with JSON schema (OpenAI)
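OpenAI's structured outputs feature takes this further: you provide a JSON Schema describing the exact structure you want, and the decoding algorithm constrains the model to output that matches the schema. This is enforced at the token level rather than merely requested, so any complete response conforms to the schema. Two cases still need handling: the model can refuse (reported in the message's refusal field), and output cut off by a token limit can be incomplete.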
from openai import OpenAI
import json

client = OpenAI()

# Define the exact schema for the output
extraction_schema = {
    "type": "object",
    "properties": {
        "company_name": {
            "type": "string",
            "description": "Full legal name of the company"
        },
        "revenue_usd": {
            "type": "number",
            "description": "Revenue in whole US dollars (not billions)"
        },
        "fiscal_quarter": {
            "type": "string",
            "enum": ["Q1", "Q2", "Q3", "Q4"],
            "description": "Fiscal quarter"
        },
        "year": {
            "type": "integer",
            "description": "Fiscal year as 4-digit integer"
        },
        "confidence": {
            "type": "string",
            "enum": ["high", "medium", "low"],
            "description": "Confidence level in the extraction"
        }
    },
    "required": ["company_name", "revenue_usd", "fiscal_quarter", "year", "confidence"],
    "additionalProperties": False
}

response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",  # structured outputs require this model or later
    messages=[
        {"role": "system", "content": "Extract financial data from the provided text."},
        {"role": "user", "content": "Apple Inc. reported Q2 2025 revenue of $95.4 billion."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "financial_extraction",
            "schema": extraction_schema,
            "strict": True  # enforce schema strictly
        }
    },
    temperature=0
)

data = json.loads(response.choices[0].message.content)
print(data["company_name"])    # "Apple Inc."
print(data["fiscal_quarter"])  # always one of: Q1, Q2, Q3, Q4 (enum enforced)
print(data["confidence"])      # always one of: high, medium, low
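Strict mode places requirements on the schema itself: every object must list all of its properties under "required" and set "additionalProperties" to false, as the schema above does. The sketch below is a small pre-flight check for just those two rules. The function name strict_mode_problems is an illustrative assumption, and the check is deliberately partial: it ignores anyOf, $defs, and type unions.

```python
def strict_mode_problems(schema: dict, path: str = "$") -> list[str]:
    """Report objects that break two strict-mode rules; recurses into properties and array items."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not in required: {missing}")
        for name, sub in props.items():
            problems.extend(strict_mode_problems(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_mode_problems(schema["items"], f"{path}[]"))
    return problems


# Hypothetical schema with a nested object that forgets both rules.
draft = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "segments": {
            "type": "array",
            "items": {"type": "object", "properties": {"name": {"type": "string"}}},
        },
    },
    "required": ["company_name"],
}
for problem in strict_mode_problems(draft):
    print(problem)
```

Running a check like this before the API call turns a request-time schema rejection into a local, readable error list.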
Pydantic models for type-safe structured outputs
Pydantic is a Python library for data validation using type annotations. Combined with structured outputs, it gives you a complete pipeline: define the shape once in Python, get it enforced by the API, and parse the result directly into a typed Python object. This is the pattern used in most production systems.
pip install pydantic openai
from typing import Literal, Optional, TypeVar

from openai import OpenAI
from pydantic import BaseModel, Field

client = OpenAI()

# Lets the helper return the same model type it was given
T = TypeVar("T", bound=BaseModel)


# Define your output schema as a Pydantic model
class FinancialExtraction(BaseModel):
    company_name: str = Field(description="Full legal name of the company")
    revenue_usd: float = Field(description="Revenue in USD (not billions or millions)")
    fiscal_quarter: Literal["Q1", "Q2", "Q3", "Q4"]
    year: int = Field(ge=1900, le=2100, description="Fiscal year")
    currency: Literal["USD", "EUR", "GBP", "JPY"] = "USD"
    source_confidence: Literal["high", "medium", "low"]
    notes: Optional[str] = Field(default=None, description="Any caveats about the extraction")


class NewsArticleEntities(BaseModel):
    """Entities extracted from a news article."""
    companies: list[str]
    people: list[str]
    locations: list[str]
    key_dates: list[str]
    sentiment: Literal["positive", "negative", "neutral", "mixed"]


def extract_with_schema(text: str, schema_class: type[T]) -> T:
    """
    Extract structured data using Pydantic schema validation.
    Uses OpenAI's parse() helper for clean Pydantic integration.
    """
    response = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {
                "role": "system",
                "content": "Extract the requested structured data from the provided text."
            },
            {"role": "user", "content": text}
        ],
        response_format=schema_class,
        temperature=0
    )
    # .parse() puts the validated Pydantic object in message.parsed
    message = response.choices[0].message
    if message.parsed is None:
        # On a refusal, parsed is None and message.refusal carries the reason
        raise ValueError(f"No structured output returned (refusal: {message.refusal})")
    return message.parsed


# Usage
text = "Apple Inc. reported Q2 2025 revenue of $95.4 billion dollars, beating analyst estimates."
result = extract_with_schema(text, FinancialExtraction)

# result is a typed Python object: attribute access, no manual parsing
print(result.company_name)       # str, e.g. "Apple Inc."
print(result.revenue_usd)        # float, e.g. 95400000000.0
print(result.fiscal_quarter)     # Literal, e.g. "Q2"
print(result.source_confidence)  # Literal, e.g. "high"
print(result.model_dump_json())  # serialize back to JSON

# News entity extraction
news = "Elon Musk and Jensen Huang met in Austin, Texas on March 14, 2025 to discuss AI chip supply chains."
entities = extract_with_schema(news, NewsArticleEntities)
print(entities.companies)  # e.g. ["Tesla", "NVIDIA"] (inferred by the model, not in the text)
print(entities.people)     # e.g. ["Elon Musk", "Jensen Huang"]
print(entities.locations)  # e.g. ["Austin, Texas"]
Structured outputs with Anthropic
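Anthropic's Messages API long had no native structured outputs feature with schema enforcement. A structured outputs capability has since been introduced, so check the current documentation for your model. Where it is unavailable, the recommended approaches are strongly prompted JSON (which works well with Claude's instruction-following) or the tool-use trick: define a tool whose input_schema is the desired shape and force the model to call it. Forcing the call makes Claude fill in the tool's input, but that input is not strictly validated against the schema, so validate it on your side.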
import anthropic
from pydantic import BaseModel

client = anthropic.Anthropic()


class ProductReview(BaseModel):
    sentiment: str
    rating: int
    key_positives: list[str]
    key_negatives: list[str]
    summary: str


def extract_review_anthropic(review_text: str) -> ProductReview:
    """Extract structured review data using Claude with the tool-use trick."""
    # Define the extraction as a tool; forcing the call makes Claude fill in its input schema
    tool = {
        "name": "extract_review",
        "description": "Extract structured information from a product review",
        "input_schema": {
            "type": "object",
            "properties": {
                "sentiment": {
                    "type": "string",
                    "enum": ["positive", "negative", "neutral", "mixed"]
                },
                "rating": {
                    "type": "integer",
                    "minimum": 1,
                    "maximum": 5,
                    "description": "Inferred star rating 1-5"
                },
                "key_positives": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "List of positive aspects mentioned"
                },
                "key_negatives": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "List of negative aspects or complaints"
                },
                "summary": {
                    "type": "string",
                    "description": "One sentence summary of the review"
                }
            },
            "required": ["sentiment", "rating", "key_positives", "key_negatives", "summary"]
        }
    }
    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        tools=[tool],
        tool_choice={"type": "tool", "name": "extract_review"},  # force this tool
        messages=[
            {
                "role": "user",
                "content": f"Extract structured review data:\n\n{review_text}"
            }
        ]
    )
    # The tool_use block's input holds the structured data as a dict
    tool_use_block = next(b for b in response.content if b.type == "tool_use")
    # Pydantic validates the dict here; a ValidationError means the input did not fit the model
    return ProductReview(**tool_use_block.input)


review = """
Absolutely love this laptop! The battery life is incredible — 12+ hours easily.
The keyboard feels great and the display is stunning. Only complaint is the webcam
quality is mediocre for the price. Would definitely recommend.
"""
result = extract_review_anthropic(review)
print(f"Sentiment: {result.sentiment}")      # e.g. positive
print(f"Rating: {result.rating}/5")          # e.g. 4
print(f"Positives: {result.key_positives}")  # e.g. ["battery life", ...]
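The next(...) lookup above assumes a complete tool_use block is always present. A slightly more defensive sketch follows, assuming only the response fields already used in this section (content blocks with type, name, and input) plus stop_reason, which the Messages API sets to "max_tokens" when output is cut off. get_tool_input is a hypothetical helper name, and the SimpleNamespace objects are stand-ins so the example runs without an API call.

```python
from types import SimpleNamespace


def get_tool_input(response, tool_name: str) -> dict:
    """Return the input dict of the named tool call, or raise with a clear reason."""
    if response.stop_reason == "max_tokens":
        raise ValueError("response hit max_tokens; tool input may be incomplete")
    for block in response.content:
        if block.type == "tool_use" and block.name == tool_name:
            return block.input
    raise ValueError(f"no tool_use block for {tool_name!r}")


# Stand-in shaped like the fields used above; a real response comes from
# client.messages.create().
fake_response = SimpleNamespace(
    stop_reason="tool_use",
    content=[
        SimpleNamespace(type="tool_use", name="extract_review", input={"rating": 4}),
    ],
)
print(get_tool_input(fake_response, "extract_review"))  # {'rating': 4}
```

The returned dict would then go through ProductReview(**...) as before, so schema problems still surface as a Pydantic ValidationError.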
Handling parse errors and retries
Even with structured output enforcement, you should build resilient parsing that handles edge cases gracefully. The parse() helper may return None for the parsed field if the model refused to respond. For raw JSON parsing, always wrap in try/except.
from openai import OpenAI
from pydantic import BaseModel, ValidationError

client = OpenAI()


def extract_with_retry(text: str, schema: type[BaseModel], max_retries: int = 2) -> BaseModel:
    """Extract structured data with retry on parse failure."""
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            response = client.beta.chat.completions.parse(
                model="gpt-4o-2024-08-06",
                messages=[
                    {"role": "system", "content": "Extract structured data. Be precise."},
                    {"role": "user", "content": text}
                    # On retry, we could append error context here
                ],
                response_format=schema,
                temperature=0
            )
            parsed = response.choices[0].message.parsed
            if parsed is None:
                raise ValueError("Model returned null (possible refusal)")
            return parsed
        # ValueError also covers json.JSONDecodeError, which subclasses it
        except (ValidationError, ValueError) as e:
            last_error = e
            if attempt < max_retries:
                print(f"Parse failed (attempt {attempt+1}): {e}. Retrying...")
                continue
    # Every attempt failed: surface the last error as the cause
    raise RuntimeError(f"Extraction failed after {max_retries+1} attempts: {last_error}") from last_error
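The comment about appending error context on retry can be sketched independently of any SDK. In the version below, call_model is a hypothetical callable that takes the message history and returns the model's raw text. On a parse failure, the failed output and the parser's complaint are added to the history before the next attempt, so the retry is not an identical request.

```python
import json
from typing import Callable


def retry_with_feedback(
    call_model: Callable[[list], str],
    messages: list,
    max_retries: int = 2,
) -> dict:
    """Call the model; on a parse failure, retry with the error appended."""
    history = list(messages)
    last_error = None
    for _ in range(max_retries + 1):
        raw = call_model(history)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("expected a JSON object")
            return data
        except ValueError as e:  # json.JSONDecodeError is a ValueError
            last_error = e
            # Show the model its own output and what was wrong with it.
            history = history + [
                {"role": "assistant", "content": raw},
                {"role": "user", "content": f"That was not usable ({e}). Reply with a JSON object only."},
            ]
    raise RuntimeError(f"no valid JSON after {max_retries + 1} attempts") from last_error


# Stand-in model: fails once, then succeeds. Records the history length per call.
replies = iter(["Sure! Here you go: {}", '{"ok": true}'])
calls = []


def fake_model(history: list) -> str:
    calls.append(len(history))
    return next(replies)


result = retry_with_feedback(fake_model, [{"role": "user", "content": "Return JSON"}])
print(result, calls)  # {'ok': True} [1, 3]
```

The second call sees three messages rather than one, which is the point: the model gets a chance to correct a specific mistake.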
Schema design tips
How you design your schema significantly affects the quality and reliability of extraction. Some hard-learned rules:
- Flat is better than nested. A schema with 10 top-level fields is easier for models to populate correctly than two levels of nesting. Avoid deep hierarchies.
- Use enums for controlled vocabularies. Instead of "type": "string" for a status field, use "enum": ["pending", "active", "closed"]. This eliminates spelling variations and ambiguity.
- Descriptions are instructions. The description field on each property is read by the model. Write clear, specific descriptions that resolve any ambiguity in what the field should contain.
- Mark everything required unless it truly is optional. Optional fields get skipped. If you need the data, make it required and let the model provide a best-effort value or null.
- Use specific types. Prefer "type": "integer" over "type": "number" for counts, IDs, and years. Prefer "type": "boolean" over strings like "yes"/"no".
Structured outputs are the foundation of AI pipelines
Once you can reliably extract structured data from text, you can build document processing pipelines, automated research tools, data enrichment systems, and intelligent ETL workflows. The Pydantic + OpenAI structured outputs combination gives you an extraction pattern whose output shape is reliable enough for automated batch processing. Schema enforcement guarantees structure, not correct values, so keep validation or spot checks on the extracted content itself.