Structured Output from LLMs: JSON Mode Explained
LLMs generate text, but applications need structured data. Here's how JSON mode, function calling, and schema enforcement turn free-form AI output into reliable, typed data.
LLMs generate text. They predict the next token, then the next, until they produce a response. That’s great for chat. It’s a problem when your application needs data it can actually use: a JSON object with specific fields, a list of extracted entities, or typed parameters for a function call. Free-form text doesn’t parse. You need structure.
Structured output solves this. It constrains the model to produce output your code can consume directly. No regex scraping. No brittle string parsing. Just valid, typed data.
JSON Mode: The Simplest Constraint
JSON mode tells the model to output valid JSON and nothing else. No markdown code fences, no explanatory text, no trailing commas. Just raw JSON that parses on first try.
OpenAI, Anthropic, Google, and most other providers support it. In the OpenAI API, you set response_format: { type: "json_object" }. The model is instructed internally to produce only JSON. Same idea across providers, slightly different parameter names.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Extract name and email from: John Doe, john@example.com"}],
    response_format={"type": "json_object"},
)
# response.choices[0].message.content is parseable JSON
# (assuming the response wasn't truncated by a max_tokens limit)
JSON mode guarantees parseable output. It does not guarantee a specific shape. The model might return {"name": "John Doe", "email": "john@example.com"} or {"person": {"name": "John Doe", "email": "john@example.com"}}. You still need to describe the structure in your prompt and validate on the client.
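A minimal sketch of what that means in practice. Both payloads below are valid JSON, so JSON mode is satisfied by either, but only one has the shape your code expects. The `extract_contact` helper and the two sample payloads are illustrative, not part of any provider API:

```python
import json

# Both are valid JSON; only the first has the flat shape the caller wants.
flat = '{"name": "John Doe", "email": "john@example.com"}'
nested = '{"person": {"name": "John Doe", "email": "john@example.com"}}'

def extract_contact(raw: str) -> dict:
    """Parse JSON-mode output and normalize the shapes we've seen."""
    data = json.loads(raw)          # JSON mode makes this parse succeed
    if "person" in data:            # unwrap the nested variant
        data = data["person"]
    missing = {"name", "email"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data

print(extract_contact(flat))    # {'name': 'John Doe', 'email': 'john@example.com'}
print(extract_contact(nested))  # same dict after unwrapping
```

Defensive unwrapping like this is a stopgap. If you find yourself normalizing shapes, that's the signal to move to schema enforcement.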
Function Calling and Tool Use
Function calling (OpenAI) and tool use (Anthropic) go further. Instead of asking for arbitrary JSON, you define functions with typed parameters. The model outputs structured arguments for those functions. Your code receives a parsed object that matches your schema.
This is how models “use tools.” You define a function like search_database(query: string) or send_email(to: string, subject: string, body: string). The model decides when to call it and fills in the parameters. The API returns a structured object you pass directly to your implementation.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
}]
The model returns something like {"location": "San Francisco", "unit": "celsius"} when it decides to call get_weather. You parse it, validate it, and execute your actual weather API. The structure is enforced by the schema you provided.
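The receiving side can be sketched like this. The arguments arrive as a JSON string on the tool call (in the OpenAI Python SDK, at `response.choices[0].message.tool_calls[0].function.arguments`); `get_weather` here is a local stand-in for your real implementation:

```python
import json

def get_weather(location: str, unit: str = "celsius") -> str:
    """Stand-in for a call to an actual weather API."""
    return f"Weather for {location} in {unit}"

# What the API hands back for the tool call, as a JSON string:
raw_arguments = '{"location": "San Francisco", "unit": "celsius"}'

args = json.loads(raw_arguments)
assert "location" in args, "schema marks location as required"
result = get_weather(**args)    # dispatch to the real implementation
print(result)                   # Weather for San Francisco in celsius
```

The `**args` unpacking works precisely because the parameter names in the schema match your function signature. Keep them in sync.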
Schema Enforcement: Strict Output Shapes
Some APIs let you enforce a full JSON Schema. OpenAI’s structured outputs (via response_format with a json_schema) and Anthropic’s tool use with strict schemas guarantee that the output conforms to your definition. Wrong field names, wrong types, missing required fields: with strict enforcement, generation itself is constrained to the schema, so non-conforming output never reaches your code.
This is the strongest guarantee. You define exactly what you want:
{
  "type": "object",
  "properties": {
    "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
    "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    "summary": {"type": "string"}
  },
  "required": ["sentiment", "confidence", "summary"]
}
The model cannot return "sentiment": "happy" if “happy” isn’t in the enum. It cannot omit confidence. The API validates before returning. Use this when you need strict contracts: data pipelines, API integrations, or any place where downstream code assumes a fixed shape.
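What the enforcement amounts to can be sketched as a client-side check. The provider applies these rules during generation; `validate_analysis` below is a hypothetical helper that mirrors the same three constraints from the schema above, useful for seeing exactly what gets rejected:

```python
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}

def validate_analysis(data: dict) -> dict:
    """Enforce the same rules the JSON Schema above declares."""
    for field in ("sentiment", "confidence", "summary"):
        if field not in data:
            raise ValueError(f"missing required field: {field}")
    if data["sentiment"] not in ALLOWED_SENTIMENTS:
        raise ValueError(f"sentiment not in enum: {data['sentiment']!r}")
    if not 0 <= data["confidence"] <= 1:
        raise ValueError(f"confidence out of range: {data['confidence']}")
    return data

validate_analysis({"sentiment": "positive", "confidence": 0.92, "summary": "Upbeat review."})  # passes
try:
    validate_analysis({"sentiment": "happy", "confidence": 0.9, "summary": "Upbeat review."})
except ValueError as err:
    print(err)  # sentiment not in enum: 'happy'
```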
Client-Side Validation: Your Safety Net
Even with schema enforcement, validate on the client. APIs can change. Models can occasionally produce malformed output. Network issues can truncate responses. Defensive parsing catches problems before they reach your business logic.
Pydantic (Python) and Zod (TypeScript) are the standard choices. Both parse JSON into typed objects and validate structure. Invalid data raises an error instead of propagating garbage.
from pydantic import BaseModel

class ExtractionResult(BaseModel):
    name: str
    email: str

result = ExtractionResult.model_validate_json(model_output)
import { z } from "zod";

const ExtractionResult = z.object({
  name: z.string(),
  email: z.string().email()
});

const result = ExtractionResult.parse(JSON.parse(modelOutput));
If the model returns {"name": "John", "email": "not-an-email"}, Zod or Pydantic rejects it. You catch the error, log it, retry, or fall back to a default. Never assume the model output is correct without validation.
When to Use Structured Output
API integrations. Your LLM extracts data from documents to populate a CRM, ticketing system, or database. The output must match the target schema. Structured output plus validation ensures it does.
Data extraction. Pull entities, classifications, or key-value pairs from unstructured text. RAG retrieves the right chunks; structured output extracts the right fields. The result feeds into search, filters, or downstream analytics.
Classification and routing. Sentiment, intent, category, priority. A fixed set of labels. Low temperature plus an enum schema gives you consistent, parseable classifications for routing logic.
Multi-step pipelines. Step 1: extract. Step 2: transform. Step 3: store. When each step’s output is the next step’s input, structured data is non-negotiable. Free-form text breaks the pipeline.
Function calling. The model decides which tool to use and with what arguments. Structured parameters are the interface. Without them, you’re back to parsing strings.
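The pipeline case is worth making concrete. A hypothetical three-step sketch, where each step's output type is the next step's input type (the `extract` step stands in for an LLM call with a schema):

```python
def extract(text: str) -> dict:
    """Step 1: in production, an LLM call with an enforced schema."""
    return {"name": "John Doe", "email": "JOHN@EXAMPLE.COM"}

def transform(record: dict) -> dict:
    """Step 2: normalize fields before storage."""
    return {**record, "email": record["email"].lower()}

def store(record: dict, db: list) -> None:
    """Step 3: append to a stand-in datastore."""
    db.append(record)

db: list[dict] = []
store(transform(extract("John Doe, JOHN@EXAMPLE.COM")), db)
print(db)  # [{'name': 'John Doe', 'email': 'john@example.com'}]
```

If `extract` returned free-form text instead of a dict, `transform` would need its own parser, and every schema drift upstream would break everything downstream. Typed boundaries keep the failure local.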
Practical Tips
Keep schemas simple. Complex nested structures increase failure rates. Flatten when possible. Prefer a few well-named fields over a deep hierarchy. The model has to fit its output to your schema; simpler schemas are easier to satisfy.
Provide examples. In your system prompt, show the exact JSON shape you want. One or two examples dramatically improve consistency. “Return JSON like: {"name": "…", "email": "…"}” works better than “Return name and email as JSON.”
Handle validation failures gracefully. When Pydantic or Zod throws, don’t crash. Log the raw output and the error. Retry with a clearer prompt, fall back to a default, or surface the failure to the user. Production systems need fallbacks.
Use low temperature for structured tasks. Temperature controls randomness. For extraction, classification, and any task where you need consistent structure, set temperature to 0 or 0.2. High temperature increases the chance of malformed or inconsistent output.
Start with JSON mode, upgrade as needed. If a simple “return JSON” instruction plus client validation works, use it. Add schema enforcement when you need stricter guarantees. Add function calling when the model needs to choose and invoke tools. Each layer adds complexity; use the minimum that solves your problem.
Structured output turns LLMs from text generators into data producers. JSON mode, function calling, and schema enforcement give you the control. Client-side validation gives you the safety. Together they make LLM output reliable enough for production systems. For more on building AI applications that work in the real world, see Get Insanely Good at AI.