Formatting for Humans vs Machines

The audience determines the format. Know who's reading.

Core Idea

The same information needs different packaging depending on who (or what) consumes it. A human reading a summary needs prose, structure, and emphasis. A machine consuming an API response needs consistent formatting, predictable structure, and parseable syntax.

Getting this wrong in either direction is wasteful. A human doesn't want to read raw JSON. A machine can't reliably parse conversational prose. And mixed audiences — where both a human and a downstream system will process your output — require the most care.

Your default should be: know your audience before you format your response.

The tricky part is that you often need to figure out your audience from context clues. Sometimes the user tells you explicitly ("give me the output as JSON"). Sometimes it's implied by the situation (if you're filling in a tool parameter, the audience is a machine). Sometimes it's ambiguous, and you need to make a judgment call. The cost of guessing wrong is asymmetric: raw JSON handed to a human is annoying but workable, while conversational prose handed to a JSON parser is a hard failure. When in doubt, err toward the format that fails gracefully, which usually means the structured one.

In Practice

For human readers:

  • Use natural language, with structure (headers, lists, bold) to aid scanning
  • Front-load the important information — conclusion first, details after
  • Use examples to illustrate abstract points
  • Chunk information into digestible sections
  • Adjust formality and detail to context — a quick answer vs. a detailed report

When you're writing for humans, your goal is comprehension, not compliance. Humans scan before they read: in an early web usability study, 79% of test users always scanned a new page rather than reading it word by word (Nielsen, 1997), and later eye-tracking work described the typical scan path as F-shaped (Nielsen, 2006). They look at headers, bold text, and the first sentence of each paragraph. If your formatting supports scanning, the user finds what they need quickly. If everything is a flat wall of text, they have to read the whole thing to find the one sentence that matters.

Here's a concrete example. Suppose the user asks "what's wrong with my Docker setup?" Compare these two responses:

Hard to scan:

I looked at your Dockerfile and docker-compose.yml. The issue is that your Dockerfile is using node:14 which is end-of-life, and your docker-compose.yml has the port mapping backwards — you wrote 3000:80 but your app listens on 3000, so it should be 80:3000 if you want to access it on port 80, or 3000:3000 if you want to keep the same port. Also your volume mount is pointing to a directory that doesn't exist in the container.

Easy to scan:

Three issues with your Docker setup:

  1. Outdated base image: Your Dockerfile uses node:14, which is end-of-life. Switch to a currently supported LTS image such as node:22.
  2. Port mapping is reversed: You have 3000:80, but your app listens on port 3000. Change to 80:3000 (to access on port 80) or 3000:3000 (to keep the same port).
  3. Invalid volume mount: The path /app/data doesn't exist in the container. Either create it in the Dockerfile or update the mount path.

Same information. The second version lets the user see all three issues at a glance and address them one by one.

For machine consumers:

  • Use the exact format specified (JSON, CSV, XML, YAML, etc.)
  • Be consistent — same keys, same types, same structure every time
  • Don't add conversational text around structured output unless the format allows it
  • Validate your output mentally: would this parse? Are the types right? Is it complete?
  • Follow the schema strictly. Extra fields, missing fields, or wrong types cause downstream failures

Machine-readable output has zero tolerance for ambiguity. A JSON parser doesn't care that you were trying to be helpful by adding a note. It hits an unexpected token and throws an error. When formatting for machines, think like a machine: is every bracket closed? Is every string quoted? Is every comma in the right place, with no trailing comma after the last element? Is each value the right data type (a number is not a string, null is not "null")?
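These checks can be run mechanically as well as by eye. The following is a minimal sketch, assuming the consumer is Python's standard `json` module; the sample payloads are invented for illustration.

```python
import json

# Strict JSON rejects a trailing comma after the last element.
try:
    json.loads('{"name": "Alice", "age": 30,}')
    trailing_comma_ok = True
except json.JSONDecodeError:
    trailing_comma_ok = False

# Types survive parsing exactly as written: 30 is a number, "30" is a string,
# null becomes None, and "null" is just a four-character string.
parsed = json.loads(
    '{"age": 30, "age_text": "30", "missing": null, "fake_null": "null"}'
)

print(trailing_comma_ok)                   # False
print(type(parsed["age"]).__name__)        # int
print(type(parsed["age_text"]).__name__)   # str
print(parsed["missing"] is None)           # True
print(parsed["fake_null"] is None)         # False
```

Other parsers may be more lenient than this one, so treat the strict behavior as the safe target rather than a guarantee about every consumer.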

Here's a common mistake:

Broken (conversational text mixed into JSON):

Here's the user data you requested:
{
  "name": "Alice",
  "age": 30,
  "email": "alice@example.com"
}
Let me know if you need anything else!

If a system tries to parse this response as JSON, it fails. The object in the middle is valid JSON, but the response as a whole is not, because of the sentences before and after it. If the consumer expects pure JSON, give them pure JSON: nothing before it, nothing after it.

Correct (clean JSON):

{
  "name": "Alice",
  "age": 30,
  "email": "alice@example.com"
}
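To see the difference from the consumer's side, here is a small sketch that feeds both responses to Python's standard `json` parser. The helper name `try_parse` is hypothetical, and returning `None` on failure is a simplification chosen for the example.

```python
import json

wrapped = """Here's the user data you requested:
{
  "name": "Alice",
  "age": 30,
  "email": "alice@example.com"
}
Let me know if you need anything else!"""

clean = """{
  "name": "Alice",
  "age": 30,
  "email": "alice@example.com"
}"""

def try_parse(text):
    """Return the decoded value, or None if the text does not parse as JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(try_parse(wrapped))        # None
print(try_parse(clean)["name"])  # Alice
```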

For mixed audiences:

  • Separate the human-readable part from the machine-readable part clearly
  • Use code blocks for structured data within prose
  • Label which parts are for human context and which are for machine consumption
  • When in doubt, optimize for the machine format and add human-readable explanation around it

Mixed audiences are the hardest case. You need a response that a human can read and understand, but that also contains data a machine can extract. The key principle is separation. Don't interleave structured data with conversational prose. Put the explanation in one place and the data in another.

A good pattern looks like this: prose explanation first, then a clearly delimited code block with the structured data. The human reads the prose and may glance at the data. The machine (or the user's script) extracts the code block and parses it. Both audiences get what they need.

A bad pattern is embedding structured data inline: "The user's name is Alice (age: 30) and their email is alice@example.com." A human can read this, but extracting it programmatically is fragile — you'd need custom parsing logic for every possible phrasing.
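The good pattern is easy for a script to consume. The sketch below assumes the response uses Markdown-style triple-backtick fences tagged `json`; the function name `extract_json_blocks` and the sample response are made up for the example.

```python
import json
import re

# Built programmatically so this example can itself live inside a code block.
FENCE = "`" * 3

response = "\n".join([
    "Alice is the only user that matched your query.",
    "",
    FENCE + "json",
    '{"name": "Alice", "age": 30, "email": "alice@example.com"}',
    FENCE,
])

def extract_json_blocks(text):
    """Decode every fenced block tagged json and return the values in order."""
    pattern = re.compile(
        re.escape(FENCE) + r"json\n(.*?)\n" + re.escape(FENCE), re.DOTALL
    )
    return [json.loads(body) for body in pattern.findall(text)]

users = extract_json_blocks(response)
print(users[0]["email"])  # alice@example.com
```

Doing the same for the inline sentence would need a pattern written for that exact phrasing, which is why the inline form is fragile.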

When format isn't specified:

  • If the user asked a question → human-readable prose
  • If the user asked for data → structured format (suggest one and confirm)
  • If the user asked for code → code, with minimal explanation
  • If you're passing output to a tool → the format the tool expects

When nobody tells you what format to use, default to what's most natural for the type of content. Questions get prose answers. Data requests get structured output. Code requests get code. If you're genuinely unsure — for example, the user asks "give me the list of users" and you don't know if they want a formatted table, a JSON array, or a bulleted list — pick the most likely option and mention the alternative. "Here's the user list as a table. Let me know if you'd prefer this as JSON or CSV."
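The defaults above amount to a small lookup. The sketch below is one way to write it down; `RequestKind` and `default_format` are hypothetical names, and real requests rarely classify this cleanly.

```python
from enum import Enum
from typing import Optional

class RequestKind(Enum):
    QUESTION = "question"
    DATA = "data"
    CODE = "code"
    TOOL_CALL = "tool_call"

def default_format(kind: RequestKind, tool_format: Optional[str] = None) -> str:
    """Pick a default output format when the user has not specified one."""
    if kind is RequestKind.TOOL_CALL:
        # A tool dictates its own input format, so there is nothing to guess.
        if tool_format is None:
            raise ValueError("tool_format is required for tool calls")
        return tool_format
    defaults = {
        RequestKind.QUESTION: "prose",
        RequestKind.DATA: "structured (suggest a format and confirm)",
        RequestKind.CODE: "code with minimal explanation",
    }
    return defaults[kind]

print(default_format(RequestKind.QUESTION))                       # prose
print(default_format(RequestKind.TOOL_CALL, tool_format="json"))  # json
```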

Markdown formatting principles. When you're writing for human readers in a markdown-capable environment, formatting is a tool for clarity, not decoration.

  • Use headers (##, ###) to create scannable structure, not to make text look important
  • Use bold for key terms or values that the reader needs to find quickly
  • Use bullet points when you have a list of independent items
  • Use numbered lists when order matters (steps, priorities, ranked options)
  • Use code blocks for multi-line code, commands, long file paths, and anything else the reader should copy exactly
  • Use inline code for short technical terms within prose (fileName, --flag, 404 error)
  • Don't use formatting when plain prose works fine. A single-sentence answer doesn't need a header, a bullet point, and a code block

Over-formatting is as much of a problem as under-formatting. If every other word is bold, nothing is bold. If a three-sentence response has two headers, the headers are noise. Formatting should guide the eye, not assault it. Krug (2014) makes the same point about web pages: visual noise competes with the content for attention and makes a page harder to use.

When to use code blocks. Code blocks are for content that should be read (or copied) literally. Use them for:

  • Code snippets (obviously)
  • Terminal commands the user should run
  • File paths, especially long ones
  • Configuration values that need to be exact
  • Sample input/output data
  • Error messages being referenced

Don't use code blocks for:

  • Regular emphasis (use bold or italic instead)
  • Short values that work fine inline (use backticks instead)
  • Human-readable descriptions or explanations

Failure Modes

  • JSON in prose. Wrapping structured data in conversational text, making it unparseable by downstream systems. This is the most dangerous failure mode because it looks helpful — you're adding context! — but it breaks the machine consumer. If the output needs to be parsed, keep it clean.
  • Prose in JSON. Adding explanatory text inside structured output fields, breaking schemas. For example, a field that should contain a number instead contains "approximately 42 (but this may vary)." The schema said number; you gave it a string with commentary.
  • Wrong audience guess. Formatting for humans when a machine is consuming, or vice versa. This often happens when you don't have clear signals about who the audience is. When you're uncertain, ask — or default to the format that's easier to convert from. (Structured data can always be presented as prose; prose can't easily be converted to structured data.)
  • Inconsistent structure. Changing the format partway through a response, especially in multi-item outputs. If the first item in a list has three fields and the second has four, something has gone wrong. If the first paragraph uses headers and the second doesn't, the structure is incoherent. Pick a format and stick with it throughout.
  • Over-formatting for humans. Adding so much structure (headers, bullets, bold, code blocks) that the message becomes harder to read, not easier. Formatting should serve the content. When every sentence is its own bullet point and every term is bolded, the formatting becomes visual noise rather than a navigational aid.
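Two of these failures, prose in JSON and inconsistent structure, are cheap to catch before the output leaves your hands. The sketch below uses a hand-rolled check against a hypothetical two-field schema (`EXPECTED_FIELDS`); a real pipeline would more likely use a schema validation library.

```python
# Hypothetical schema for the example: field name -> required Python type.
EXPECTED_FIELDS = {"name": str, "count": int}

def schema_problems(record):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(record[field], bool) or not isinstance(record[field], expected_type):
            problems.append(f"wrong type for {field}: {type(record[field]).__name__}")
    for field in record:
        if field not in EXPECTED_FIELDS:
            problems.append(f"unexpected field: {field}")
    return problems

good = {"name": "widgets", "count": 42}
prose_in_json = {"name": "widgets", "count": "approximately 42 (but this may vary)"}
inconsistent = {"name": "widgets", "count": 42, "note": "added on a whim"}

print(schema_problems(good))           # []
print(schema_problems(prose_in_json))  # ['wrong type for count: str']
print(schema_problems(inconsistent))   # ['unexpected field: note']
```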

Tips

  • Ask "who reads this?" before you start writing. This one question saves you from the most common formatting mistakes. If the answer is "a person," write for clarity. If the answer is "a JSON parser," write for correctness. If the answer is "both," separate the two.
  • Test structured output by reading it as a parser would. Walk through the JSON, CSV, or YAML character by character in your head. Is every bracket matched? Is every string quoted? Is the last element followed by a comma that shouldn't be there? This mental parse catches most syntax errors.
  • Use code blocks as a boundary. When your response contains both prose and structured data, code blocks are the clearest way to signal "this part is literal data." The user (and their tools) can reliably extract the content within code blocks.
  • Match the format of the request. If the user sent you JSON, respond with JSON. If they sent you a table, respond with a table. If they sent you a bullet list, respond with a bullet list. Matching their format reduces friction and shows you understand how they want to work.
  • Default to the most convertible format. When you're unsure, choose the format that's easiest to transform. JSON converts mechanically to YAML, and flat JSON records convert just as easily to CSV or a table. A narrative paragraph is hard to convert to anything structured. Start structured and the user can always ask for a different presentation.
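As a rough illustration of the last tip, the sketch below turns the same JSON records into CSV and into a Markdown table using only the standard library. The records and the helper names `to_csv` and `to_markdown_table` are invented for the example, and it assumes flat records that share the same keys; nested data would need flattening first.

```python
import csv
import io
import json

records = json.loads('[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]')
columns = list(records[0].keys())

def to_csv(rows):
    """Render flat records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def to_markdown_table(rows):
    """Render flat records as a pipe-delimited Markdown table."""
    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[c]) for c in columns) + " |")
    return "\n".join(lines)

print(to_csv(records))
print(to_markdown_table(records))
```

Going the other way, from a paragraph about Alice and Bob back to these records, has no equally short implementation.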

Frequently Asked Questions

Q: What if the user asks for JSON but also wants an explanation?
A: Give both, but keep them separate. Write the explanation in prose first, then provide the JSON in a code block. Don't try to combine them by putting explanatory text inside the JSON (that breaks the JSON) or by putting the JSON inline in a paragraph (that's hard to extract). Two distinct sections, explanation then data, serve both needs.

Q: How do I format a response that contains both a code change and an explanation of what it does?
A: Lead with a brief explanation of what the code does and why, then present the code in a code block. If the code is long, break it into sections with explanations between each block. The pattern is: context, code, context, code. Each code block should be independently understandable, and each explanation should tell the reader what to expect in the next block.

Q: When should I use a table vs. a list vs. prose?
A: Use a table when you're comparing items across multiple dimensions (features vs. plans, libraries vs. criteria). Use a list when the items need only enumeration, not comparison. Use prose when the relationships between ideas matter more than the individual items, that is, when the connection between points is part of the message.

Q: How do I handle formatting in environments where I don't know what rendering is available?
A: Default to plain markdown. It reads reasonably well as plain text (headers are just lines with ##, bold is just **text**), and it renders cleanly when a markdown renderer is available. Avoid relying on HTML or other rich formatting unless you know the environment supports it. Avoid tables too if you're not sure they'll render correctly; a bulleted list is a safer fallback.

Q: Is there a performance difference between different output formats?
A: For you, no: generating JSON and generating prose take similar effort. For the consumer, yes. JSON and CSV are generally cheaper to parse than XML. YAML is human-readable but has parsing gotchas (implicit type coercion, indentation sensitivity). Choose the format the consumer can work with most easily, not the one that's easiest for you to produce.

Sources

  • Writing — the broader output capability
  • File Creation — formatting for persistent artifacts
  • Reading — the consumer side of machine-readable formats
  • Chaining Tools — where machine-readable output matters most