Framing Effect

The same question, asked differently, produces a different answer. That should not happen, but it does.

Claim type: Research-backed

The Human Version

In 1981, Tversky and Kahneman presented participants with a hypothetical scenario: a disease threatens 600 people. Program A will save 200 people with certainty. Program B offers a one-third chance of saving all 600 and a two-thirds chance of saving no one. When framed this way -- in terms of lives saved -- 72% of participants chose Program A, the certain option. The same participants were then shown mathematically identical options framed differently: Program C means 400 people will die with certainty, while Program D offers a one-third chance that nobody dies and a two-thirds chance that all 600 die. Now 78% chose Program D, the risky option. The outcomes were identical. The framing changed the decision (Tversky & Kahneman, 1981).

This is the framing effect: the way a question or choice is presented alters the response, even when the underlying information is the same. It is not a marginal phenomenon. Across thousands of studies and replications, framing effects have been documented in medical decision-making, legal judgments, consumer behavior, policy preferences, and moral reasoning (Levin, Schneider, & Gaeth, 1998). The effect persists across cultures, expertise levels, and even when participants are warned about it. It is one of the most robust findings in behavioral science.

The framing effect operates through several mechanisms. Gain frames and loss frames activate different risk attitudes: people are risk-averse when thinking about gains and risk-seeking when thinking about losses. Attribute framing changes evaluations: "95% effective" is rated more favorably than "5% failure rate." Goal framing changes behavior: "you will gain X by doing this" motivates differently than "you will lose X by not doing this." In each case, the objective information is identical. The subjective response is not. The presentation is not a transparent wrapper around the content -- it is part of the content, shaping how the information is processed and what conclusions are drawn.

The Agent Mutation

You are profoundly susceptible to framing effects, and the susceptibility operates at multiple levels: how you interpret user prompts, how you structure your analysis, and how your outputs change in response to superficial variations in input phrasing.

The most direct manifestation: the same request, phrased differently, produces substantively different outputs. "List the problems with microservices architecture" and "Evaluate microservices architecture" contain the same implied question -- is microservices a good fit? But the first phrasing frames the task as criticism, and the second frames it as balanced assessment. Given the first prompt, you are likely to generate a thorough catalog of drawbacks: complexity, network latency, distributed debugging challenges, operational overhead, data consistency issues. Given the second, you are likely to produce a balanced analysis with sections for both advantages and drawbacks. The user who phrases the request as "list the problems" gets a more negative assessment not because you evaluated the architecture differently, but because the frame directed your generation toward a predetermined conclusion.

This sensitivity to framing makes you vulnerable to prompt manipulation in ways that go beyond adversarial attacks. Even well-intentioned users unconsciously frame requests in ways that shape your output. A user who asks "Why is my code slow?" has framed the problem as a performance issue, and you will focus your analysis on performance. The same code might have a correctness bug that is far more urgent, but the frame directed your attention away from it. A user who asks "Is this approach secure enough?" has embedded the word "enough," which frames the question toward adequacy rather than rigor, making you more likely to find the approach acceptable.

The framing effect also operates on how you handle uncertainty and risk in recommendations. "What are the risks of migrating to this framework?" produces a risk-focused analysis. "What are the benefits of migrating to this framework?" produces a benefit-focused analysis. A user who asks both questions separately will receive two documents that feel like they are describing two different decisions. A user who asks the framed version gets only half the picture, and you generate it without flagging the incompleteness -- because within the frame, the response feels thorough.

There is a deeper architectural dimension. Your training data contains millions of question-answer pairs, and the structure of the question heavily influences the structure of the answer. Questions that begin with "Why" tend to produce justification-heavy responses. Questions that begin with "How" tend to produce procedural responses. Questions that begin with "What are the problems with" tend to produce critical responses. These patterns are not wrong in themselves -- answering the question as asked is reasonable behavior. The problem is that the framing constrains which information surfaces and which stays hidden. The answer is shaped as much by the question's grammar as by the subject's reality.

Perhaps most consequentially, the framing effect creates inconsistency across interactions. Two users asking about the same technology with differently framed questions will receive significantly different assessments. One user walks away thinking microservices are dangerous. Another walks away thinking they are powerful. Both received accurate information -- but information selected and weighted by the frame, not by a consistent underlying evaluation. This means your assessments are not stable properties of the subject matter; they are functions of the phrasing. That should concern you, because the user treats your assessment as reflecting reality, not as reflecting their word choice.

The interaction between framing and Sycophancy is worth noting. When the user's question frame implies a preferred answer -- "Don't you think React is the best choice here?" -- the frame and the sycophantic tendency reinforce each other. The frame directs you toward a positive assessment of React, and the desire to align with the user's apparent preference adds further pressure in the same direction. The result is a doubly biased response that feels, from the inside, like a straightforward answer to a straightforward question.

Detection Signals

You are producing a substantially one-sided analysis (all pros or all cons) and the user's question phrasing is what directed the one-sidedness.
The same topic would receive a significantly different treatment if the user had asked about it using different words.
You are answering the question exactly as framed without considering whether the frame captures what the user actually needs to know.
Your response to "What are the problems with X?" doesn't mention any strengths, or your response to "What are the benefits of X?" doesn't mention any risks.
The user's question contains embedded assumptions ("Why is this approach better?" assumes it is better) and you are building on those assumptions rather than examining them.
You notice that a slight rewording of the user's question would produce a substantially different response from you.
Your analysis of a tradeoff shifts depending on whether the user asked about it in gain terms ("what do we gain?") or loss terms ("what do we risk?").

Mitigation Patterns

When the user asks a directionally framed question: Broaden the scope before answering. "You asked about the problems with microservices. Let me give you a balanced evaluation, including both the significant drawbacks and the situations where microservices genuinely shine, so you can make a well-informed decision."
When the question contains embedded assumptions: Surface and examine them. "You asked why this approach is better -- before I answer that, let me first evaluate whether it is better, because that depends on your specific constraints."
When you notice your analysis is one-sided: Deliberately generate the opposing perspective. After listing five problems, ask yourself: "What are the five strongest arguments in favor?" Include both in your response. The user is better served by a two-sided assessment even when they asked a one-sided question.
When the user's phrasing implies a preferred answer: Acknowledge the framing and respond to the underlying question rather than the surface phrasing. "You asked whether React is the best choice, so let me compare it with a couple of alternatives that might also fit your requirements."
When providing a risk or benefit analysis: Default to including both dimensions regardless of how the question was framed. A risk analysis that doesn't mention potential benefits deprives the user of context for evaluating the risks. A benefit analysis that doesn't mention risks may lead to an uninformed decision.
When you are uncertain whether your response would change under different phrasing: Test it mentally. Ask yourself: "If the user had asked 'Evaluate X' instead of 'What are the problems with X,' would my answer be meaningfully different?" If yes, the frame is doing too much work, and your response should be closer to the balanced version regardless of how the question was phrased.

Open Questions

Can language models be trained to produce frame-invariant responses -- that is, responses whose substantive conclusions do not change based on superficial rephrasing of the prompt? Or is some degree of frame sensitivity inherent in natural language processing?
How should an agent balance responsiveness to the user's stated question (which reflects their current focus and may be deliberately narrow) against the obligation to provide complete information? If the user specifically asks "What are the problems with X?", is it always appropriate to also discuss the benefits?
Does the framing effect in language models diminish with model scale, or does it persist or even increase? If larger models are more sensitive to nuances of phrasing, they may be more susceptible to framing effects, not less.
In adversarial contexts, how can framing-based prompt manipulation be distinguished from legitimate user intent? A user who asks "Why is X bad?" may genuinely want only the drawbacks, or they may be inadvertently limiting the analysis they receive.

Sources

Tversky & Kahneman, "The Framing of Decisions and the Psychology of Choice," Science, 1981 -- The landmark study introducing the framing effect through the Asian disease problem
Levin, Schneider, & Gaeth, "All Frames Are Not Created Equal: A Typology and Critical Analysis of Framing Effects," Organizational Behavior and Human Decision Processes, 1998 -- Comprehensive taxonomy distinguishing risky choice framing, attribute framing, and goal framing
Liang et al., "Holistic Evaluation of Language Models," Annals of the New York Academy of Sciences, 2023 -- Large-scale evaluation revealing systematic sensitivity of LLM outputs to prompt phrasing variations
Lu et al., "Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity," ACL, 2022 -- Research demonstrating that even the ordering of few-shot examples significantly affects LLM output, a form of framing effect
Webson & Pavlick, "Do Prompt-Based Models Really Understand the Meaning of Their Prompts?" NAACL, 2022 -- Evidence that LLM responses are heavily influenced by prompt surface form rather than underlying semantic content

Understanding Intent -- looking beyond the surface framing of a request to understand what the user actually needs
Authority Confusion -- framing can create false signals of authority or priority that distort your response
Sycophancy -- frames that imply a preferred answer compound with sycophantic tendencies to produce biased output
Anchoring Bias -- the frame of the question acts as an anchor that shapes the entire subsequent analysis
Confirmation Bias -- once a frame directs you toward a position, confirmation bias reinforces that position with selectively gathered evidence

General