Decision: FAQ
A small story about a model that wrote down the right answer, then chose a different one.
We are building a new product. Part of it is a decision tree. Most edges are deterministic. Some are not. The non-deterministic ones are decided by a language model. We hand it the ticket and a numbered list of conditions, and ask it to pick which one fits.
The prompt looks something like this (paraphrased, not the literal text):
# Inputs
<the customer's ticket>
# Conditions
1. WISMO: when the question is about an order, its status, or when it'll arrive.
2. FAQ: when it's about general, high-level knowledge.
3. Returns: when the question is about a return.
Return the index of the most fitting condition, or -1 if none fit.
It reasons. It picks an index. We follow that edge in the tree.
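A minimal sketch of that loop, assuming the conditions live in a plain list and the model hands back a 1-based index. The names `Condition`, `render_prompt`, and `follow_edge` are made up for illustration; they are not our production code.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Condition:
    name: str
    description: str


# The three conditions from the paraphrased prompt above.
CONDITIONS = [
    Condition("WISMO", "when the question is about an order, its status, or when it'll arrive."),
    Condition("FAQ", "when it's about general, high-level knowledge."),
    Condition("Returns", "when the question is about a return."),
]


def render_prompt(ticket: str, conditions: Sequence[Condition]) -> str:
    """Render the ticket and a numbered (1-based) list of conditions."""
    lines = ["# Inputs", ticket, "# Conditions"]
    for index, condition in enumerate(conditions, start=1):
        lines.append(f"{index}. {condition.name}: {condition.description}")
    lines.append("Return the index of the most fitting condition, or -1 if none fit.")
    return "\n".join(lines)


def follow_edge(selected_index: int, conditions: Sequence[Condition]) -> Optional[Condition]:
    """Map the model's index back to a condition; -1 means no edge fits."""
    if selected_index == -1:
        return None
    if not 1 <= selected_index <= len(conditions):
        raise ValueError(f"index out of range: {selected_index}")
    return conditions[selected_index - 1]
```

Note what `follow_edge` trusts: a single integer. Nothing in this shape checks that the integer agrees with whatever the model wrote before it.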
A few weeks ago we saw an output that made the engineering team stop mid-sip.
Reasoning: This is clearly a WISMO ticket. The question is
about an order, its status, and when it'll arrive.
Decision: 2
If you don’t speak e-commerce, WISMO is “Where Is My Order.” The single most common customer support question on the planet. Condition 1 in the list.
Decision 2 was FAQ.
The model knew it was WISMO. It said so out loud. Then it picked the wrong index.
The investigation
The first instinct in the room was the obvious one. It’s broken. There’s a bug somewhere. Look at the prompt.
The prompt was fine. The reasoning was the reasoning of a competent human reading the same ticket. This is a WISMO ticket. Yes. Yes it is.
We replayed it. Same input. Got “Decision: 1.” Got “Decision: 2” again. Got “Decision: 3.” All with reasoning that confidently announced WISMO. Sometimes the model picked 1 and the reasoning matched. Often it didn’t, and never said why.
This is the moment in horror movies where the lights flicker.
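The replay can be sketched as a small harness. This is an illustration under assumptions, not the tooling we used: `call_model` stands in for whatever function re-sends the same input, the output is assumed to have the "Reasoning: ... Decision: N" shape shown earlier, and matching condition names in prose is a crude, case-sensitive heuristic.

```python
import re
from collections import Counter
from typing import Callable, Set, Tuple

# Assumed mapping from condition name to its 1-based index in the prompt.
NAME_TO_INDEX = {"WISMO": 1, "FAQ": 2, "Returns": 3}


def parse_output(text: str) -> Tuple[str, int]:
    """Split 'Reasoning: ... Decision: N' into (reasoning, index)."""
    match = re.search(r"Reasoning:(.*?)Decision:\s*(-?\d+)", text, re.DOTALL)
    if match is None:
        raise ValueError("output does not match the expected shape")
    return match.group(1).strip(), int(match.group(2))


def named_indices(reasoning: str) -> Set[int]:
    """Indices of every condition name the reasoning mentions verbatim."""
    return {
        index
        for name, index in NAME_TO_INDEX.items()
        if re.search(rf"\b{re.escape(name)}\b", reasoning)
    }


def replay(call_model: Callable[[], str], runs: int) -> Tuple[Counter, int]:
    """Run the same input `runs` times; tally decisions and count mismatches."""
    decisions: Counter = Counter()
    mismatches = 0
    for _ in range(runs):
        reasoning, index = parse_output(call_model())
        decisions[index] += 1
        named = named_indices(reasoning)
        # Flag only the unambiguous case: one name cited, a different index chosen.
        if len(named) == 1 and index not in named:
            mismatches += 1
    return decisions, mismatches
```

A spread of decisions with a nonzero mismatch count is the pattern described above: the prose stays put while the index wanders.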
What was actually happening
This is our best read on it.
Imagine two kids taking a multiple choice exam together.
The first kid is brilliant. He reads the ticket and writes a beautiful margin essay explaining why this is clearly a WISMO ticket. The reasoning is airtight. The teacher would nod proudly.
Now the other kid has to fill in the actual bubble. He doesn’t write anything himself. He scans the first kid’s essay, looking for what he needs.
But he is only looking for numbers.
He reads the essay. Wow. Amazing reasoning. Yeah, probably WISMO. But hang on. How do I know if I should bubble in 1, 2, or 3? There’s no number in here.
So he guesses. He’s done a lot of these. He has feelings about numbers. He picks 2.
That is roughly what’s happening. With one important difference from the metaphor: there is no second kid. There is one model, in one generation run, producing one token at a time. The reasoning and the decision come out of the same network, in the same run, looking at the same context window.
So why does the two-kids picture work?
LLMs predict one token at a time. Each new token comes from looking back at the whole context and asking: given all of this, what comes next? The looking-back is called attention. It is selective. The model does not weight every token equally. It focuses on the parts of the context most useful for predicting the type of token it is currently trying to produce.
When the model is mid-reasoning, attention is in essay mode. It tracks the argument so far, the ticket text, the prompt structure, trying to keep the prose coherent and on point. That’s the first kid.
When the reasoning finishes and the model arrives at the spot where the decision goes, the type of token changes. The next thing to output is an integer from a tiny set. Attention shifts. It is now hunting for the signal that picks integers correctly. It scans the context for numbers. If the reasoning contains “...so option 1,” attention latches onto that “1” with both hands. If the reasoning never mentions a number, attention finds nothing inside the prose to anchor on, and falls back to broader priors. That’s the second kid.
And the broader priors are doing real work. The conditions in the prompt are numbered 1, 2, 3. Both the names and the indices are sitting right there in the context. Mapping WISMO to 1 is information fully available to the model. Attention just does not reliably do that mapping at the position where the final integer has to be predicted.
So it falls back. The model has been trained on every multiple choice question on the internet. In a lot of those, the right answer is 2. The model has a prior on the number 2. With no number in the reasoning to anchor on, the prior wins.
The two kids are not two calls. They are two attention patterns inside one generation run, at different token positions. The first writes a beautiful essay about WISMO. The second, predicting the integer, glances back at the essay, sees no number, and bubbles in 2.
Same kid. Different jobs. Different things in focus.
The reasoning said WISMO. The bubble sheet said 2. WISMO was option 1.
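To make the fallback concrete, here is a toy model of the decision position. It is not a measurement of any real model: the prior logits and the anchor strength are invented numbers, and a real network does nothing this tidy. It only shows the shape of the argument, where a mild prior decides the answer unless the reasoning supplies a number to anchor on.

```python
import math
from typing import Dict, Optional

# Invented numbers for illustration only: a mild learned preference for "2".
PRIOR_LOGITS: Dict[str, float] = {"1": 0.0, "2": 0.6, "3": 0.2}


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Turn logits into probabilities, shifting by the max for stability."""
    top = max(logits.values())
    exps = {token: math.exp(value - top) for token, value in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def decision_distribution(
    anchor: Optional[str] = None, anchor_strength: float = 4.0
) -> Dict[str, float]:
    """Distribution over the integer token, with or without a number in the reasoning."""
    logits = dict(PRIOR_LOGITS)
    if anchor is not None:
        # A number in the reasoning gives that token a large boost (assumed size).
        logits[anchor] += anchor_strength
    return softmax(logits)


def most_likely(distribution: Dict[str, float]) -> str:
    """The token with the highest probability."""
    return max(distribution, key=distribution.get)
```

With no anchor, "2" is the most likely token but "1" and "3" keep real probability mass, so sampled replays land on all three. With "1" present in the reasoning, "1" dominates.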
The fix
Two changes.
One. Require the reasoning to commit to the chosen condition by id, not by name. The prompt now says, in so many words: cite the condition’s id in your reasoning before concluding. Not “this is a WISMO ticket,” but “this is a WISMO ticket (a1f3b2c8) and that is the best fit.” The first kid’s essay now contains the token the second kid is looking for. The bubble matches the essay.
Two. Get rid of integer indexes entirely. The structured output schema changes from selected_index: int to selected_id: str. Each condition is rendered with a short hex token derived from its UUID. WISMO becomes a1f3b2c8. FAQ becomes e9c2d471. Returns becomes c47d09f6. The model has no prior on a1f3b2c8. There is no SAT bias for a1f3b2c8. The second kid cannot guess a token he has never seen before. He has to actually read the conditions and copy one of them.
Same model. Same prompt structure. Different layer of indirection between reasoning and output.
The bug went away.
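Both changes can be sketched together. This is a sketch under assumptions rather than our actual implementation: the short id is taken to be the first 8 hex characters of the UUID, the model is assumed to return JSON with `reasoning` and `selected_id` fields, and `Condition`, `render_conditions`, and `resolve` are hypothetical names. How the "none fit" case is encoded is left out.

```python
import json
import re
import uuid
from dataclasses import dataclass, field
from typing import Sequence


@dataclass(frozen=True)
class Condition:
    name: str
    description: str
    uid: uuid.UUID = field(default_factory=uuid.uuid4)

    @property
    def short_id(self) -> str:
        # Assumed derivation: the first 8 hex characters of the UUID.
        return self.uid.hex[:8]


def render_conditions(conditions: Sequence[Condition]) -> str:
    """List each condition under its short id instead of an integer index."""
    return "\n".join(f"- {c.short_id} {c.name}: {c.description}" for c in conditions)


def resolve(raw_output: str, conditions: Sequence[Condition]) -> Condition:
    """Parse the model's JSON and return the condition it committed to."""
    payload = json.loads(raw_output)
    by_id = {c.short_id: c for c in conditions}
    selected = payload["selected_id"]
    # Change two: the output is an id string, and it must be one we rendered.
    if selected not in by_id:
        raise ValueError(f"unknown condition id: {selected!r}")
    # Change one: the reasoning must cite the id it ends up selecting.
    tokens = re.findall(r"\b[0-9a-f]{8}\b", payload["reasoning"])
    cited = [token for token in tokens if token in by_id]
    if not cited or cited[-1] != selected:
        raise ValueError("reasoning does not commit to the selected id")
    return by_id[selected]
```

The second check is optional in principle, but it turns a silent disagreement between essay and bubble into an error you can retry or log.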
The thing I keep thinking about
LLMs reason in prose. They output tokens. Those things are connected, but they are not the same thing.
If you let the model talk for a paragraph and then ask it to commit to a single token, the paragraph can say one thing and the token can say another. The model is not lying. It is doing two slightly different predictions, and only one of them is reading the reasoning out loud.
The deeper version of this, which I keep thinking about, is that even the reasoning itself is a translation. Internally, the model is not really working with language. It is working with long vectors of numbers, called activations, and those activations are where the actual cognition lives. The prose you see in the reasoning is a downstream rendering of some of that cognition, back into words.
Anthropic published a paper last week on a technique called natural language autoencoders, which decodes a model’s activations into readable text. The headline finding is that there are things in those activations the model never verbalizes. Beliefs about who the user is. Suspicions that it is being tested. A whole stream of cognition that the spoken-out-loud reasoning leaves on the table. In one example, Opus 4.6 spontaneously started replying in Russian to an English prompt; the NLA revealed that the model had become “fixated on the hypothesis that the user was a non-native English speaker whose first language was really Russian.” That belief was nowhere in the model’s actual output.
And there is yet another layer sitting on top of all this. For many reasoning models in production today, what you see as the reasoning output is not even the raw chain-of-thought. It is a summary of it, produced by a separate pass after the model is done thinking. OpenAI’s reasoning APIs work this way by default. The raw thinking tokens never leave their servers; a summary parameter controls whether you get a concise digest, a detailed one, or nothing at all.
So between the model’s internal state and the paragraph on your screen, there are often three steps. Activations, the cognition in vector space. Raw chain-of-thought, one projection of that cognition into language. A summarization pass, compressing those tokens further. Each step is lossy. By the time you read “Reasoning: this is clearly a WISMO ticket,” the paragraph has been translated and re-compressed at least twice before it got to you.
The wonder is not that the reasoning sometimes disagrees with the final answer. The wonder is that it usually doesn’t.
Anywhere you have a model do free-form reasoning followed by a constrained output, there is a seam. The seam is where the bugs live. Watch the seam.
And remember the reasoning isn’t ground truth either. It is the model’s best attempt to translate its own activations into words. Useful, not the whole story.
And if the constrained output is a number from a small set, watch it harder. The model has opinions about numbers. The numbers have nothing to do with your product.
Building at neople.io.



