
The Mirror That Does Not Know

Updated: Apr. 21

On what the debate about AI reasoning misses — and why Socrates saw it first




1. The finding that should unsettle us

In the summer of 2025, researchers at Apple published a study with a deliberately provocative title: The Illusion of Thinking. They were not making a philosophical claim; they were reporting a measurement. And what they measured is the kind of finding that ought to give pause to anyone who relies on these systems for decisions.


The most advanced reasoning models available today — systems explicitly designed and marketed for their capacity to think — exhibit a striking failure mode. As problem complexity increases beyond a certain threshold, their performance does not gradually degrade. It collapses, often to near zero. And what happens at that point is more revealing than the collapse itself. Instead of allocating more effort, the models reduce it. They generate fewer reasoning steps, even when additional computational budget remains available. They do not try harder when problems become harder. They simply continue, without escalation, without any indication that something has gone wrong.


What makes this unsettling is not the failure. Humans fail at complex problems too. What unsettles is the combination of two things: the system does not register its own failure, and it continues to produce outputs with the same fluent confidence it had before. The surface is identical. The substance is gone. A human expert reaching the edge of her competence typically shows it — she hesitates, qualifies, flags the uncertainty. These systems do not. They speak with the same poise inside the collapse zone as outside it.

The best human impostor could not perform this more convincingly. The difference is that the impostor knows she is pretending. The system does not.

And that is where the real question begins.


2. What reasoning has historically meant

Two and a half thousand years ago, Socrates formulated a sentence that still defines the Western understanding of reasoning: "I know that I know nothing." This is often read as modesty. It is not. It is a precise cognitive move. Socrates separates what he takes himself to know from what he can actually justify, and in doing so, he introduces a second layer into reasoning — the ability to observe one's own thinking and to question its validity. This is the founding gesture of Western philosophy, and the capacity it names has never left the concept.


Aristotle formalized it in his account of human beings as reason-giving creatures, capable of defending their claims and examining the claims of others. Immanuel Kant, two thousand years later, placed the same capacity at the center of autonomy: to reason, for Kant, is to examine the grounds of one's own judgments rather than merely follow them.


Three traditions, one shared structure. Reasoning, as they understood it, involves three elements that must come together.

  1. The production of inferences — drawing conclusions from premises, extending a line of thought.

  2. The examination of those inferences — stepping back to ask whether the conclusion is warranted, whether the grounds hold, whether the argument should be revised.

  3. Accountability — the ability to give reasons in a form that others can challenge, and to revise when the challenge lands.


Only in the twentieth century did this understanding narrow. With Alan Turing and the computational theory of mind, reasoning came to be modeled as rule-based symbol manipulation. The reflective layer receded, and what remained was input, processing, output. This is the concept of reasoning that contemporary AI systems inherit — and, as it turns out, the one they can approximate.


3. What language models actually do

A language model generates text sequentially. Each new token is conditioned on everything that precedes it, shaped by statistical patterns learned across vast amounts of data. When such a system produces what appears to be reasoning, it is not operating with an internal workspace in which intermediate results are stabilized, tested, or discarded. It is extending a sequence in which earlier steps remain active, continuing to influence what comes next.
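
To make the mechanism concrete, here is a minimal sketch of that generation loop in Python. Everything in it is an invented stand-in: the next_token function, its tiny vocabulary, and the seeding trick merely imitate the shape of a trained network, whose real work is scoring candidate tokens against the entire prefix.

  import random

  def next_token(prefix: list[str]) -> str:
      # Stand-in for a trained model: a real system runs a neural network
      # that scores every candidate next token given the entire prefix.
      random.seed(" ".join(prefix))  # toy determinism: output depends only on the prefix
      return random.choice(["so", "the", "answer", "is", "41", "42", "done"])

  def generate(prompt: list[str], max_tokens: int = 20) -> list[str]:
      sequence = list(prompt)
      for _ in range(max_tokens):
          token = next_token(sequence)  # conditioned on everything generated so far
          sequence.append(token)        # appended; never inspected, revised, or removed
          if token == "done":           # generation simply stops; nothing checks the result
              break
      return sequence

  print(" ".join(generate(["Let", "us", "reason", "step", "by", "step", ":"])))

Notice what the loop lacks: no branch re-reads the sequence, scores it, and rejects it. Whatever is appended stays.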


This has consequences. Early errors are not explicitly marked or removed; they persist in the context and shape subsequent steps. As the chain grows longer, the signal becomes less stable. Coherence is maintained, but correctness is not guaranteed.


The system can generate phrases that resemble reflection — let me reconsider, this might be wrong — but these are learned linguistic patterns, not internal acts of evaluation. The appearance of self-correction is produced. The mechanism is absent.


4. What this is, then

So what is this, if not reasoning in the full sense?

The most precise word is simulation. The developers of these systems had no reference point other than human cognition, and they built models that approximate its visible form — the inferences, the explanations, the dialogic exchange. This is not a pejorative description. Simulations are powerful tools. We simulate climate systems, fluid dynamics, protein folding, and the results are often remarkable within the bounds of what the simulation captures.


But a simulation is not identical with what it simulates. A climate model is not climate. A simulation of reasoning is not reasoning in the sense the philosophical tradition identified. And when we look at the three elements the tradition bundled together, the gap becomes precise.


Language models perform the first element — the production of inferences — remarkably well. They extend arguments, draw analogies, complete chains of thought with fluency.

They perform the third element — dialogic accountability — convincingly on the surface; they answer questions, defend claims, adjust when pressed.

What they do not perform is the second element. No process within the system examines the system's own output. There is no operation of stepping back, no moment of checking the grounds, no internal revision independent of the production itself.


Without the second element, the first and third collapse into something thinner. An inference without self-examination is an assertion. A dialogue without the possibility of genuine revision is rhetoric, not reasoning. Calling this reasoning in the full sense is a category collapse — treating the simulation as identical with the thing simulated, because the surface is the only aspect that can be measured from outside.


5. The uncomfortable mirror

At this point, an honest complication arises. Humans are not reliably self-aware reasoners either. We overestimate our knowledge, we rationalize, we are prone to systematic biases. Cognitive psychology has documented the Dunning-Kruger effect, confirmation bias, and motivated reasoning for decades. In many situations, we fail to recognize the limits of our own understanding.



This matters, because it weakens any simple contrast. If humans frequently misjudge their own reasoning, the difference between human and machine cannot lie in consistent performance. Socrates is not the norm among humans. He is the exception — the demonstration that a particular mode of reasoning is possible at all.

But the distinction does not disappear. It becomes more precise.


6. Where the difference still lies

The difference is not that humans always succeed at self-observation. It is that they can, in principle, do so. A human reasoner can reach a point at which she no longer trusts her own argument — where she re-examines her assumptions, discards parts of her reasoning, and resets the process. This capacity is neither automatic nor universal, but it is inherent to human cognition. It can fail. It can be suppressed. It can be trained or neglected. But it remains available in principle.


A language model has no such capacity. And this is not a marginal limitation but a categorical one. It does not prevent these systems from being useful — as simulations of reasoning, they are remarkably effective within certain bounds. But they cannot determine whether the reasoning they produce should still be trusted. And that becomes decisive precisely where complexity increases and judgment matters most.


7. What follows

"I know that I know nothing." A language model can simulate this sentence. It cannot enact it. What the simulation cannot contain is what the tradition has called, variously, soul, subjectivity, lived experience — a body that tires and will die, fear that makes one hesitate, empathy that registers another's suffering as a signal rather than a pattern.

These are not decorations on reasoning. They are the ground from which human judgment emerges. A simulation has none of this. It has statistics about texts in which humans described these things.


The question that dominates the field today — how well these systems perform, how far they scale, whether they approximate or exceed human intelligence in specific domains — is a question about output. The more consequential question is about the absence of an inner observer. Not whether the simulation produces reasoning, but whether anything inside it can register when its reasoning breaks down. And this becomes concrete wherever these systems are already making recommendations: in hiring, in medical triage, in credit decisions, in legal research, in policy analysis.


Each deployment rests on an implicit assumption that the system knows, at least approximately, when it is reliable and when it is not. The assumption is false. The surface gives no signal of the difference between competence and its collapse. And as the systems become more fluent, the temptation to trust the surface grows.


A simulation that cannot recognize its own limits cannot hold responsibility. That burden remains with those who deploy it, and it becomes heavier, not lighter, as the simulation becomes more convincing. Regulation has begun to acknowledge this — the European AI Act places obligations on the deployer, not only on the developer. But the obligation remains abstract until the humans inside these decisions become competent to judge what the system can and cannot do. That competence is not technical. It is the capacity to recognize where judgment must remain human, and to resist the temptation to outsource it to a surface that looks competent.


What we are deploying is not a new kind of reasoner. It is a simulation of reasoning — powerful, often convincing, sometimes useful, occasionally insightful. It reflects reasoning back at us. But nothing inside it knows that it is reflecting. And in the situations where judgment matters most, that absence is the only thing that matters.



This essay draws on The Strange Origin of AI's "Reasoning" Abilities by Alex Reisner, The Atlantic; and on The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity by Shojaee, Mirzadeh, Alizadeh et al., Apple (2025).
