Language models produce remarkably coherent outputs. Whether they “think” remains contested. The gap between appearance and reality tells us something important about both the systems and about cognition itself.
When a language model explains a concept, solves a problem, or generates creative text, it creates an output that resembles thought. But resemblance isn’t identity. Understanding the difference illuminates both what these systems do and what cognition might require.
the appearance of thought
Modern LLMs exhibit behaviors that, in humans, we’d attribute to thinking:
- They reason through multi-step problems
- They consider context and adjust responses
- They generate novel combinations of ideas
- They correct errors when prompted
- They explain their apparent reasoning
These behaviors emerge from pattern matching at scale—but dismissing them as “just” pattern matching may underestimate what pattern matching can achieve.
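A toy illustration of the point, under obvious simplifying assumptions (the `train` and `generate` names are invented for this sketch): even a minimal bigram model produces locally plausible continuations purely by matching patterns seen in its training text. The interesting question is what happens when the same mechanism is scaled up by many orders of magnitude.

```python
import random
from collections import defaultdict

def train(text):
    """Build a bigram table: each word maps to the words that followed it."""
    table = defaultdict(list)
    words = text.split()
    for a, b in zip(words, words[1:]):
        table[a].append(b)
    return table

def generate(table, start, n=8, seed=0):
    """Emit n words by repeatedly sampling a previously seen successor:
    pure pattern matching, no model of meaning anywhere."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(n):
        successors = table.get(out[-1])
        if not successors:
            break
        out.append(rng.choice(successors))
    return " ".join(out)

corpus = ("the model matches patterns . the model predicts words . "
          "the patterns predict words .")
table = train(corpus)
print(generate(table, "the"))
```

Every adjacent word pair in the output occurs somewhere in the corpus, so the text is locally fluent even though nothing resembling understanding is involved.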
where the gap appears
The differences become visible at the edges:
- Consistency failures: The same model gives contradictory answers to logically equivalent questions
- Brittleness: Small input changes, such as rephrasing a question, can produce dramatically different outputs
- Confabulation: Models generate plausible-sounding but fabricated information with high confidence
- No persistent state: Each conversation starts fresh; learning requires retraining
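The last point can be made concrete with a sketch (the `StatelessModel` and `Conversation` names are invented here): a stateless model sees only the text it is handed on each call, so any appearance of memory has to be engineered around it by resending the accumulated transcript, which is roughly how chat interfaces are built.

```python
class StatelessModel:
    """A stand-in for an LLM: it can only react to the prompt it receives."""
    def reply(self, prompt):
        # Toy behavior: report whether the name "Ada" appears anywhere in
        # the prompt. A real model likewise sees only this one string.
        return "I know Ada." if "Ada" in prompt else "Who?"

class Conversation:
    """Simulated memory: prepend the full transcript to every new message."""
    def __init__(self, model):
        self.model = model
        self.history = []

    def send(self, message):
        self.history.append(message)
        # The model never remembers; the wrapper resupplies everything.
        return self.model.reply("\n".join(self.history))

model = StatelessModel()
# Two bare calls: nothing carries over between them.
print(model.reply("My name is Ada."))   # prints "I know Ada."
print(model.reply("What is my name?"))  # prints "Who?"

# Wrapped in a conversation, the second turn "remembers" the first
# only because the whole history is resent.
chat = Conversation(model)
chat.send("My name is Ada.")
print(chat.send("What is my name?"))    # prints "I know Ada."
```

The model itself never changes; the continuity lives entirely in the wrapper, which is why learning anything durable requires retraining rather than conversation.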
what the gap reveals
The gap between LLM behavior and human thought might not indicate that LLMs fall short of thinking—it might indicate that our concepts of “thinking” need refinement.
Human cognition also involves pattern matching, inconsistency, confabulation, and context-dependence. We just have additional mechanisms: embodiment, continuity of experience, social embedding, and perhaps something we don’t yet understand.
The question isn’t whether LLMs think like humans. It’s what minimal additional capabilities would be required to close the functional gap—and whether those capabilities emerge from scale, require new architectures, or demand something else entirely.