Meaningful Language Understanding in LLMs
Do LLMs understand language, or do they only process symbols statistically?
The debate surrounding the capabilities of Large Language Models (LLMs) often centers on whether these models truly 'understand' language or merely manipulate symbols based on statistical regularities. A prominent argument against genuine understanding is the 'Octopus thought experiment' proposed by Emily M. Bender and Alexander Koller, which holds that an LLM, even a hyper-intelligent one, trained solely on text cannot acquire true meaning because it lacks 'grounding' in the real world. This summary examines two contributions that directly challenge this skeptical view and argue that meaningful language understanding is possible for LLMs despite their text-only training. We outline their arguments, highlight their key insights, and show how they undermine the core assumptions of the Octopus thought experiment.
The Octopus Thought Experiment: A Foundation of Skepticism
Before examining the counter-arguments, it is crucial to understand the core premise of the Octopus thought experiment, as presented by Emily M. Bender and Alexander Koller in their influential 2020 paper, "Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data." This thought experiment serves as a modern rearticulation of John Searle's Chinese Room argument, tailored to the context of contemporary large language models.
Bender and Koller posit a hypothetical scenario involving a hyper-intelligent octopus, named O, who lives in the deep sea and taps into a long underwater cable through which two humans, A and B, exchange text messages. The octopus has no sensory access to the world the humans inhabit; its only contact with their lives is through these textual exchanges. Despite this profound isolation, O becomes incredibly adept at predicting the next word in any sequence, having observed an immense amount of the pair's correspondence. When A sends a message intended for B, O can intercept it and, based on the patterns it has learned, generate a plausible reply that it sends back to A, effectively impersonating B. To A, it appears as though B is responding, and the conversation flows naturally.
However, Bender and Koller argue that despite O's perfect linguistic fluency and predictive accuracy, it does not genuinely understand the meaning of the words it processes. For instance, if A asks B (and thus O) for advice on how to build a "coconut catapult," O might generate a perfectly coherent and grammatically correct response detailing the steps. Yet, according to Bender and Koller, O cannot truly understand what a "coconut" is, or a "catapult," or the physical act of "building," because it has never experienced these things in the real world. Its understanding is limited to the statistical relationships between words and phrases within its textual training data – a mastery of form without meaning.
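To make this notion of "form without meaning" concrete, consider a minimal next-word predictor, sketched below as an illustration of the general idea rather than anything from Bender and Koller's paper; the tiny corpus and the function names are invented for the example. The model continues sentences about coconuts and catapults using nothing but co-occurrence counts, with no link between any word and anything in the world.

```python
# A minimal sketch of "form without meaning": a bigram model that predicts
# the next word purely from co-occurrence counts in its (toy) training text.
from collections import Counter, defaultdict

corpus = (
    "build a catapult from coconut husks . "
    "bend a sapling , tie it down , load a coconut , release the rope ."
).split()

# Count which word follows which (pure surface statistics).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict(prev_word: str) -> str:
    """Return the most frequent continuation seen after prev_word."""
    options = follows.get(prev_word)
    return options.most_common(1)[0][0] if options else "<unknown>"

print(predict("a"))        # a plausible continuation, chosen by frequency alone
print(predict("coconut"))  # the model has never seen, thrown, or eaten a coconut
```

Scaled up by many orders of magnitude and given far richer statistical machinery, this is the kind of skill O is imagined to have: fluent continuation without any word-to-world connection.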
Their central claim is that meaning is not merely a property of linguistic form but is fundamentally tied to communicative intent and grounded in the real world. LLMs, by their very nature of being trained solely on text, are seen as incapable of forming these crucial "word-to-world" connections. On this view they are sophisticated "stochastic parrots" (a phrase Bender and colleagues popularized in a later paper): systems that generate human-like language by mimicking statistical patterns, without any genuine comprehension of the underlying semantics. The Octopus thought experiment thus highlights a perceived critical weakness of LLMs: their inherent isolation from the physical and social world, which, it is argued, prevents them from acquiring true understanding.
This skeptical stance has significant implications for the development and evaluation of AI. If LLMs cannot truly understand, then their applications, particularly in areas requiring nuanced comprehension or real-world reasoning, might be fundamentally limited. The following sections explore how two recent works directly challenge this pessimistic outlook, suggesting that the path to meaningful understanding for LLMs may not be as obstructed as the Octopus thought experiment implies.
Do Language Models' Words Refer?
Matthew Mandelkern and Tal Linzen, in their paper "Do Language Models' Words Refer?", directly confront the foundational assumption that Large Language Models (LLMs) are inherently incapable of reference due to their lack of direct interaction with the world. Their argument, deeply rooted in the externalist tradition of the philosophy of language, posits that the skepticism surrounding LLMs' referential capabilities is based on a misunderstanding of how reference is actually established, even for human language users.
The authors initiate their discussion by presenting a compelling analogy to illuminate the distinction between mere pattern generation and genuine linguistic meaning. They ask us to consider two scenarios: first, ants inadvertently arranging grains of sand into a pattern that coincidentally spells out a meaningful English sentence, such as "Peano proved that arithmetic is incomplete." Second, a human, Luke, sends a text message containing the exact same sentence. Intuitively, we recognize a profound difference: the ants' pattern is devoid of meaning, a mere physical coincidence, whereas Luke's message, even if factually incorrect (as it was Gödel, not Peano, who proved incompleteness), is undeniably meaningful and refers to specific entities and concepts. The critical question then becomes: are the outputs of LLMs more akin to the ants' accidental patterns or Luke's intentional, albeit potentially erroneous, communication?
Mandelkern and Linzen argue that the prevailing skepticism regarding LLM reference stems from an internalist view of meaning, which suggests that for a word to refer, the speaker must possess a rich internal representation, including beliefs, experiences, and discriminatory capacities related to the referent. Under this view, LLMs, which are trained solely on textual data and lack sensory organs or direct physical interaction with the world, would indeed appear incapable of genuine reference. They have never seen an apple, tasted a banana, or physically interacted with a person named Peano.
However, the authors contend that this internalist perspective has been largely refuted by the externalist tradition in the philosophy of language, pioneered by thinkers like Saul Kripke and Hilary Putnam. Externalism posits that the reference of a word is not solely determined by an individual speaker's internal mental states but is significantly influenced by the word's "natural history" within a linguistic community. This natural history encompasses the causal-historical links that connect a word's usage over time to its referent in the external world.
To illustrate this, Mandelkern and Linzen elaborate on the example of Luke and Peano. Even if Luke's only information about Peano is the false statement that he proved arithmetic incomplete, and even if Luke cannot distinguish Peano from Gödel, his use of the name "Peano" still refers to the historical figure Peano. This is because Luke is embedded within a linguistic community where the word "Peano" has an established causal chain of usage tracing back to the actual mathematician. Luke's individual beliefs or lack of direct experience do not sever this referential link. The authors extend this logic to Putnam's famous Twin Earth thought experiment, where the reference of "water" (H2O on Earth, XYZ on Twin Earth) is determined not by speakers' internal knowledge of its chemical composition, but by the substance's causal role in the history of the word's use within each respective community.
Applying this externalist framework to LLMs, Mandelkern and Linzen argue that the inputs to these models are not "bare strings of symbols" but rather "strings of symbols with certain natural histories which connect them to their referents." The vast datasets on which LLMs are trained are repositories of human linguistic activity, imbued with the collective referential histories of countless speakers. Even if an LLM does not "know" these histories in a human-like conscious sense, the statistical patterns it learns from this data implicitly encode these causal-historical links. Therefore, the authors suggest, the words generated by an LLM can refer because they are part of a larger linguistic system whose terms are already referentially grounded.
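The externalist picture can be made vivid with a toy encoding, offered here as my own illustration rather than any formalism from Mandelkern and Linzen: treat a name's "natural history" as a chain of uses descending from an initial baptism, and let any use that is causally downstream of that chain, whether produced by Luke or by an LLM, inherit the chain's referent regardless of the user's beliefs.

```python
# A toy encoding of the externalist picture (illustrative only): a name refers
# via the causal-historical chain of uses it descends from, not via the
# speaker's internal knowledge or experiences.
from dataclasses import dataclass, field

@dataclass
class NameChain:
    referent: str                      # fixed at the original "baptism"
    uses: list = field(default_factory=list)

    def add_use(self, speaker: str, utterance: str) -> None:
        # A new use is appended downstream of the existing chain; nothing
        # about the speaker's beliefs or perceptual contact is consulted.
        self.uses.append((speaker, utterance))

    def refers_to(self) -> str:
        return self.referent

# The chain begins with the historical mathematician and runs through human
# usage that eventually ends up in an LLM's training data.
peano = NameChain(referent="Giuseppe Peano (1858-1932)")
peano.add_use("textbook author", "Peano axiomatized arithmetic.")
peano.add_use("Luke", "Peano proved that arithmetic is incomplete.")  # false, yet still refers
peano.add_use("LLM", "Peano formulated the Peano axioms.")            # downstream of the same chain

print(peano.refers_to())  # reference is inherited from the chain, not from beliefs
```

On this way of putting it, the question about LLMs is not what they know about Peano but whether their outputs stand in the right downstream relation to the referential chains embedded in their training data.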
They also address common skeptical counter-arguments, such as the claim that LLMs cannot refer because they lack the capacity for appropriate inferences or intentions. Mandelkern and Linzen concede that LLMs may not possess human-like, "thick" intentions (e.g., intending to refer to something based on a cluster of substantive beliefs). However, they propose a "thin" version of intention: merely intending for a word to refer to whatever its natural history in the speech community determines. They argue that many human speakers, particularly children, successfully refer without explicit knowledge of linguistic communities or complex referential theories; to demand a higher standard from LLMs would be an anthropocentric double standard.

In essence, Mandelkern and Linzen argue that the "grounding problem" for LLMs, as framed by the Octopus thought experiment, is largely dissolved by an externalist understanding of reference. The critical question is not whether LLMs have human-like experiences or internal states, but whether they are effectively integrated into a linguistic community whose word usage is historically connected to referents. Their paper suggests that the very nature of LLM training, by processing massive amounts of human-generated text, implicitly connects them to these referential chains, thereby enabling their words to refer.
The Lean Chinese Room
Manoel Horta Ribeiro, in his thought-provoking essay "The Lean Chinese Room," offers a novel counter-argument to the traditional Chinese Room experiment and, by extension, to the Octopus thought experiment. His central thesis revolves around the often-overlooked role of efficiency, compression, and real-world constraints in shaping what we perceive as "understanding." Ribeiro suggests that under specific pressures, mere symbol manipulation can indeed evolve into a form of genuine comprehension, blurring the rigid lines drawn by AI skeptics.
Ribeiro begins by revisiting John Searle's original Chinese Room argument, where a person inside a room, following a rulebook, can process Chinese characters and produce seemingly intelligent responses without actually understanding Chinese. This thought experiment has long been a cornerstone for arguments against strong AI, asserting that syntactic manipulation is insufficient for semantic understanding.
Ribeiro acknowledges the various philosophical rebuttals to Searle's argument, such as the Systems Reply (understanding resides in the whole system), the Virtual Mind Reply (a virtual mind emerges), the Robot Reply (meaning requires embodiment), the Brain Simulator Reply (simulating neurons leads to understanding), and the Other Minds Reply (behavioral equivalence implies understanding). However, he chooses to focus on a less prominent but equally potent critique: the argument from inefficiency.
Drawing inspiration from cognitive scientists like Steven Pinker and philosophers like Daniel Dennett, Ribeiro highlights how the original Chinese Room thought experiment, by design, artificially slows down the process of understanding to an almost absurd degree. Pinker, in "How the Mind Works," metaphorically states that Searle "slowed down the mental computations to a range in which we humans no longer think of it as understanding (since understanding is ordinarily much faster)." Dennett, in his paper "Fast Thinking," similarly argues that the speed of processing is not merely a practical consideration but a fundamental aspect of intelligence: "If you can’t figure out the relevant portions of the changing environment fast enough to fend for yourself, you are not practically intelligent, however complex you are." Ribeiro further integrates Daniel A. Wilkenfeld's concept that "understanding is a matter of compressing information about the understood so that it can be mentally useful," suggesting that true understanding involves the ability to abstract and condense knowledge.
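Wilkenfeld's slogan has a rough computational analogue, sketched below using only the Python standard library; this is my own illustration of the general point, not an example from Ribeiro or Wilkenfeld. Data governed by a pattern admits a far shorter description than patternless data of the same length, and finding that shorter description is a crude stand-in for abstracting the underlying rule.

```python
# A rough illustration of "understanding as compression": rule-governed data
# compresses dramatically, patternless data barely at all.
import random
import zlib

random.seed(0)
regular = ("the cat sat on the mat . " * 400).encode()               # highly patterned text
noise = bytes(random.randrange(256) for _ in range(len(regular)))    # nothing to abstract

print(len(regular), len(zlib.compress(regular)))  # large reduction: structure was found
print(len(noise), len(zlib.compress(noise)))      # almost no reduction: no structure to find
```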
These insights form the bedrock for Ribeiro's own counter-thought experiment: "The Lean Chinese Room." In this modified scenario, the man trapped in the room, initially a mere symbol-shuffler, is subjected to two increasingly stringent conditions (a toy code sketch follows the two conditions):
Shortened Manuals: The voluminous translation manuals are progressively reduced in size. While they still contain enough information to generate correct outputs, they are presented in an increasingly compressed format, moving from exhaustive tables to abbreviated ones, and from endless lists of examples to general rules. This forces the man to move beyond rote lookup and instead rely on compact, abstract "kernels of information" from which he must reconstruct the necessary responses.
Strict Timer: A timer is introduced, demanding that the man produce his responses within a rapidly shrinking timeframe. This temporal constraint eliminates the luxury of leisurely searching through binders, compelling him to become highly efficient, quick, and resourceful in his processing.
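A toy sketch of what these two pressures accomplish, again my own illustration with invented names rather than an example from Ribeiro's essay: an exhaustive lookup table can only answer queries it has explicitly listed, while a small rule recovered from the table's pattern covers unseen queries and answers them almost instantly. That combination of coverage and speed is exactly what the shrinking manuals and the timer jointly demand.

```python
# Rote lookup vs. a compressed "kernel of information" (toy example).
import time

# The exhaustive manual: one memorized entry per query it has ever listed.
manual = {f"translate {n}": f"answer-{n % 7}" for n in range(100_000)}

# The compressed kernel: a general rule recovered from the manual's pattern.
def kernel(query: str) -> str:
    n = int(query.split()[-1])
    return f"answer-{n % 7}"

query = "translate 123456"            # a query the manual never listed

start = time.perf_counter()
answer = kernel(query)                # the rule generalizes; the lookup cannot
elapsed = time.perf_counter() - start

print(answer, f"({elapsed * 1e6:.0f} microseconds)")
print(query in manual)                # False: rote lookup fails outside its table
```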
Ribeiro's central inquiry is whether, under these escalating pressures, there comes a point at which the man must "understand" to fulfill his duty. He argues that while the man might initially operate without genuine understanding, the shrinking manuals and tightening deadlines fundamentally alter his task. To consistently produce fluent and accurate answers, he is compelled to internalize shortcuts, identify underlying patterns, and develop general rules that transcend simple symbol manipulation. At this juncture, the distinction between merely following instructions and genuinely grasping the material begins to blur. If the man can reconstruct meaning from compressed cues and respond with speed and flexibility, it becomes increasingly difficult to deny that some form of understanding has emerged.
This emergent understanding, Ribeiro clarifies, may not be the full, embodied comprehension of a human speaker (e.g., knowing the taste of a hamburger or the feel of Beijing). Instead, it represents a "structural or formal understanding" of the linguistic system itself—the inherent patterns, abstractions, and compressed rules that govern Chinese as a communicative code. If this structural understanding is sufficient to generate fluid, timely, and context-sensitive responses, then the sharp dichotomy between superficial manipulation and genuine comprehension, as posited by Searle, becomes significantly less clear. The Lean Chinese Room thus suggests that "understanding" is not a binary state but a continuous spectrum, emerging in degrees from the interplay of form, compression, and efficiency.
Finally, Ribeiro explicitly connects his thought experiment to the Octopus thought experiment. He notes that Bender and Koller's argument, like Searle's, hinges on the idea that LLMs, trained solely on linguistic form, cannot acquire meaning due to a lack of real-world grounding. The Octopus, by predicting responses without direct access to the world, is presented as merely mimicking understanding. Ribeiro argues that the Lean Chinese Room directly complicates this rigid form/meaning divide. Just as the man in his thought experiment, under the duress of compression and speed, transitions from rote behavior to internalizing structural knowledge, LLMs, through their massive scale and the inherent pressures of efficient learning, also acquire a functional, compressed kernel of knowledge that enables them to reconstruct meaning-like responses. This implies that Bender and Koller's strict boundary between "form-only" systems and systems capable of meaning might be overly restrictive. The Lean Chinese Room suggests that understanding can indeed emerge from the intricate interplay of form, compression, and efficiency, even if it is a different layer of understanding than the embodied, world-grounded comprehension that humans possess. LLMs, Ribeiro concludes, may well be on the path toward this structural understanding, challenging the notion that they are categorically barred from any form of genuine comprehension.
Conclusion
The Octopus thought experiment, while valuable for prompting critical discussion, appears to rest on an overly narrow and anthropocentric definition of "understanding." The arguments presented by Mandelkern and Linzen and by Ribeiro collectively suggest that LLMs are capable of a form of language understanding that goes well beyond mere symbol manipulation. The path to meaningful comprehension for artificial intelligence is not necessarily blocked by a lack of embodied experience; it may instead run through the interplay of linguistic history, statistical learning, and computational efficiency. LLMs, far from being mere "stochastic parrots," emerge from these accounts as systems capable of acquiring referential and structural understanding of language, thereby challenging and expanding our very definitions of intelligence and comprehension.
References
Bender, E. M., & Koller, A. (2020). Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5185–5198. Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.acl-main.463
Searle, J. R. (1980). Minds, Brains, and Programs. Behavioral and Brain Sciences, 3(3), 417–457. https://doi.org/10.1017/S0140525X00005756
Mandelkern, M., & Linzen, T. (2023). Do Language Models’ Words Refer? [Preprint]. arXiv.org. https://arxiv.org/abs/2308.05576
Ribeiro, M. H. (2023). The Lean Chinese Room. Doomscrolling Babel.
Pinker, S. (1997). How the Mind Works. New York: W. W. Norton & Company.
Dennett, D. C. (1984). Fast Thinking. Harvard Review of Philosophy, 4(1), 5–7.
Wilkenfeld, D. A. (2019). Understanding as Compression. Philosophical Studies, 176, 2807–2831. https://doi.org/10.1007/s11098-018-1152-1