I would like to provide a definition of intelligence (this is a very personal one, not rigorous or precise enough, but let me put it here to lay the foundation for the discussion):
Intelligence is a nonlinear, emergent phenomenon arising from the processing and transformation of information within a dynamic system. It is not an attribute of any specific entity, but rather a system-level property that may manifest under certain structural and informational conditions.
More specifically:
- Nonlinearity: Intelligence does not follow a predictable, additive path. Small changes in input or structure can lead to disproportionately large or qualitatively different outcomes. Its behavior is often discontinuous and irreducible to linear cause-effect chains.
- Emergent nature: Intelligence is not a fixed trait or a set of predefined functions. It emerges spontaneously from interactions within complex systems. It is not an engineered module, but a phenomenon that can only be observed at the level of system dynamics.
- Substrate independence: Intelligence can, in principle, arise in any representational space, whether physical, digital, abstract, or entirely inaccessible to human perception. Its existence is not contingent on embodiment, biological substrate, or human interpretability. While practical implementations for human use may require accessibility, this is not a requirement for intelligence itself.
- Chaotic characteristics: Intelligence may operate at the edge of chaos (sensitive to perturbations, adaptive but unpredictable, capable of order without being strictly ordered). It is characterized by patterns that are stable yet never static.
- Information processing: At its core, intelligence involves the absorption, transformation, and recombination of information. However, not all information processing is intelligent; what distinguishes intelligence is that a system’s behavior reflects context-sensitive adaptation, internal consistency across time, and the capacity to generalize across domains.
1.
Before offering any critique, I’d like to briefly restate my understanding of the author’s key claims to ensure we’re aligned on the foundation of the discussion XD
As I read it, the author argues that:
i) Building intelligent systems around internal symbolic world models is a dead end for many real-world tasks. Instead, systems should treat the world as its own model.
ii) Representations are the wrong unit of abstraction for intelligence. A central, interpretable representational database is unnecessary.
iii) The solution lies in building intelligence incrementally, always in the form of complete, situated agents (what he refers to as “creatures”) interacting directly with the real world. Intelligence should be decomposed by activity rather than function, avoiding any central locus or controller, with a strong emphasis on parallelism and real-world testing at every stage.
If I’ve captured the main ideas accurately, I want to begin by saying that I found the paper to be clearly written and genuinely thought-provoking. It offered a strong and timely critique of the dominant trends in 1980s AI research, particularly the heavy reliance on human-engineered symbolic representations. I also very much agree with the view that intelligent systems do not operate through (or at least are not limited by) explicit, semantically or humanly interpretable symbols or rules, and that any notion of “meaning” arises from how external observers interpret the system’s dynamics.
That said, I would also like to raise two points of critique.
i) While the paper was forward-thinking for its time, many of its arguments now feel less relevant. The shortcomings of symbolic AI are no longer contentious. These insights have largely been absorbed into the mainstream, and the critiques that were once radical now feel foundational.
ii) I personally find the paper’s treatment of “intelligence” both ambiguous and somewhat constrained. The term is notoriously difficult to pin down, and in this case, it is not clearly defined before the author begins building solutions around it. I also have disagreements with the proposed solution.
On a related note, I’m also reminded of a point from one of our earlier conversations, where you shared your experience witnessing multiple cycles of AI springs and winters. You expressed some skepticism about the current excitement around deep learning and AI development, viewing it as another iteration of a familiar cycle. I can see where this perspective comes from, and I also believe the field is somewhat overhyped. Still, I hold a different view. I believe this period may mark a more meaningful shift, and I’d be glad to share some of my thoughts on why that might be the case.
In what follows, I’ll focus more specifically on the areas where my perspective diverges from the author’s.
———-
2.
“We must incrementally build up the capabilities of intelligent systems, having complete systems at each step of the way and thus automatically ensure that the pieces and their interfaces are valid.”
(This statement, as well as the part of his proposed solution that relates to it.)
This approach is, in many ways, more grounded and thoughtful than trying to decompose a conscious or intelligent entity into functional modules and then assemble them. That method often fails to capture the deeply entangled, non-linear nature of intelligent behavior.
However, I find this proposed alternative also rests on problematic assumptions. In particular, framing intelligence as something that can be built “incrementally”, with “complete systems at each step”, risks importing a software-engineering mindset (or, really, any traditional human-engineering mindset) into a domain where it may not apply. Intelligence is not a linear construct. It is nonlinear, often discontinuous, exponential, complex, and in many ways fundamentally anti-reductionist. The phrasing here already reflects a bias towards rationalization and human-style system design. In my opinion, we should not design or create intelligence as if we were building an operating system or developing software.
While I do see the appeal of the parallelism suggested in the author’s solution, I remain skeptical that intelligence can be constructed this way. To break it down into modular, human-designed subcomponents (whether by function, behavior, or any other human-labeled, rationalized category) is to impose a structure that may not reflect the phenomenon itself. That, in my view, risks mistaking engineering convenience for conceptual truth.
“At each step we should build complete intelligent systems that we let loose in the real world with real sensing and real actions.”
(This statement as well as the proposed solution it relates to)
The word choice of “complete intelligent system” here raises an immediate concern. First, it is too vague. In a paper that otherwise strives for clarity and specificity, this concept remains undefined. Second, and more fundamentally, I’m not sure such a thing exists, or can exist, in any meaningful or measurable way.
It might be more precise to speak of “functional” systems rather than “complete” ones. Completeness, in this context, seems not only unachievable but perhaps even conceptually incoherent. Human cognition itself is riddled with blind spots, perceptual distortions, and structural limitations (as the author himself points out later in the paper). If we take our own minds as a reference point, we are far from complete. We are bounded by what we can perceive, abstract, and model. Any system we build will inherit similar constraints, likely more severe ones.
To the extent that real-world interaction is necessary for building intelligent systems, I agree. However, I would urge caution about the assumptions beneath this claim. The argument seems to conflate intelligence with consciousness and remains tethered to a human-centered model. Our sense of being “complete” agents may have less to do with our intelligence than with the fact that we are conscious beings with agency, narrative continuity, and a subjective experience of self. These are not necessarily ingredients of intelligence itself.
Put differently, the belief that a system must be “complete” to be intelligent may reflect our cognitive biases more than any essential feature of intelligence. There are known unknowns, but vastly more unknown unknowns. Our tendency to view intelligence through our own structure and needs, including the need for a stable self, is both understandable and, I would argue, limiting. Therefore, I remain skeptical of the broader engineering paradigm implied by this view. Intelligence, as I understand it, is not easily (or not at all) decomposable, not neatly specifiable, and likely not engineerable in the traditional sense. The divide-and-conquer approach, as well as the trend toward embodied AI, both seem to rely on assumptions that may be more reflective of our own cognitive architecture than of intelligence as a general phenomenon.
—————————-
3.
Regarding the part about the creature and the autonomous system:
I think that section is beautifully written. I genuinely appreciate the author’s perspective on autonomous systems, particularly the argument that we should not build them by simply replicating or reverse-engineering human beings. The idea that intelligent beings need not resemble humans is both intellectually honest and refreshingly imaginative.
It was at this point that I began to realize a fundamental divergence in our views: we seem to define “intelligence” quite differently. While the author never explicitly defines intelligence as requiring a fully autonomous agent or “creature”, there is an implicit assumption that intelligence and autonomy (or a complete autonomous system) must go hand in hand. I find that somewhat problematic.
I would like to say that intelligence, to me, is not an entity; it is not something that must be embodied, complete, or self-sustaining. I would define intelligence as a property; specifically, an emergent phenomenon that can arise within any system, regardless of whether that system exists in our physical world. It is therefore substrate-independent and can emerge in any dimensionality, any vector space, or even in domains inaccessible to human cognition. I will elaborate on that in my last section.
Take large language models, for instance (they aren’t my area of research, but since everyone knows them they make a convenient example). I view them as exhibiting intelligence within a non-human, non-physical space; specifically, the vector space of the word embeddings. A well-known example illustrates this: if you take the vector for “king,” subtract “man,” and add “woman,” the result is very close to the vector for “queen.” What’s striking is that no human programmed these relationships; the model learns them entirely from data (or, as I would rather call it, information) on its own.
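To make this concrete, here is a minimal sketch of the analogy arithmetic described above. It assumes gensim and its downloader are available and uses the small pretrained GloVe vectors as a stand-in for a model’s embedding space; the specific model name and any similarity scores are illustrative assumptions, not part of the original argument.

```python
# Minimal sketch: word-vector analogy arithmetic (assumes gensim is installed
# and the small pretrained GloVe vectors can be downloaded).
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # 50-dimensional GloVe embeddings

# king - man + woman: the nearest remaining vector is typically "queen".
# No one hand-coded this relation; it falls out of the learned geometry.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```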
Understanding language, I suppose, actually means understanding the relationships between its constituent parts and their affordances. What we call “understanding” in humans manifests as the firing of neurons and stable patterns of connectivity in the brain. In deep learning, that same kind of relational understanding appears in vector space. I would argue that understanding is not a special privilege of biology, but a property that emerges wherever sufficient complexity and structure exist. More importantly, understanding cannot be reduced to a list of functions or measured on a single axis. It is an emergent behavior. That is also why I believe LLMs do exhibit a kind of understanding, even though it is clearly not human understanding.
Transformers are often referred to as black boxes. Even the people who designed and invented them cannot fully explain why they work, which is a significant departure from how engineering traditionally works. When we built machines, we understood every component. We divided and conquered, reducing the system to its parts and explaining the function of each; the whole was merely the sum of its parts. In that paradigm, reductionism is both valid and sufficient. That paradigm no longer applies to transformers: their architecture was not derived from first principles or from any human understanding of cognition or language itself, but rather from countless experiments and trial and error by researchers. Even now, with all the work done on interpreting LLMs and neural networks, we have only begun to scratch the surface. These models behave in ways that defy simple reduction, and in my humble opinion what they exhibit is clearly emergent rather than engineered.
Many researchers (including my coauthor) studying models like BERT or GPT would dismiss them as doing nothing more than large-scale pattern matching; and at the most granular level, that remains true. I understand why they think that way, because ultimately it is nothing more than the neurons of the network collectively recognizing patterns, applying nonlinearities, and iterating over massive amounts of data (information). No magic. But if we try to understand these models solely at that level, I think we miss the point. It is like trying to understand human thought by analyzing individual neuron firings. To an alien observer, the human brain may appear to be a network of cells triggered by stimuli. If all they saw were neurons flashing and forming patterns with a bit of nonlinearity, they might well conclude that there is no actual consciousness or intelligence or understanding of any sort at all, just stimulus and the corresponding reaction (another form of pattern matching). But I think most homo sapiens who are not philosophers would argue the other way around.
This also brings me to the distinction between intelligent systems and conscious systems, something I believe this article fails to address with clarity. I fully agree with the author that intelligent systems may not resemble us. Their structure, understanding, and behavior may diverge entirely from human cognition, and we should not expect them to mirror us. However, despite this acknowledgment, the paper tends to use terms like “intelligent system,” “autonomous system,” and “complete system” interchangeably. This creates an implicit conflation between intelligence, autonomy, and consciousness that I find conceptually limiting.
I would like to offer possible short definitions of these words first. When I refer to intelligence, I mean the capacity to model, infer, relate, and generalize. It does not require self-awareness, agency, or autonomy. Consciousness, on the other hand, implies subjective experience, a sense of self, and a capacity for reflection, self-preservation, and agency. 1) If we aim to build systems that are conscious in a human-like sense, then autonomy becomes essential. Such systems would need to perceive the world, form a sense of self, and act in ways tied to agency, more like a being or creature. 2) But if our goal is to build intelligent systems, capable of understanding, learning, or performing complex tasks, then autonomy might not be a requirement; they could be passive, embedded, virtual, even entirely disembodied. 3) And if the ultimate goal is to create systems that are both autonomous and intelligent (which is very exciting), for example an intelligent system that consistently and proactively asks itself questions and solves them while coexisting in the physical world, then we should expect them to think and behave very differently from us. Their architecture, their perceptions, and their mode of understanding will almost certainly not mirror human cognition. Nor should we expect them to, given that homo sapiens is an extreme case of evolution whose ultimate goal was lasting long enough to exist (resisting the arrow of entropy through survival and reproduction) rather than intelligence.
In this sense, while I agree with the author’s caution against anthropomorphic assumptions, I think the conflation between intelligence, autonomy, and consciousness (he never uses the last word, but as I read the paper I constantly felt it was implied) remains unresolved in the framing. That, to me, is the most significant shortcoming of an otherwise thoughtful and insightful piece.
——— In what follows I would like to share my thoughts on intelligence and on why the current AI spring is different from the past ones. It could be totally wrong, and it is pretty long, so only read it when you are not busy —————-
I believe that “understanding meaning” is the starting point of intelligence. But I want to emphasize that “understanding”, “intelligence,” and “consciousness” are all emergent phenomena, and none of them necessarily implies or subsumes the others. They may co-occur, but they are not hierarchically ordered. This perspective aligns with the working definition I gave earlier: intelligence is a nonlinear, emergent process, not an engineered function or property of a particular entity.
From this standpoint, whenever we observe emergent properties arising from models, entities, or neural networks, we are already justified in expecting the potential for further emergent properties to follow. That is, the emergence of one form of intelligence becomes a starting point, not an endpoint, for the creation or discovery of others. Importantly, I use the word “create” here intentionally. I do not believe intelligence can be designed or engineered in the conventional sense. It must be created, or perhaps discovered, through systems that themselves give rise to phenomena we cannot fully predict or predefine.
This is also why I find the term AGI limiting, and somewhat anthropocentric. If we think of intelligence as a spectrum, as Nick Bostrom suggests in Superintelligence, then human-level intelligence occupies only a narrow band. When we eventually create a form of intelligence that can improve itself (a system that is not necessarily physically autonomous, but computationally self-developing) we may find that the threshold of AGI is quickly passed. What lies beyond is not just an extension of human capability, but something fundamentally different: the domain of superintelligence, a space that is vastly broader and more unpredictable than the human cognitive range.
For this reason, I believe any serious attempt to build intelligence should not treat human-like generality as the final goal. Superintelligence, not AGI, is the only meaningful and imaginative target. To build something that merely mirrors us risks underestimating the full scope of what intelligence could be.
This belief, that intelligence is emergent, and not reducible to components we can design, also shapes my view of the current moment in AI. I’ve previously discussed the black-box nature of transformers and the unexpected behaviors that arise within large-scale deep learning systems. To me, these are not bugs or byproducts. They are signals. We are seeing, for the first time, a technological emergence that aligns with the theoretical structure of intelligence as I understand it. That is what sets this AI spring apart from the previous ones.
In past cycles, design and understanding preceded system behavior. Researchers engineered components and knew how they worked. Today, that sequence has inverted. We are building systems whose behavior we increasingly observe after the fact. Interpretability research is not an add-on, it has become central because we no longer fully understand what we have built. This shift marks a paradigm break, not just another technological wave. We are no longer simply programming intelligence; we are interacting with its emergence.
This point becomes especially vivid when we look at how deep learning has been applied to meteorology. Weather systems are classic examples of chaotic systems: they are highly sensitive to initial conditions, filled with nonlinear interactions, and governed by emergent phenomena. For decades, researchers attempted to predict weather using physics-based models grounded in well-established laws. But even with increasingly sophisticated simulations, these models consistently failed to surpass certain thresholds of predictive accuracy.
It was only when researchers began to abandon handcrafted physical models and instead fed raw atmospheric data into deep learning systems that a breakthrough occurred. Neural networks, without any prior knowledge of physics, were able to learn patterns and dynamics that outperformed traditional models. This is not just a technical success. It is a paradigm shift: a data-driven model (or, as I would rather call it, an information-driven model), operating without an explicit understanding of the underlying rules, proved more effective at capturing the emergent behavior of a chaotic system than models built on human-designed equations.
It should be noted, however, that although meteorology is a chaotic system that exhibits sensitivity and unpredictability, this unpredictability is fundamentally different from the uncertainty principle found in quantum mechanics. In weather systems, everything is, at least in principle, deterministic. The micro-components, such as gas molecules, have well-defined properties, and their interactions follow Newtonian laws. There is no quantum-level randomness at play. From a microscopic perspective, the entire system is governed by classical mechanics. But even with complete knowledge of these components, we still cannot reliably predict the system’s overall behavior. This is the failure of reductionism in the face of emergence.
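To make this deterministic-but-unpredictable point concrete, here is a minimal sketch (not from the original discussion) using the Lorenz system, a classic toy model derived from atmospheric convection. The equations are fully deterministic, yet two trajectories that start an immeasurably small distance apart end up bearing no resemblance to each other; the particular parameters, step size, and step count are just illustrative choices.

```python
# Minimal sketch: deterministic chaos in the Lorenz system (a toy model of
# atmospheric convection). Everything below is classical and deterministic,
# yet a 1e-9 perturbation of the initial state grows until the two
# trajectories diverge completely.
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the deterministic Lorenz equations."""
    x, y, z = state
    dxdt = np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    return state + dt * dxdt

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-9, 0.0, 0.0])  # perturbation far below any measurement error

for step in range(1, 5001):
    a, b = lorenz_step(a), lorenz_step(b)
    if step % 1000 == 0:
        print(f"t = {step * 0.01:4.0f}   separation = {np.linalg.norm(a - b):.3e}")
# Micro-level determinism, macro-level unpredictability: the separation grows
# by many orders of magnitude despite identical laws and near-identical starts.
```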
This distinction matters because in my opinion it also points to a deeper philosophical implication. The unpredictability of intelligent behavior (if intelligence is truly an emergent phenomenon) may share more with meteorology than with quantum mechanics. That is, intelligence may be deterministic at the micro-level, yet fundamentally unpredictable at the macro-level due to its nonlinear, dynamic complexity.
Personally, I find this idea compelling when thinking about the question of free will. If our sense of agency arises from the emergent dynamics of an extraordinarily complex system, like the brain, then the unpredictability we associate with free will may not require metaphysical indeterminism. Instead, it could be an inevitable result of a deterministic but chaotic system whose behavior cannot be reduced to its parts. In this light, the freedom we experience (whether it is objectively real or simply illusory, this is out of the discussion here) might be a natural byproduct of emergent unpredictability, not a contradiction of physical law but an expression of it.
The above framing also reinforces why I see the current moment in AI as fundamentally different from past cycles. In earlier waves, models were limited by how much we could encode our understanding into them. But systems built on deep learning do not depend on human-derived rules. They harness complexity and let structure emerge. Just as meteorology became more predictable once we stopped trying to model it top-down, intelligence becomes more visible when we stop insisting it must resemble us or fit within frameworks we fully understand.
We are not witnessing better tools for an old paradigm. We are watching the early signs of a new one: one where intelligence is not built but brought forth, and where our role gradually shifts from engineers to observers, trying to interpret what we have created after it surprises us.
This, to me, is the core reason why this AI spring is not like the previous ones. Not because the tools are more powerful, but because the underlying philosophy has begun to change.