🔬 Research Summary by Murray Shanahan, a senior research scientist at Google DeepMind and Professor of Cognitive Robotics at Imperial College London.
[Original paper by Murray Shanahan]
Overview: The words we use to talk about current large language models (LLMs) can amplify the tendency to see them as human-like. Yet LLM-based dialogue agents are so fundamentally different from human language users that we cannot assume their behavior will conform to normal human expectations. So while it is natural to use everyday psychological terms like “believes,” “knows,” and “thinks” in the context of these systems, we should do so with caution and avoid anthropomorphism.
Introduction
Do large language models have beliefs? Do they really know anything? Do they have thoughts? These are important questions, and they tend to elicit strong opinions in people. According to this paper, the right question to ask is whether we should extend the use of familiar but philosophically contentious words like “believes,” “knows,” and “thinks” to the exotic setting of LLMs and the systems built on them. Do the nature and behavior of these artifacts warrant using such words?
Key Insights
The first important distinction the paper makes is between bare-bones language models (e.g., GPT-4) and the systems built on top of them, such as dialogue agents (e.g., ChatGPT). The bare-bones LLM is a computational object that takes a sequence of tokens (roughly, words) as input and returns a probability distribution over its vocabulary as output. This distribution represents the model’s prediction of which token is likely to come next in the sequence.
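To make that bare-bones interface concrete, here is a minimal sketch of querying a base model for its next-token distribution. It uses the Hugging Face transformers library and GPT-2 purely as a small, locally runnable stand-in; neither is discussed in the paper, and the same pattern applies to any base model.

```python
# Sketch of the "bare-bones LLM" interface: a sequence of tokens goes in,
# a probability distribution over the next token comes out.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The first person to walk on the Moon was"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# The distribution over the vocabulary for the *next* token only.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Inspect the model's top predictions: a ranking of likely continuations,
# not a statement of what the model "believes".
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{prob.item():.3f}  {tokenizer.decode(int(token_id))!r}")
```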
Although we might speak, elliptically, of the knowledge it encodes, much as we speak of the knowledge encoded in an encyclopedia, the bare-bones LLM, by itself, has such a limited repertoire of behavior that it doesn’t make conceptual sense to say it has beliefs. In particular, as the paper argues, once trained, a bare-bones LLM “has no [further] access to any external reality against which its words might be measured, nor the means to apply any other external criteria of truth, such as agreement with other language-users.”
By interacting with a user, LLM-based dialogue agents (as opposed to bare-bones LLMs) do have some contact with external reality, so at least the question of whether they have beliefs starts to make conceptual sense. However, if a dialogue agent’s only connection with the outside world is linguistic interaction with a human user, then it isn’t appropriate to ascribe beliefs to it in the fullest sense. Such a dialogue agent “cannot participate fully in the human language game of truth because it does not inhabit the world we human language-users share.” In particular, as a disembodied entity, it cannot investigate the world directly as a human or animal can.
There are many ways to move beyond such basic dialogue agents, including multi-modality (e.g., incorporating visual input), tool use (e.g., the ability to consult external websites), and embodiment (e.g., embedding a dialogue system in a robot controller). Each of these extensions provides a richer form of interaction between an LLM-based system and the external world, and each further legitimizes the ascription of beliefs to the resulting systems.
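As an illustration of the tool-use pattern mentioned above, the sketch below shows the bare loop: the model’s textual output is parsed for a tool request, the tool runs outside the model, and its result is fed back in as more text. Everything here (call_llm, the TOOLS table, the "TOOL:" convention) is hypothetical scaffolding for the sketch, not any particular framework’s API.

```python
import datetime

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (canned responses for this sketch)."""
    if "Observation:" in prompt:
        # Pretend the model folds the tool result into a final answer.
        observation = prompt.split("Observation: ", 1)[1].splitlines()[0]
        return f"Today's date is {observation}."
    return "TOOL: current_date"

# The agent's only non-linguistic contact with the world goes through here.
TOOLS = {
    "current_date": lambda: datetime.date.today().isoformat(),
}

def agent_turn(user_message: str) -> str:
    reply = call_llm(user_message)
    if reply.startswith("TOOL: "):
        tool_name = reply[len("TOOL: "):].strip()
        result = TOOLS[tool_name]()
        # The tool's result re-enters the model only as more text in the prompt.
        reply = call_llm(f"{user_message}\nObservation: {result}\nAnswer:")
    return reply

print(agent_turn("What is today's date?"))
```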
We’re not there yet, but eventually, perhaps, with further progress along these lines, there will be no need to caveat our use of words like “believes” in the context of AI systems based on LLMs. Nevertheless, we should still beware of anthropomorphizing such systems. “The sudden presence among us of exotic, mind-like entities might precipitate a shift in the way we use familiar psychological terms … [But] it may require an extensive period of interacting with, of living with, these new kinds of artifact before we learn how best to talk about them.”
Between the lines
A casual reader of the paper might interpret it as taking a skeptical stance toward artificial intelligence. However, this would be a misreading in two ways. First, the main target of the paper is disembodied large language models, not artificial intelligence more generally, and especially not AI systems that are embodied, either physically in robots or virtually in simulated worlds.
Second, the philosophical position behind the paper is not prescriptive. It is not assumed that there are facts of the matter, underwritten by metaphysics, about the nature of belief, knowledge, thought, and so on. The focus, rather, is on how words are used, and the strategy is to remind us how certain philosophically tricky words are used in their original setting, which is humans interacting with other humans.
Nevertheless, the paper is predominantly critical. It cautions against the use of certain words in the context of LLMs, and the only alternative forms of description it suggests are low-level and mechanistic. A companion paper advocates a more nuanced alternative: to frame the behavior of LLM-based dialogue systems in terms of role-play (“Role-Play with Large Language Models”, arXiv:2305.16367). This framing legitimizes the use of everyday psychological terms while avoiding the pitfalls of anthropomorphism.