🔬 Research Summary by Nora Belrose, Frances Lorenz, Jon Menaster. Nora is an independent AI researcher focusing on active reward learning and robustness. Frances supports researchers (and aspiring researchers) working to address risks posed by advanced artificial intelligence. Jon is a senior policy analyst and project manager with the U.S. Government Accountability Office focusing on AI governance and policy issues
[Original paper by Scott Reed, Konrad Zolna, Emilio Parisotto, Sergio Gomez Colmenarejo, Alexander Novikov, Gabriel Barth-Maron, Mai Gimenez, Yury Sulsky, Jackie Kay, Jost Tobias Springenberg, Tom Eccles, Jake Bruce, Ali Razavi, Ashley Edwards, Nicolas Heess, Yutian Chen, Raia Hadsell, Oriol Vinyals, Mahyar Bordbar, Nando de Freitas]
Overview: This paper introduces a single neural network capable of performing hundreds of distinct tasks, including: chatting, stacking blocks with a real robot arm, captioning images, and more. The successes and limitations of training a general agent are discussed, as well as the implications to machine safety.
DeepMind has just introduced its new agent, Gato: the most general machine learning (ML) model to date. If you’re familiar with arguments for the potential risks posed by advanced AI systems, you’ll know the term general carries strong implications. Today’s ML systems are advancing quickly; however, even the best systems we see are narrow in the tasks they can accomplish. For example, DALL-E impressively generates images that rival human creativity; however, it doesn’t do anything else. Similarly, large language models like GPT-3 perform well on certain text-based tasks, like sentence completion, but poorly on others, such as arithmetic.
If future AI systems are to exhibit human-like intelligence, they’ll need to use various skills and information to complete diverse tasks across different contexts. In other words, they’ll need to exhibit general intelligence in the same way humans do—a type of system broadly referred to as artificial general intelligence (AGI). While AGI systems could lead to hugely positive innovations, they also have the potential to surpass human intelligence and become “superintelligent”. If a superintelligent system were unaligned, it could be difficult or even impossible to control for and predict its behavior, leaving humans vulnerable.
So what exactly has DeepMind created? Gato is a single neural network capable of performing hundreds of distinct tasks. According to DeepMind, it can, “play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens.” It’s not currently analogous to human-like intelligence; however, it does exhibit general capabilities. In the rest of this post, we’ll provide a non-technical summary of DeepMind’s paper and explore: (i) what this means for potential future existential risks posed by advanced AI and (ii) some relevant AI policy considerations.
How was Gato built?
The technique used to train Gato is slightly different from other famous AI agents. For example, AlphaGo, the AI system that defeated world champion Go player Lee Sedol in 2016, was trained largely using a sophisticated form of trial and error called reinforcement learning (RL). While the initial training process involved some demonstrations from expert Go players, the next iteration named AlphaGo Zero removed these entirely, mastering games solely by playing itself.
By contrast, Gato was trained to imitate examples of “good” behavior in 604 distinct tasks.
These tasks include:
- Simulated control tasks, where Gato has to control a virtual body in a simulated environment.
- Vision and language tasks, like labeling images with corresponding text captions.
- Robotics, specifically the common RL task of stacking blocks.
Examples of good behavior were collected in a few different ways. For simulated control and robotics, examples were collected from other, more specialized AI agents trained using RL. For vision and language tasks, “behavior” took the form of text and images generated by humans, largely scraped from the web.
Gato was tested on a range of control tasks by taking the average of 50 performances for each. These averages were compared to the results achieved by specialist agents trained and fine-tuned to each specific control task. It’s key to remember, Gato has also been trained on language, vision, and robotics data, all of which needs to be stored and represented within the model. In one sense, this puts Gato at a disadvantage compared to its task-specific competitors, as there’s potential for learning one task to interfere with learning others. On the other hand, Gato has the opportunity to find commonalities between tasks, allowing it to learn more quickly. Overall, we see that Gato fairs okay. It achieves at least 50% of the performance of task-specific experts in 450 tasks, and matches specialist performance in nearly 200 tasks, mostly in 3D simulated control.
Gato’s ability to stack shapes was tested and compared to a task-specific, state-of-the-art network. Gato performed about as well as state-of-the-art.
To quote directly from the paper, “[Gato] demonstrates rudimentary dialogue and image captioning capabilities.”
Accelerated learning on new tasks
An important aspect of intelligence is the ability to quickly learn new tasks by using knowledge and experience from tasks you’ve already mastered. With that in mind, DeepMind hypothesized that “…training an agent which is generally capable on a large number of tasks is possible; and that this general agent can be adapted with little extra data to succeed at an even larger number of tasks.”
To test this, DeepMind took a trained Gato model and fine-tuned it on a small set of demonstrations from novel tasks, not present in its training set. They then compared Gato’s performance to a randomly initialized, “blank slate” model trained solely on these same demonstrations. They found that accelerated learning does happen, but only when the new tasks are similar in some way to tasks Gato’s already seen— for example, a Gato model trained on continuous control tasks learned faster on novel control tasks, but a model trained only on text and images showed no such improvement.
Scaling laws are an observed trend that show ML techniques tend to predictably improve when scaled up using larger models, more data, and more compute resources. Thus, we can use smaller models to reasonably extrapolate how well a larger model might perform; though it’s worth noting scaling laws aren’t guaranteed to hold.
Gato was evaluated at 3 different model sizes – the largest of which was relatively small compared to recent advanced models. On Twitter, Lennart Heim estimates it’d cost around $50K to train Gato in GCloud (which allows you to access compute resources from Google), compared to $11M+ for PaLM (a new, state-of-the-art language model). Looking at the 3 different Gato models, we see increased performance with increased size and a typical scaling curve. Thus, it seems likely larger versions of Gato will perform much better than what we’ve described here. There are limits, however: scaling alone would not allow Gato to exceed expert performance on diverse tasks, since it is trained to imitate the experts rather than to explore new behaviors and perform in novel ways. It remains to be seen how hard it will be to train Gato-like generalist agents that can outperform specialist systems.
Between the lines
What are the potential near-term harms from Gato?
Gato, like many other AI models, can produce biased or harmful output. This is partly due to biases present in the vision and language datasets used for training, which include “racist, sexist, and otherwise harmful content.” Conceivably, Gato could physically harm people while performing a robotics task. DeepMind attempted to mitigate harms by filtering sexually explicit content and implementing safety measures for their robotic systems. However, given that the paper did not discuss other mitigation attempts, harmful output is still a concern.
What are the implications of Gato with respect to existential risk?
Many experts are concerned that superhuman-level AGI will pose an existential risk to human civilization, especially if its goals are not closely aligned with ours. Gato seems to mark a step towards this kind of general AI. Metaculus, a community that allows anyone to submit predictions about the future, now estimates AGI will arrive in 2035— about a decade earlier than its estimate before the announcement of Gato. This date is an aggregation of 423 individual predictions, based on a definition of AGI that includes a set of technical benchmarks, such as the system successfully passing a Turing test involving textual, visual, and auditory components.
If Gato causes us to update our beliefs toward shorter timelines for the development of AGI, we have less time than we thought to solve the alignment problem. This could make the case for pursuing direct technical work on alignment, increasing community-building, support, or policy roles for alignment, or allocating more resources to research and governance.
It’s worth noting, however, that there are some less impressive aspects of Gato. Fundamentally, Gato is trained to imitate specialist RL agents and humans–and it did not significantly outperform the agents it learned from. Arguably, it would have been more impressive if Gato could exploit its diverse knowledge to devise new behaviors that outperform specialist agents on several tasks.
What are some policy considerations related to Gato?
In the United States, AI systems are generally regulated by the agency overseeing the particular sector or industry they are designed to operate within. For example, in 2019 the U.S. Food and Drug Administration issued a proposed regulatory framework for AI/ML-based software used in health care settings. Less than a week ago, the U.S. Justice Department and the Equal Employment Opportunity Commission released guidance and technical assistance documents around avoiding disability discrimination when using AI for hiring decisions. However, because Gato is a generalist agent that can work across many domains, and therefore industries, it may be unclear which regulatory agency has the responsibility or authority to ensure Gato’s development and deployment (or other systems like it) remain in compliance with applicable laws.
There are a variety of regulatory frameworks in development across the globe designed to more broadly oversee AI (such as the European Union’s AI Act), but the extent to which they are being developed with a generalist AI system in mind is unclear. Now that Gato is here, regulators may want to ask themselves:
- To what extent might current regulatory frameworks need to be modified to better fit this new paradigm?
- How can we properly coordinate and collaborate our oversight of generalist AI systems to ensure there is no regulatory duplication, overlap, or fragmentation?
- How, if at all, can we future proof the more universal frameworks currently in development to better oversee these types of generalist AI systems?
As a potential path forward, the Future of Life Institute suggests adding a specific definition of general AI to the EU AI Act and clearly describing the roles and responsibilities of developers of generalist AI systems, including assessing potential misuse and regularly checking for new risks as the system evolves. Their idea is to require developers of general AI systems to ensure their systems’ safety, while reducing compliance burdens for the companies and other end users who might use the systems for a wide variety of tasks.