Language Models: A Guide for the Perplexed

February 6, 2024

🔬 Research Summary by Sofia Serrano, a Ph.D. candidate in computer science at the University of Washington (and incoming Assistant Professor at Lafayette College starting in Autumn 2024), who focuses on the interpretability of contemporary natural language processing models.

[Original paper by Sofia Serrano, Zander Brumbaugh, and Noah A. Smith]


Overview: Language models are seemingly everywhere in the news, but explanations of how they work tend to be either very high-level or highly technical and geared toward experts. As natural language processing (NLP) researchers, we thought it would be helpful to write a guide for readers outside of NLP who want a more in-depth look at how language models work, the factors that have contributed to their recent development, and how they might continue to develop.


Introduction

Given the growing importance of AI literacy, we decided to write this tutorial on language models (LMs) to help narrow the gap between those who study LMs (the core technology underlying ChatGPT and similar products) and those who are intrigued and want to learn more about them. In short, we believe the perspective of researchers and educators can clarify the public's understanding of these technologies beyond what's currently available, which tends to be either extremely technical material or promotional content generated about products by their purveyors.

Our approach teases apart the concept of a language model (LM) from the products built on LMs, from the behaviors attributed to or desired from those products, and from claims about their similarity to human cognition. As a starting point, we (1) offer a scientific viewpoint that focuses on questions amenable to study through experimentation; (2) situate language models as they are today in the context of the research that led to their development; and (3) describe the boundaries of what is known about the models as of this writing.

Key Insights

Tasks, Data, and Evaluation Methods

To understand the last few years’ developments around language models, it’s helpful to have some context about the research field that produced them. Therefore, we begin our guide by explaining how the field of NLP has typically approached building computer systems to work with text in the last couple of decades.

The first idea we discuss is how NLP researchers turn idealized capabilities we'd like a computer to have, like "have an understanding of grammar," "write coherently," or "translate between languages," into simplified problems we can begin to chip away at. These simplified problems are known as "tasks," and they turn a desired behavior like "translating between languages" into something more concrete, like "given an English sentence, translate it into French."

Crucially, there is a gap between the idealized behavior and the "task" it is simplified into. To use our translation example: anyone who has read the same book in two different languages can tell you that there is an art to how human translators balance faithfulness to the original work against the conventions of the work's new language in order to avoid stilted prose. That process often involves rearranging material across sentences, so the two versions of a book may not even contain the same number of sentences, and distilling "translating text between languages" into sentence-for-sentence translation obscures this. Still, making progress on such an intermediate stepping stone of a task helps make progress toward the larger goal.

We then discuss how settling on a data source and an evaluation method for a given simplified task lends itself to training neural network-based models to perform that task.

The “Language Modeling” Task: Next-Word Prediction

With all that said, what task have language models been trained to perform? As it turns out, their task is next-word prediction, which has been known in NLP for many years as "language modeling." In other words, given some text in progress, like "This document is about natural language _____," a language model is trained to try to predict the next word. (For our example, "processing" would be a reasonable guess.)
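To make this concrete, here is a minimal sketch of next-word prediction in Python. The Hugging Face transformers library and the small GPT-2 model are our own illustrative choices here, not something the guide prescribes; any causal language model would expose the same basic interface:

```python
# Minimal next-word-prediction sketch. GPT-2 and the `transformers` library
# are illustrative choices; any causal language model works the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "This document is about natural language"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # one score per vocabulary entry, per position

# The distribution at the final position is the model's guess at the next word.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for token_id, p in zip(top.indices, top.values):
    print(f"{tokenizer.decode(int(token_id))!r}: {float(p):.3f}")
```

Text goes in, and a probability distribution over possible next words comes out; a well-trained model should place noticeable probability on plausible continuations like " processing" here.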

While language models have been around in NLP for a long time, it was only recently that researchers began to recognize that past a certain point, to do really well on language modeling, a language model needed to pick up certain facts and world knowledge (for example, to do well at filling in the blanks for “The Declaration of Independence was signed by the Second Continental Congress in the year ____,” or “When the boy received a birthday gift from his friends, he felt ____”).

But even today, the training of language models is still based on optimizing for low “perplexity”—that is, the same measure of a language model’s word-by-word “surprise” at the true, revealed continuation of text-in-progress that we’ve been using in NLP for decades.
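The summary above describes perplexity informally; for readers who want it pinned down, the standard definition is the exponential of the average negative log-probability the model assigns to each true next word. A self-contained sketch:

```python
# Perplexity from the log-probabilities a model assigned to the true next
# words. This is the standard definition; the guide presents perplexity
# informally as word-by-word "surprise."
import math

def perplexity(token_log_probs):
    """exp of the average negative log-probability of the true next words."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# A model that always predicts the true next word with probability 1 achieves
# the minimum perplexity of 1; more "surprise" means a higher number.
print(perplexity([math.log(0.5), math.log(0.25)]))  # ~2.83
```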

Getting from Language Models to Today’s Large Language Models

While perplexity has continued to be our central quantity of interest for language models, that's not to say that nothing has changed in the last few years about how language models are developed. We discuss two key changes: a move toward training on far more data, and the adoption of a type of neural network called the "transformer," which is structured to enable faster training on more data (provided a model developer has access to certain hardware, specifically GPUs, with a lot of memory).

We then discuss a few of the impacts of those changes and of the resulting surge in language modeling performance. For example, we discuss how language models are now commonly used to perform other "tasks" that, a few years ago, would have required separately trained models (see the sketch below), and how the move toward larger models has contributed to current NLP models' relative inscrutability. We also talk about how the rising cost of developing new language models has considerably narrowed the field of organizations that can afford to produce them, the strategies those organizations currently use to adapt LMs into products, and how difficult it is to evaluate LMs.
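As an illustration of that first point, the sketch below prompts a single general-purpose LM to attempt translation, a task that once called for its own dedicated model. We again use GPT-2 purely as a stand-in; its French will be poor, but the mechanism, next-word prediction on a task-shaped prompt, is exactly how larger models are used:

```python
# One general-purpose LM attempting a "task" through prompting alone, where
# earlier practice trained a dedicated model per task. GPT-2 is a stand-in;
# a large instruction-tuned model handles such prompts far better, but the
# underlying mechanism (next-word prediction) is identical.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

translation_prompt = "English: Where is the library?\nFrench:"
print(generator(translation_prompt, max_new_tokens=10)[0]["generated_text"])
```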

Implications of How Language Models Work for Common Questions About Them

Based on our earlier discussion of how language models work, we address a few common questions about using them, including how much particular prompts matter and which kinds of language model output it is essential to double-check. We also offer some context for discussions of whether language models count as "intelligent," though for most people considering LMs, this is largely a side question.

Where Language Models are Headed

We close with some parting words about the difficulty of making projections about the future of LMs and about the developing regulatory landscape around them. Finally, we list a few helpful actions that readers of the guide can take to contribute to a healthy AI landscape.

Between the lines

Current language models are downright perplexing! But by considering the trends in the research communities that produced them, we can better understand why these models behave as they do. Keeping in mind the primary task these models have been trained to accomplish, namely next-word prediction, also helps us understand how they work.

Many open questions about these models remain, ranging from how to steer models away from generating incorrect information, to how best to customize models for different use cases, to how to democratize their development. However, we hope our tutorial can provide some helpful guidance on using and assessing LMs.

Though determining how these technologies will continue to develop is difficult, there are helpful actions each of us can take to push that development in a positive direction. By broadening the range of people involved in decisions about model development and engaging in wider conversations about the role of LMs and AI in society, we can all help shape AI systems into a positive force.
