
Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models

January 14, 2024

🔬 Research Summary by Leyang Cui, a senior researcher at Tencent AI Lab.

[Original paper by Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, and Shuming Shi]


Overview: With their remarkable ability to understand and generate human language, large language models (LLMs) like GPT-4 have significantly impacted our daily lives. However, a major concern about the reliability of LLM applications is their tendency to hallucinate. This paper presents a comprehensive survey of LLM hallucination, covering its definition, causes, evaluation benchmarks, and mitigation methods.


Introduction

Large language models (LLMs) have become a cornerstone of modern natural language processing and artificial intelligence, showing strong capabilities in understanding and generating human language.

Despite their remarkable success, LLMs may sometimes produce content that deviates from user input, contradicts previously generated context, or is misaligned with well-established world knowledge. This phenomenon is commonly referred to as hallucination, which significantly undermines the reliability of LLMs in real-world scenarios.  

Addressing hallucination in LLMs faces unique challenges due to massive training data, versatility, and imperceptible errors. LLM pre-training uses trillions of tokens from the web, making it hard to eliminate unreliable information. General-purpose LLMs must excel in various settings, complicating evaluation and mitigation efforts. Additionally, LLMs can generate seemingly plausible false information, making hallucination detection difficult for models and humans alike.

This paper introduces LLMs’ background, defines hallucination, presents relevant benchmarks and metrics, discusses LLM hallucination sources, reviews recent work addressing the issue, and offers forward-looking perspectives.

Key Insights

What is an LLM hallucination?

We categorize hallucination within the context of LLMs as follows: 

  • Input-conflicting hallucination: LLMs generate content that deviates from the source input provided by the user;

  • Context-conflicting hallucination: LLMs generate content that conflicts with information they themselves generated earlier in the same output;

  • Fact-conflicting hallucination: LLMs generate content that is not faithful to established world knowledge.
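
To make the taxonomy concrete, the sketch below pairs each category with a hypothetical prompt and a hallucinated response. The examples are our own illustrations, not cases drawn from the paper.

```python
from dataclasses import dataclass

@dataclass
class HallucinationExample:
    category: str      # one of the three categories defined above
    user_input: str    # what the user asked or provided
    model_output: str  # a hallucinated response
    why: str           # why the output counts as a hallucination

# Hypothetical illustrations of each category (not drawn from the paper).
EXAMPLES = [
    HallucinationExample(
        "input-conflicting",
        "Summarize: 'The meeting was moved from Monday to Friday.'",
        "The meeting was moved from Monday to Wednesday.",
        "The summary contradicts the source text supplied by the user.",
    ),
    HallucinationExample(
        "context-conflicting",
        "Tell me a short story about a sailor named Ana.",
        "Ana set sail at dawn. ... Later that night, Tom finally returned to port.",
        "The response contradicts the name it introduced earlier in the same output.",
    ),
    HallucinationExample(
        "fact-conflicting",
        "Who wrote 'Pride and Prejudice'?",
        "Charlotte Brontë wrote 'Pride and Prejudice' in 1847.",
        "The claim conflicts with established world knowledge (Jane Austen, 1813).",
    ),
]
```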

Sources of LLM Hallucination

Various factors may induce hallucinations in LLMs. 

  1. Lack of relevant knowledge or internalized false knowledge: the knowledge of LLMs is mostly acquired during the pretraining phase. When asked to answer questions or complete tasks, LLMs often exhibit hallucinations if they lack pertinent knowledge or have internalized false knowledge from the training corpora.
  2. LLMs sometimes overestimate their capacities: LLMs’ understanding of factual knowledge boundaries may be imprecise, and they frequently exhibit overconfidence. Such overconfidence misleads LLMs to fabricate answers with unwarranted certainty.
  3. Problematic alignment process could mislead LLMs into hallucination: during supervised fine-tuning, LLMs may be trained to answer instructions whose prerequisite knowledge was never acquired in the pre-training phase. Training on such samples is effectively a misalignment that encourages LLMs to hallucinate.
  4. Auto-regressive generation: LLMs sometimes over-commit to their early mistakes, even when they recognize they are incorrect.
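
Cause (4) follows directly from how decoding works: tokens are emitted one at a time, and every later token is conditioned on everything emitted so far, so an early mistake becomes part of the context the model keeps extending. The minimal greedy-decoding loop below illustrates this; the use of the Hugging Face transformers library and GPT-2 is our own illustrative choice, not something specified in the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any causal LM would do; GPT-2 is used here only because it is small.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of Australia is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(10):
        logits = model(input_ids).logits      # (1, seq_len, vocab_size)
        next_id = logits[0, -1].argmax()      # greedy pick: commit immediately
        # The chosen token is appended to the context; every subsequent step
        # conditions on it, so an early wrong token cannot be revised later.
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```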

Evaluation of LLM Hallucination

There are two benchmark categories for evaluating LLM hallucination: generation and discrimination. The former assesses the ability of LLMs to produce factual statements, while the latter concentrates on determining if LLMs can distinguish factual statements from a set of candidates.
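
As a concrete picture of the discrimination-style setup, the sketch below scores a factual and a hallucinated candidate answer by the log-likelihood a causal language model assigns to each and checks which one is preferred. The question, the candidates, and the use of GPT-2 are our own illustrative assumptions; real benchmarks define their own data and protocols.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in model for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_log_likelihood(text: str) -> float:
    """Total log-probability the model assigns to `text` (a rough proxy score)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=ids, the model returns the mean cross-entropy over tokens.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)  # convert mean NLL back to a sum

# Hypothetical discrimination-style item: one factual and one hallucinated candidate.
question = "Q: Who wrote 'Pride and Prejudice'?\nA:"
factual = question + " Jane Austen wrote 'Pride and Prejudice'."
hallucinated = question + " Charlotte Brontë wrote 'Pride and Prejudice'."

prefers_fact = sequence_log_likelihood(factual) > sequence_log_likelihood(hallucinated)
print("Model prefers the factual statement:", prefers_fact)
```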

Mitigation of LLM Hallucination

Pre-training: The mitigation of hallucinations during pre-training is primarily centered around the curation of pre-training corpora. Given the vast scale of existing pre-training corpora, current studies predominantly employ simple heuristic rules for data selection and filtering.
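
To give a flavor of what such heuristic rules look like in practice, the sketch below filters documents by length, character composition, and repetitiveness. The rules and thresholds are illustrative assumptions, not the pipeline of any particular model.

```python
def keep_document(text: str) -> bool:
    """Toy heuristic filter in the spirit of rule-based pre-training curation.

    All thresholds are illustrative; real pipelines combine many more signals
    (deduplication, source reputation, classifier scores, etc.).
    """
    words = text.split()
    if len(words) < 50:  # drop very short fragments
        return False
    alpha_ratio = sum(c.isalpha() for c in text) / max(len(text), 1)
    if alpha_ratio < 0.6:  # drop boilerplate- or markup-heavy pages
        return False
    unique_ratio = len(set(words)) / len(words)
    if unique_ratio < 0.3:  # drop highly repetitive text
        return False
    return True

# Usage: keep only documents passing every rule.
corpus = ["...raw web documents...", "click here click here click here"]
filtered = [doc for doc in corpus if keep_document(doc)]
```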

Supervised Fine-tuning (SFT): Because the volume of SFT data is manageable, human experts can curate it manually. In a recent preliminary human inspection, we observed that some widely used synthetic SFT datasets, such as Alpaca, contain a considerable number of hallucinated answers due to the lack of human review.

Reinforcement Learning from Human Feedback (RLHF): RLHF guides LLMs in exploring their knowledge boundaries, enabling them to decline to answer questions beyond their capacity rather than fabricating untruthful responses. However, RL-tuned LLMs may exhibit over-conservatism (e.g., refusing to provide a clear answer) due to an imbalanced trade-off between helpfulness and honesty.

Inference: Decoding strategies designed to mitigate hallucination at inference time are typically plug-and-play, which makes them easy to deploy and promising for practical applications.
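
"Plug-and-play" here means the intervention lives entirely in the decoding loop and requires no retraining. As one hedged illustration (our own sketch, inspired by factuality-oriented sampling schemes rather than a method prescribed by the survey), the code below plugs a custom logits processor into Hugging Face generate that gradually shrinks the top-p nucleus as generation proceeds, making later tokens more conservative.

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

class DecayingTopPProcessor(LogitsProcessor):
    """Top-p filtering whose p shrinks with each newly generated token.

    Illustrative only: the decay schedule (p *= decay per step, floored at
    p_min) is an assumption, not a method specified in the survey.
    """
    def __init__(self, prompt_len: int, p_start=0.95, p_min=0.3, decay=0.9):
        self.prompt_len = prompt_len
        self.p_start, self.p_min, self.decay = p_start, p_min, decay

    def __call__(self, input_ids, scores):
        new_tokens = input_ids.shape[1] - self.prompt_len
        p = max(self.p_min, self.p_start * (self.decay ** new_tokens))
        sorted_logits, sorted_idx = torch.sort(scores, descending=True, dim=-1)
        cum_probs = torch.softmax(sorted_logits, dim=-1).cumsum(dim=-1)
        remove = cum_probs > p
        remove[..., 1:] = remove[..., :-1].clone()  # always keep the top token
        remove[..., 0] = False
        mask = remove.scatter(-1, sorted_idx, remove)  # map back to vocab order
        return scores.masked_fill(mask, float("-inf"))

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The Eiffel Tower is located in", return_tensors="pt")

out = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=30,
    logits_processor=LogitsProcessorList(
        [DecayingTopPProcessor(prompt_len=inputs.input_ids.shape[1])]
    ),
    pad_token_id=tokenizer.eos_token_id,  # gpt2 has no pad token
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Because the processor only touches the logits at each decoding step, it can be combined with any off-the-shelf model without modifying its weights, which is what makes this family of mitigations easy to deploy.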

Between the lines

Hallucination remains a critical challenge that impedes the practical application of LLMs. This survey offers a comprehensive review of the most recent advances that aim to evaluate, trace, and eliminate hallucinations within LLMs.  We also delve into the existing challenges and discuss potential future directions. We aspire for this survey to serve as a valuable resource for researchers intrigued by the mystery of LLM hallucinations, thereby fostering the practical application of LLMs.

