Humans are not Boltzmann Distributions: Challenges and Opportunities for Modelling Human Feedback and Interaction in Reinforcement Learning

May 21, 2023

🔬 Research Summary by David Lindner, a doctoral student at ETH Zurich working on reinforcement learning from human feedback.

[Original paper by David Lindner, Mennatallah El-Assady]


Overview: Current work in human-in-the-loop reinforcement learning often assumes humans are noisily rational and unbiased. This paper argues that this assumption is too simplistic and calls for developing more realistic models that consider human feedback’s personal, contextual, and dynamic nature. The paper encourages interdisciplinary approaches to address these open questions and make human-in-the-loop (HITL) RL more robust.


Introduction

In the past few years, we have seen many impressive demonstrations of Reinforcement Learning (RL). But most of them focus on simulated environments with a well-specified goal, such as video games and board games.

But what happens in practical applications where the feedback comes not from a simulation but from a human? Then the agent has to model the human’s decision-making process and adapt to it during training.

So far, most methods in RL assume that humans are noisily rational and, importantly, unbiased. We argue that this is an oversimplification and that RL algorithms need more realistic models of human interaction to be effective in real human-in-the-loop scenarios.
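
To make the critique concrete: “noisily rational” usually means the human is modeled as a Boltzmann (softmax) chooser over the available options, with a single, fixed rationality coefficient shared by everyone, hence the paper’s title. A minimal sketch of that standard assumption, with illustrative reward values and coefficient:

```python
import numpy as np

def boltzmann_choice_probabilities(rewards, beta=1.0):
    """Standard 'noisily rational' choice model: a softmax over option rewards.

    beta is a single, fixed rationality coefficient assumed to hold for every
    user: beta -> infinity recovers a perfectly rational chooser, beta = 0 a
    uniformly random one.
    """
    logits = beta * np.asarray(rewards, dtype=float)
    logits -= logits.max()                # for numerical stability
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum()

# A human comparing two trajectories with (assumed known) utilities 1.0 and 0.2:
print(boltzmann_choice_probabilities([1.0, 0.2], beta=5.0))
```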

The paper reviews common approaches to handling human feedback in RL and discusses key research challenges and opportunities in three categories: personalization, contextualization, and adaptiveness.

Key Insights

Collaborative Human-AI Interaction

First, let’s review different types of human-AI interaction occurring in RL applications. For our discussion, we distinguish three categories of interaction: instruction, evaluation, and cooperation.

Instruction

A human can instruct the agent by telling it what to do. This is common in imitation or inverse reinforcement learning, where the agent learns a task from human demonstrations. For example, a human expert might teleoperate a robotic arm to generate these demonstrations. But instruction can also occur in more subtle forms, for example, if a human physically corrects a robotic arm’s trajectory. 

Evaluation

A human can provide evaluations by telling the agent how well it is doing. This is common in reinforcement learning from human feedback (RLHF), where humans, for example, decide which of multiple trajectories they prefer. There are also implicit forms of evaluation, for example, user engagement in recommender systems.
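
In RLHF, such pairwise comparisons are typically turned into a reward model by assuming a Bradley-Terry style likelihood: the probability that the human prefers one trajectory over another is a logistic function of the difference in their predicted returns, with the same noise model for every annotator. A minimal sketch of that commonly used objective (the concrete return values below are illustrative):

```python
import numpy as np

def preference_probability(return_a, return_b):
    """Bradley-Terry / Boltzmann preference model commonly used in RLHF:
    P(human prefers trajectory A over B) = sigmoid(R(A) - R(B))."""
    return 1.0 / (1.0 + np.exp(-(return_a - return_b)))

def preference_nll(return_a, return_b, prefers_a):
    """Negative log-likelihood of one observed comparison; summed over a
    dataset of comparisons, this is the loss used to fit a reward model."""
    p = preference_probability(return_a, return_b)
    return -np.log(p) if prefers_a else -np.log(1.0 - p)

# Illustrative case: the human preferred the trajectory the model scores lower.
print(preference_nll(return_a=0.4, return_b=1.1, prefers_a=True))
```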

Cooperation

Many practical applications need more complicated forms of human-AI interactions, which are best modeled as cooperation. For example, a human worker and a robot might work together to assemble a product and learn from each other.

Challenges in Human Feedback Modelling

Current models of human feedback in RL are very simple. They usually assume a human user has fixed preferences and chooses actions rationally, up to a constant noise model. Our paper highlights unsolved challenges in modeling real human feedback along three dimensions: feedback must be modeled as personal, contextual, and adaptive.

Personalized Feedback

Different humans will interact differently with RL systems, and we cannot hope to find a universal model of this interaction. The interaction will, for example, depend on personality factors and prior knowledge of the task and the system. To build accurate models of human feedback, we must take these factors into account.
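
As one illustrative direction (not a method from the paper), even the simple Boltzmann model above could be personalized by giving each user their own rationality coefficient, estimated from that user’s past interactions, instead of one constant shared across everyone:

```python
import numpy as np

def personalized_choice_probabilities(rewards, user_beta):
    """Illustrative personalization sketch: a per-user rationality coefficient.

    user_beta would have to be estimated from that user's past interactions;
    richer personal factors (expertise, preferences) could enter similarly.
    """
    logits = user_beta * np.asarray(rewards, dtype=float)
    logits -= logits.max()
    exps = np.exp(logits)
    return exps / exps.sum()

# A careful expert (beta=10) and a hurried novice (beta=1) rating the same options:
for beta in (10.0, 1.0):
    print(beta, personalized_choice_probabilities([1.0, 0.8, 0.1], user_beta=beta))
```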

Contextualized Feedback

Beyond personal factors, we must also take the (sociotechnical) context into account: how a human interacts with an RL agent will depend on when and where the interaction is happening. For example, a medical doctor might evaluate the recommendations of an AI assistant more carefully than the average user of a personal assistant on their smartphone. Such contextual effects also interact with personal factors and have to be modeled jointly.

Adaptive Feedback

During an interaction, both the user and the AI system will accumulate knowledge that changes how they interact with each other. Other factors, such as the user’s energy and level of motivation, might also change during the interaction. To accurately model human-AI interaction, we must measure, predict, and adapt to such changes.

Implications for RL

To enable RL systems to learn from and adapt to different people and contexts, we must model the interaction along these three dimensions and design RL systems accordingly.

We need to ensure the whole interaction is personalized and considers each user’s preferences. We must ensure that the system design is appropriate for the interaction context. And we need to measure and predict changes in the interaction and design systems that adapt to these changes.

Moving to more realistic models of human feedback will likely make the corresponding RL problems more difficult. However, a more realistic approach to modeling humans will make the RL algorithms we design more robust and applicable to practical situations.

Importantly, modeling human feedback is more than just a technical challenge. Some of the most important open research questions we discuss in our paper must be answered from a human-centered and interdisciplinary perspective.

Between the lines

Human-in-the-loop RL aims to design systems that can interact with real humans in real-world situations. Current research toward this goal is held back by overly simplistic human models. The research questions in this paper pose a difficult interdisciplinary challenge, but solving it will help us move from RL systems that work well with simulated humans and in very basic user studies to RL systems that can interact with humans in the real world.

We plan to start an interdisciplinary discussion around these topics, one that allows reinforcement learning researchers to look beyond the immediate technical problems they are solving and move towards designing and deploying human-centered RL systems.
