Montreal AI Ethics Institute


Humans are not Boltzmann Distributions: Challenges and Opportunities for Modelling Human Feedback and Interaction in Reinforcement Learning

May 21, 2023

🔬 Research Summary by David Lindner, a doctoral student at ETH Zurich working on reinforcement learning from human feedback.

[Original paper by David Lindner, Mennatallah El-Assady]


Overview: Current work in human-in-the-loop reinforcement learning often assumes humans are noisily rational and unbiased. This paper argues that this assumption is too simplistic and calls for developing more realistic models that consider human feedback’s personal, contextual, and dynamic nature. The paper encourages interdisciplinary approaches to address these open questions and make human-in-the-loop (HITL) RL more robust.


Introduction

In the past few years, we’ve seen a lot of impressive demonstrations of Reinforcement Learning (RL). But most of them focus on simulations with a well-specified goal, such as video games and board games.

But what happens in practical applications where, instead of a simulation, a human provides feedback to RL agents? Then the agent has to model the human’s decision-making process and adapt to it during training.

So far, most methods in RL assume that humans are noisily rational and, importantly, unbiased. We argue that this is an oversimplification and that RL algorithms need more realistic models of human interaction to be effective in real human-in-the-loop scenarios.

The paper reviews common approaches to handling human feedback in RL and discusses key research challenges and opportunities in three categories: personalization, contextualization, and adaptiveness.

Key Insights

Collaborative Human-AI Interaction

First, let’s review different types of human-AI interaction occurring in RL applications. For our discussion, we distinguish three categories of interaction: instruction, evaluation, and cooperation.

Instruction

A human can instruct the agent by telling it what to do. This is common in imitation or inverse reinforcement learning, where the agent learns a task from human demonstrations. For example, a human expert might teleoperate a robotic arm to generate these demonstrations. But instruction can also occur in more subtle forms, for example, if a human physically corrects a robotic arm’s trajectory. 

Evaluation

A human can provide evaluations by telling the agent how well it is doing. This is common in reinforcement learning from human feedback (RLHF), where humans, for example, decide which of multiple trajectories they prefer. There are also implicit forms of evaluation, for example, user engagement in recommender systems.
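To make the trajectory-comparison setting concrete, here is a minimal sketch of how such a preference is often modeled: a logistic (Bradley-Terry style) choice between two trajectories based on their returns. This is a common simplification, not code from the paper; the function name, the `beta` parameter, and the example returns are assumptions made for illustration.

```python
import math


def preference_probability(return_a: float, return_b: float, beta: float = 1.0) -> float:
    """Probability that a simulated human prefers trajectory A over trajectory B.

    Assumed model: sigmoid(beta * (return_a - return_b)).
    `beta` is a hypothetical rationality parameter: 0 means the choice is
    random, large values mean the human almost always picks the better one.
    """
    diff = beta * (return_a - return_b)
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Trajectory A has a slightly higher return than trajectory B.
    for beta in (0.0, 1.0, 5.0):
        print(beta, preference_probability(2.0, 1.0, beta))
```

Note that this model treats every human and every comparison identically: the only source of disagreement with the true returns is symmetric noise.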

Cooperation

Many practical applications need more complicated forms of human-AI interactions, which are best modeled as cooperation. For example, a human worker and a robot might work together to assemble a product and learn from each other.

Challenges in Human Feedback Modelling

Current models of human feedback in RL are very simple. They usually assume that a human user has fixed preferences and acts rationally, up to a constant noise model. Our paper highlights unsolved challenges in modeling real human feedback along three dimensions: feedback must be modeled as personal, contextual, and adaptive.
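The "noisily rational" assumption referenced in the title is usually written as a Boltzmann distribution over actions, with probability proportional to exp(beta * value). The sketch below shows that standard form under stated assumptions: the action values, the `beta` parameter name, and both function names are hypothetical and not taken from the paper.

```python
import math
import random


def boltzmann_probabilities(values, beta=1.0):
    """Return P(a) proportional to exp(beta * values[a]) for each action a.

    `beta` is the (assumed) constant rationality coefficient: the same value
    is used for every person, context, and point in time.
    """
    scaled = [beta * v for v in values]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(values, beta=1.0, rng=random):
    """Sample one action index from the Boltzmann distribution."""
    probs = boltzmann_probabilities(values, beta)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    action_values = [1.0, 2.0, 3.0]  # hypothetical values of three actions
    for beta in (0.0, 1.0, 10.0):
        print(beta, boltzmann_probabilities(action_values, beta))
```

With `beta = 0` the simulated human acts uniformly at random, and as `beta` grows the choice approaches perfect rationality. The noise is unbiased by construction, which is exactly the simplification questioned here.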

Personalized Feedback

Different humans will interact differently with RL systems, and we cannot hope to find a universal model of this interaction. The interaction will, for example, depend on personality factors and prior knowledge of the task and the system. To build accurate models of human feedback, we must take these factors into account.
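One simple way to illustrate personalization is to give each user their own noise level instead of a single shared one. The sketch below fits a per-user rationality parameter from that user's pairwise choices by grid search over the likelihood of a logistic choice model. It is only an illustration under assumed names and data (the users, their comparisons, and the candidate grid are all hypothetical), and a single scalar per user is far from the richer personal models called for here.

```python
import math


def log_likelihood(beta, comparisons):
    """Log-likelihood of a user's choices under a logistic choice model.

    `comparisons` is a list of (return_chosen, return_rejected) pairs.
    """
    total = 0.0
    for chosen, rejected in comparisons:
        diff = beta * (chosen - rejected)
        # Stable log(sigmoid(diff)).
        if diff >= 0:
            total += -math.log1p(math.exp(-diff))
        else:
            total += diff - math.log1p(math.exp(diff))
    return total


def fit_beta(comparisons, candidates=None):
    """Pick the candidate beta that best explains one user's choices."""
    if candidates is None:
        candidates = [0.1 * i for i in range(101)]  # 0.0 to 10.0
    return max(candidates, key=lambda b: log_likelihood(b, comparisons))


if __name__ == "__main__":
    # Hypothetical feedback logs for two users.
    feedback_by_user = {
        "careful_user": [(2.0, 1.0), (3.0, 1.0), (5.0, 4.0)],
        "noisy_user": [(2.0, 1.0), (1.0, 2.0), (3.0, 1.0), (1.0, 3.0)],
    }
    for user, comparisons in feedback_by_user.items():
        print(user, fit_beta(comparisons))
```

The user who always picks the higher-return option is assigned the largest candidate value, while the user whose choices are unrelated to the returns is assigned a value of zero.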

Contextualized Feedback

Beyond personal factors, we must also take the (sociotechnical) context into account: how a human interacts with an RL agent will depend on when and where the interaction is happening. For example, a medical doctor might be more careful about evaluating the recommendations of an AI assistant than the average user of a personal assistant on their smartphone. Such contextual effects also interact with personality effects and have to be modeled together.

Adaptive Feedback

During an interaction, both the user and the AI system will accumulate knowledge that changes how they interact with each other. Other factors, such as the user’s energy and level of motivation, might also change during the interaction. To accurately model human-AI interaction, we must measure, predict, and adapt to such changes.
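As a toy illustration of why a constant noise model can break down over an interaction, the sketch below lets the user's rationality decay as they tire. The exponential decay, the parameter names, and the default values are purely hypothetical assumptions chosen for simplicity; they are not a model proposed in the paper.

```python
import math


def beta_at_step(step, initial_beta=5.0, fatigue_rate=0.05):
    """Hypothetical rationality level after `step` feedback queries.

    Assumes exponential decay, standing in for declining energy or motivation.
    """
    return initial_beta * math.exp(-fatigue_rate * step)


def prob_correct_label(step, return_gap=1.0, initial_beta=5.0, fatigue_rate=0.05):
    """Probability that the user prefers the truly better of two trajectories.

    `return_gap` is the (non-negative) difference in return between them.
    """
    beta = beta_at_step(step, initial_beta, fatigue_rate)
    return 1.0 / (1.0 + math.exp(-beta * return_gap))


if __name__ == "__main__":
    for step in (0, 10, 50, 100):
        print(step, round(beta_at_step(step), 3), round(prob_correct_label(step), 3))
```

An agent that assumes the initial noise level throughout would put too much trust in late feedback, which is one reason to measure and predict such changes rather than fix them in advance.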

Implications for RL

To enable RL systems to learn from and adapt to different people and contexts, we must model the interaction along these three dimensions and design RL systems accordingly.

We need to ensure the whole interaction is personalized and considers each user’s preferences. We must ensure that the system design is appropriate for the interaction context. And we need to measure and predict changes in the interaction and design systems that adapt to these changes.

Moving to more realistic models of human feedback will likely make the corresponding RL problems more difficult. However, a more realistic approach to modeling humans will make the RL algorithms we design more robust and applicable to practical situations.

Importantly, modeling human feedback is more than just a technical challenge. Some of the most important open research questions we discuss in our paper must be answered from a human-centered and interdisciplinary perspective.

Between the lines

Human-in-the-loop RL aims to design systems that can interact with real humans in real-world situations. Current research toward this goal is held back by overly simplistic human models. The research questions in this paper pose a difficult interdisciplinary research challenge. But solving this challenge will help us move from RL systems that work well with simulated humans and in very basic user studies to RL systems that can interact with humans in the real world.

We plan to start an interdisciplinary discussion around these topics, one that allows reinforcement learning researchers to look beyond the immediate technical problems they are solving and move towards designing and deploying human-centered RL systems.


About Us


Founded in 2018, the Montreal AI Ethics Institute (MAIEI) is an international non-profit organization equipping citizens concerned about artificial intelligence and its impact on society to take action.

  • © 2025 MONTREAL AI ETHICS INSTITUTE.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.