Montreal AI Ethics Institute

Democratizing AI ethics literacy

Participation and Division of Labor in User-Driven Algorithm Audits: How Do Everyday Users Work together to Surface Algorithmic Harms?

August 22, 2023

🔬 Research Summary by Sara Kingsley, a researcher at Carnegie Mellon University and an expert in A.I. system risk assessment who has built A.I. auditing tools and red-teamed multiple generative A.I. systems for different technology companies.

[Original paper by Rena Li, Sara Kingsley, Chelsea Fan, Proteeti Sinha, Nora Wai, Jaimie Lee, Hong Shen, Motahhare Eslami, and Jason Hong.]


Overview: Everyday people are increasingly coming together, organically, to audit A.I. systems for bias. This paper presents four cases of user-driven A.I. audits to draw critical lessons about what kinds of labor prompt authorities to address the risks of different A.I. systems. Equally, the paper calls on stakeholders to examine how users want authorities to respond to their reports of societal harm.


Introduction

The White House, technology companies, and researchers have called for auditors to assess the potential and unknown risks of A.I. systems, but should we rely only on experts and scientific methods to unearth the risks of A.I.? Everyday users hold a wealth of knowledge drawn from their lived experience and their interactions with automated decision-making and content-generation systems, including those already making critical decisions about our lives. However, little is known about how everyday users audit A.I. systems: what tactics do they use? What is their division of labor? Based on our investigation of participation in, and the division of labor within, everyday user audits, we present results on how users surface and document A.I. biases. Our findings have implications for developing tools, systems, and frameworks (including for A.I. governance) to support people in addressing A.I. biases.

Key Insights

The Four User-Driven Auditing Cases 

In our paper, “Participation and Division of Labor in User-Driven Algorithm Audits,” presented at the ACM Conference on Human Factors in Computing Systems (CHI), we report the common roles and patterns of engagement users displayed in four different audits of A.I. systems. The first case, the Twitter Image Cropping case, involved users auditing a Twitter feature that automatically resized and centered images uploaded to the social media platform. Twitter users claimed the cropping algorithm was racist after noticing it tended to center white people, removing Black people from view.

The second case, ImageNet Roulette, involved users testing an algorithm that automatically applied labels (e.g., secretary, software developer) to images they uploaded to a website. Notably, the ImageNet Roulette website had been designed as an art project to raise people’s awareness of the biases that the underlying dataset (and the algorithms built on it) could produce.

The third case, the Apple Card case, involved Twitter users sharing the outcomes of their applications for the Apple Card, a credit product issued by Goldman Sachs. After a tweet by a famous person called attention to the matter, users questioned whether Goldman Sachs approved women for less credit than men; their outcries on Twitter eventually led the New York state government to audit Goldman Sachs.

Finally, the fourth case involved Portrait AI, an app that let users upload their ‘selfie’ photos, which the app then automatically transformed into 19th-century portraits. On Twitter, a few users noted they felt the Portrait AI app changed the race of people, erasing the representation of users from marginalized demographics. Unlike the other three user audits, though, most users sharing their Portrait AI-generated self-portraits did not claim the app was biased, possibly because awareness of the bias was low: neither news media nor celebrities on Twitter had reported that the algorithm might be biased.

These four user-driven auditing cases unearthed critical and previously unexplored lessons about how everyday people perceive and assess the societal risks and potential harms of A.I. systems. Notably, the audited systems included generative A.I. image systems as well as systems for content classification and resizing, automated tasks whose outputs can heavily influence other A.I. systems.

Many Users Each Blasting One Tweet About A.I. Risks Might Do More Than a Comprehensive Statistical Audit

Comprehensive statistical audits typically involve a few experts investing substantial labor and resources to analyze datasets over a long period of time. In contrast, in three of the user auditing cases (Twitter Image Cropping, ImageNet Roulette, and Apple Card), we found that user participation in various auditing activities surged immediately after initial reports of A.I. system risks circulated on Twitter.

We also found that user participation defied common notions of slacktivism (the idea that people blast off a few tweets about a social issue without this ever leading authorities to address the problem). In these three cases, users typically contributed only a single tweet, yet the algorithm operator or a government authority responded by auditing the algorithm in question. By amplifying and spreading the word among their followers, users generated collective awareness and consensus that some authority should address the biases of an A.I. system. In this way, many users each sharing only one tweet can lead to an intervention.
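To make this participation pattern concrete, here is a minimal sketch, not taken from the paper's methodology, of how one might quantify it over a collection of audit-related tweets. The record fields, timestamps, and the trigger time of the initial report are hypothetical placeholders.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical records of audit-related tweets: (user_id, timestamp).
tweets = [
    ("u1", datetime(2020, 9, 19, 14, 0)),
    ("u2", datetime(2020, 9, 19, 15, 30)),
    ("u2", datetime(2020, 9, 20, 9, 0)),
    ("u3", datetime(2020, 9, 21, 11, 0)),
]

# How many tweets did each participant contribute?
tweets_per_user = Counter(user for user, _ in tweets)
single_tweet_share = sum(1 for n in tweets_per_user.values() if n == 1) / len(tweets_per_user)
print(f"Share of users who tweeted exactly once: {single_tweet_share:.0%}")

# Did participation surge right after the initial report circulated?
initial_report = datetime(2020, 9, 19, 13, 0)  # assumed time the first report spread
within_48h = sum(1 for _, ts in tweets if ts - initial_report <= timedelta(hours=48))
print(f"Tweets within 48 hours of the initial report: {within_48h} of {len(tweets)}")
```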

Discovering Ways to Help Crowds of People Conduct User-Driven Audits

Our paper shows that, by engaging in conversation threads on Twitter, users built on their own and others’ hypotheses, evidence, and techniques for auditing A.I. biases. In this way, we believe crowds of people sharing information about A.I. behaviors and auditing activities offer an important way for communities to discover A.I. biases.

A long line of prior research suggests that the collective intelligence of crowds sharing and making sense of information can perform as well as experts. Our paper calls attention to an additional way of investigating A.I. biases: supporting crowds of people as they conduct user-driven audits, in particular with tools designed specifically for that purpose.

For one, each of the user-driven auditing cases we investigated spanned several hundred unique conversations. Because many distinct threads discussed the same A.I. bias case, the information users shared was fragmented and decentralized, which likely made it difficult for users to piece together the complete history and evidence generated about a particular case. This highlights a need for tools that help users share information in a centralized space, so they can more easily access and build on one another’s work. Similarly, our paper discusses how crowds conducting user-driven audits might benefit from tools that let them compare different sets of evidence.
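As a purely illustrative sketch, and not a design proposed in the paper, the snippet below shows one way a centralized space might organize user-contributed evidence so that reports can be grouped by case and hypothesis and then compared. All class and field names are hypothetical.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class EvidenceItem:
    """One user-contributed observation, e.g., a screenshot plus a claim."""
    user: str
    case: str        # e.g., "twitter-image-cropping"
    hypothesis: str  # e.g., "crop favors lighter-skinned faces"
    supports: bool   # did this test support the hypothesis?
    link: str        # pointer to the original post

class AuditBoard:
    """Minimal centralized store for grouping and comparing audit evidence."""
    def __init__(self):
        self._by_case = defaultdict(list)

    def submit(self, item: EvidenceItem) -> None:
        self._by_case[item.case].append(item)

    def summarize(self, case: str) -> dict:
        """Tally supporting vs. countering evidence for each hypothesis in a case."""
        tally = defaultdict(lambda: {"support": 0, "counter": 0})
        for item in self._by_case[case]:
            tally[item.hypothesis]["support" if item.supports else "counter"] += 1
        return dict(tally)

board = AuditBoard()
board.submit(EvidenceItem("u1", "twitter-image-cropping",
                          "crop favors lighter-skinned faces", True, "https://example.org/1"))
board.submit(EvidenceItem("u2", "twitter-image-cropping",
                          "crop favors lighter-skinned faces", False, "https://example.org/2"))
print(board.summarize("twitter-image-cropping"))
```

A shared structure like this would let a newcomer see at a glance which hypotheses about a case have accumulated evidence, rather than reconstructing that history from scattered threads.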

Between the lines

Along these lines, we found that access to algorithmic systems is important for enabling and supporting user-driven auditing. For example, the Twitter Image Cropping, ImageNet Roulette, and Portrait AI algorithms were easy for users to test: they simply uploaded their images to the platforms and then shared the algorithm's response with other users on Twitter. In contrast, the algorithm in the Apple Card case was not accessible; users did not interact with it directly. Instead, married couples submitted their information in pairs through a web form and then compared the credit limit each person was approved to receive. A breadth of prior research has documented that A.I. biases (e.g., gender bias in credit) can take different forms for different demographics (e.g., unmarried women, married women, and non-binary people). Providing users direct access to software for testing A.I. biases is therefore critical for understanding those biases comprehensively.
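As a purely illustrative sketch of what direct access enables, the snippet below structures the kind of paired probe users can run when they can query a system themselves: submit paired inputs that differ only in the attribute under audit and compare the outputs. The `crop_preview` function is a hypothetical stand-in for the audited system (it is not Twitter's API), and the image identifiers are made up.

```python
def crop_preview(image_id: str) -> str:
    """Hypothetical stand-in for the audited system; a real probe would call the live service."""
    # Placeholder behavior so the sketch runs end to end.
    return "top-half" if image_id.endswith("-a") else "bottom-half"

def paired_probe(pairs) -> float:
    """Return the share of paired inputs that the system treated differently."""
    differing = sum(1 for left, right in pairs if crop_preview(left) != crop_preview(right))
    return differing / len(pairs)

# Each pair holds two otherwise-identical images whose subjects differ only in
# the attribute being audited (hypothetical identifiers).
pairs = [("portrait-1-a", "portrait-1-b"), ("portrait-2-a", "portrait-2-b")]
print(f"Pairs with divergent crops: {paired_probe(pairs):.0%}")
```

When the system is not directly queryable, as in the Apple Card case, this kind of controlled comparison is much harder to run, which is why access matters.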

Ultimately, from these and additional findings, our paper shows the benefits of studying how crowds of people already audit for A.I. biases: users can help us understand how to design better tools and ways to support A.I. assessments.

