Montreal AI Ethics Institute

Democratizing AI ethics literacy

Research summary: Bring the People Back In: Contesting Benchmark Machine Learning

September 14, 2020

Summary contributed by our researcher Alexandrine Royer, who works at The Foundation for Genocide Education.

*Authors of full paper & link at the bottom


Mini-summary: The biases present in machine learning datasets, which have been shown to favour white, cisgender, male and Western subjects, have received a considerable amount of scholarly attention. Denton et al. argue that the scientific community has nonetheless failed to consider the histories, values, and norms that construct and pervade such datasets. The authors propose a research program, which they term the genealogy of machine learning, that works to understand how and why such datasets are created. By turning our attention to data collection, and specifically the labour involved in dataset creation, we can “bring the people back in” to the machine learning process. For Denton et al., understanding the labour embedded in a dataset will push researchers to critically reflect on the type and origin of the data they are using and thereby contest some of its applications.

Full summary:

In recent years, industry and non-industry members have decried the prevalence of datasets biased against people of colour, women, LGBTQ+ communities, people with disabilities, and the working class within AI algorithms and machine learning systems. In response to societal backlash, data scientists have concentrated on adjusting the outputs of these systems. According to Denton et al., fine-tuning algorithms to achieve “fairer results” has kept data scientists from questioning the data infrastructure itself, especially when it comes to benchmark datasets.

The authors point out that new forms of algorithmic fairness interventions generally center on the parity of representation between different demographic groups within the training datasets. They argue that such interventions fail to consider the issues present within data collection, which can involve exploitative mechanisms. Academics and industry members alike tend to disregard the question of why such datasets are created. Questions such as what and whose values determine the type of data collected, under what conditions the collection is done, and whether standard data collection norms are appropriate often escape data scientists. For Denton et al., data scientists and data practitioners ought to work to “denaturalize” the data infrastructure, that is, to uncover the assumptions and values that underlie prominent ML datasets.

Taking inspiration from French philosopher Michel Foucault, the authors offer a first step toward what they term the “genealogy” of machine learning. To start, data and social scientists should trace the histories of prominent datasets, the modes of power, and the unspoken labour that went into their creation. Labelling within datasets is organized through a particular categorical schema, yet it is treated as widely applicable, even for models with different success metrics. Benchmark datasets are treated as gold standards for machine learning evaluation and comparison, leading them to take on an authoritative status. Indeed, as summarized by the authors, “once a dataset is released and established enough to seamlessly support research and development, their contingent conditions of creation tend to be lost or taken for granted.”

Once datasets achieve this naturalized status, they are perceived as natural and scientific objects and can therefore be used across multiple institutions and organizations. Publicly available research datasets, constructed in an academic context, often provide the methodological backbone (i.e. infrastructure) for several industry-oriented AI tools. Despite the disparities in the amount of data collected, industry machine learners still rely on these datasets to undergird the material research in commercial AI. Technology companies treat these shifts as merely changes in scale and rarely in kind.

To reverse the taken-for-granted status of benchmark datasets, the authors offer four guiding research questions: 

  1. How do dataset developers in machine learning research describe and motivate the decisions that go into their creation?
  2. What are the histories and contingent conditions of the creation of benchmark datasets in machine learning? As an example, the authors offer the case of Henrietta Lacks, an African American woman whose cervical cancer cells were removed from her body without her consent before her death.
  3. How do benchmark datasets become authoritative, and how does this impact research practice?
  4. What are the current work practices, norms, and routines that structure the collection, curation, and annotation of data in machine learning?

The research questions offered by Denton et al. are a good start in encouraging machine learners to think critically about whether their datasets are aligned with ethical principles and values. Any investigation into the history of science will quickly reveal how data-gathering operations are often part of predatory and exploitative behaviours, especially towards minority groups who have little recourse to contest these practices. Data science should not be treated as an exception to this long-standing historical trend. The people who carry out data collection merit as much ethical consideration as the subjects who make up the data. By critically investigating the work practices of technical experts, we can begin to demand greater accountability and contestability in the development of benchmark datasets.


Original paper by Emily Denton, Alex Hanna, Razvan Amironesei, Andrew Smart, Hilary Nicole, Morgan Klaus Scheuerman: https://arxiv.org/abs/2007.07399

  • © 2025 MONTREAL AI ETHICS INSTITUTE.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.