Montreal AI Ethics Institute

Democratizing AI ethics literacy


Whose AI Dream? In search of the aspiration in data annotation.

July 29, 2023

🔬 Research Summary by Ding Wang, a senior researcher in the Responsible AI Group at Google Research, specializing in responsible data practices with a particular focus on accounting for the human experience and perspective in data production.

[Original paper by Ding Wang, Shantanu Prabhat, and Nithya Sambasivan]


Overview: This paper delves into the crucial role of annotators in developing AI systems, exploring their perspectives, aspirations, and the ethical considerations surrounding their work. It offers valuable insights into the human element within AI and the impact annotators have on shaping the future of artificial intelligence.


Introduction

This research provides valuable insights into the experiences of data annotators in India, who have played a significant role in developing AI systems worldwide. The study draws on interviews with 25 data annotators working for third-party annotation companies, shedding light on their daily work and offering a glimpse into their aspirations. Most annotators hold undergraduate degrees in STEM subjects and were drawn to data annotation as a way into AI development, even though many had initially dreamed of becoming machine learning engineers. The research highlights that despite the shift from platform work to organized employment, the job still carries a sense of precarity.

Key Insights

While data annotation was historically done on crowd-work platforms, there has been a rise in private annotation firms employing full-time workers. These firms provide various services beyond annotation, such as project management and quality control. However, the human labor involved in data annotation remains under-recognized compared to the market value of the annotation. The paper focuses on the work practices and experiences of data annotators in India, revealing challenges such as work pressure, lack of career progression, and precarity despite professionalization. The findings contribute to understanding the organization of data annotation work and its implications for stakeholders, particularly the annotators. 

The paper situates annotation within the broader landscape of crowd-sourcing platforms that employ numerous workers for essential AI model training tasks. It highlights the challenges of enforcing norms and cultural sensitivity in annotation work, emphasizing the hidden nature of this labor. Smaller platforms prioritize worker training and local expertise, favoring project-oriented contractual hiring and individual expertise. The evolving landscape of annotation work is intertwined with the gig economy. Although there have been efforts to document the sociological aspects of data and promote ethical AI practices, these efforts tend to overlook the practice of data annotation and the role of data workers. Recent regulations in China, however, acknowledge the importance of protecting workers' rights. The paper argues that data annotation practices should be integral to discussions of ethical and responsible AI.

Becoming an annotator

The recruitment process for annotators in the study revealed a common practice of requiring high educational qualifications, such as undergraduate degrees in technology and engineering. Previous experience in annotation was not necessary and was even seen as a disadvantage, as it would lead to higher salary expectations. Referrals played a significant role in finding employment, with friends, classmates, or alumni referring nearly half of the annotators. The rapid growth of annotation companies and the promise of a bright future in the technology industry, particularly in autonomous vehicles, attracted participants to become annotators. Job advertisements portrayed annotation as a well-paid and prestigious part of the AI industry, reinforcing the AI dream narrative. The interview process for annotation positions included technical assessments, adding complexity and a sense of technicality to the role.

Training for annotators involved orientation training and pre-project training. Orientation training focused on familiarizing annotators with tools and processes but often lacked the connection between annotation and AI. Pre-project training provided specific instructions on datasets and guidelines. However, tight deadlines sometimes led to skipping pre-project training, impacting knowledge transfer and annotation quality.

The emphasis on client satisfaction and fast delivery overshadowed annotators' training needs and interests. Data quality was measured primarily by accuracy rates, prioritizing client requests. Through experience, annotators discovered shortcuts and techniques that improved productivity but were never formally incorporated into the training.

Overall, the recruitment, training, and work conditions reflected the rapid growth and demand in the annotation industry, highlighting the importance of education, referrals, and better alignment between training and annotators’ needs.

Being an annotator

Annotators worked long hours, often exceeding what was officially reported, without compensation for the extra work. They typically started the day with a status check led by team or project leads, reviewing pending tasks and receiving data to label. They used company-issued laptops with access restricted to work-relevant websites and software; unauthorized sites and tools were blocked. Communication tools like Microsoft Teams, Google Chat, and, in some cases, WhatsApp were essential for work-related queries. Annotators had experience with various annotation tools, both in-house and open-source.

Target-setting for annotators was determined by team leads or project managers based on experience or average completion rates. Targets could increase over time, and annotators were expected to meet them without negotiation. Analysts and managers conducted quality control checks to ensure high-quality data delivery. Accuracy rates, often around 98% or aiming for zero errors, were monitored through automated and manual checks. Annotators reported tool issues through separate channels, with resolutions and client communications determined by company hierarchy.

The data annotation process extends beyond technical aspects, involving organizational structure and power dynamics. Annotation companies prioritize high-quality data delivery at a low cost, with strict work quality monitoring. Annotators navigate these dynamics while striving to meet targets and deliver accurate annotations.

An annotator's aspiration

Data annotation is often seen as a stepping stone to a career in AI and ML. However, the skills and experiences gained in annotation do not necessarily translate to technical roles. The concept of expertise in annotation is unclear, and breaking into more technical positions can be challenging. Promotion opportunities within annotation are limited, and the retention rate is low, with annotators typically staying for 12-18 months. The compensation system does not incentivize annotators to stay; job stability is uncertain. The pandemic has further exacerbated job insecurity, with many annotators experiencing job losses. Despite the challenges, annotators take pride in their contribution to AI and ML, recognizing the importance of data annotation in these fields. However, the current state of data annotation highlights the irony of overqualified individuals performing repetitive tasks to support the development of AI technologies.

In conclusion, Whose AI Dream?

Data annotation is a crucial process performed by full-time annotators within well-structured working environments. While the third-party annotation industry has grown alongside AI and ML systems, our research shows that individual annotators have not shared in these benefits. Annotation companies provide defined roles and hierarchies to handle complex tasks and meet client expectations, yet this rigid structure limits data interpretation and stifles annotators' skills and perspectives. Performance metrics define data quality narrowly, overlooking annotators' unique contributions, and annotators seldom question these power dynamics, accepting predetermined notions of success.

To support annotators' aspirations, stakeholders must understand the context of annotation work and consider its design implications. Our study reveals limited stability and career progression within annotation, hindering annotators from moving into technical roles. Recognizing expertise beyond educational credentials is crucial: articulating fundamental annotation skills and providing appropriate training can enhance annotators' employability, while promoting career growth within the industry requires shared knowledge, exposure to ML/AI systems, and ethical practices.

Regulatory discussions can address fair working conditions, and documenting data labor practices can inform policy improvements. Annotators' expertise, adaptability, and responsiveness should be valued, positioning them as agile experts. Collaborative efforts among researchers, practitioners, legal experts, and policymakers are vital for systemic changes that prioritize annotators' well-being and career development in the AI and ML field.

Between the lines

Although this work came out a year ago, its relevance has only increased against the backdrop of the release of large language models such as ChatGPT and Bard. Several articles discussing annotators' working conditions and experiences were published this year, giving us a fuller picture of the lives of annotators across the world. Yet changes in how we as an industry engage with annotators, their work, their perspectives, and their aspirations have yet to follow.

Want quick summaries of the latest research & reporting in AI ethics delivered to your inbox? Subscribe to the AI Ethics Brief. We publish bi-weekly.



© 2025 Montreal AI Ethics Institute. This work is licensed under a Creative Commons Attribution 4.0 International License.