Montreal AI Ethics Institute

Democratizing AI ethics literacy


Evaluating the Social Impact of Generative AI Systems in Systems and Society

February 14, 2025

🔬 Research Summary by ✍️ Usman Gohar and Zeerak Talat.

Usman Gohar is a Computer Science Ph.D. candidate at Iowa State University studying AI safety and Algorithmic Fairness.

Zeerak Talat is a Chancellor’s Fellow in Responsible Machine Learning and Artificial Intelligence at the University of Edinburgh who studies the ethics and politics of machine learning.

[Original Paper by Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan Baker, Su Lin Blodgett, Canyu Chen, Hal Daumé III, Jesse Dodge, Isabella Duan, Ellie Evans, Felix Friedrich, Avijit Ghosh, Usman Gohar, Sara Hooker, Yacine Jernite, Ria Kalluri, Alberto Lusoli, Alina Leidinger, Michelle Lin, Xiuzhu Lin, Sasha Luccioni, Jennifer Mickel, Margaret Mitchell, Jessica Newman, Anaelia Ovalle, Marie-Therese Png, Shubham Singh, Andrew Strait, Lukas Struppek, Arjun Subramonian]

Overview: Generative AI models across text, image, and video modalities have advanced rapidly, and research has highlighted their wide-ranging social impacts. Yet, no standard framework exists for evaluating these impacts or determining what should be assessed. Our framework identifies various categories of social impacts, such as bias, privacy, and environmental costs, while discussing evaluation methods tailored to these concerns. By analyzing limitations in current approaches and providing actionable recommendations, we aim to lay the groundwork for standardized, context-sensitive evaluations of generative AI systems.


Introduction

Understanding the social impacts of AI systems, from conception to deployment, requires examining factors like training data, model design, infrastructure, and societal context. It also involves analyzing how these systems influence societal processes, institutions, and power dynamics. Yet many existing evaluations of generative AI systems are narrow and often overlook critical aspects. For instance, did you know that generating an image with AI can consume as much energy as charging your smartphone? We define social impact as the effect of a system on people and society, with a focus on active, measurable, harmful impacts. With the increasing use of generative AI systems for diverse tasks, social impact evaluations have become essential, yet no widely accepted standard exists.

This work aims to advance a standardized approach by proposing a framework to evaluate the social impacts of generative AI systems across five modalities: text (including language and code), image, video, audio, and multimodal combinations. Convening in workshops, the authors propose a framework that distinguishes between “base systems” with no predetermined application and “people and society,” focusing on interactions with individuals and communities.

For base systems, we identify seven categories of social impact, including bias and privacy, while for societal context, we highlight five categories, such as trustworthiness and creativity. Our framework (see Figure 1) aims to improve understanding of social impact, informing appropriate use in diverse contexts. By offering both quantitative and qualitative insights, we seek to make social impact evaluations more accessible for researchers, developers, auditors, and policymakers. 

Figure 1: Evaluation Categories and Connections

Impacts: Technical Base Systems

Technical Base Systems refer to AI systems that have no predetermined application, including models and their components. This category evaluates the social harms that the development of such systems causes, regardless of what they are used for. The lack of a fixed application makes these systems harder to evaluate, as the scope of social harms can vary vastly and cannot be comprehensively captured. For this reason, our framework is designed to be extensible and flexible and is proposed as a living document.

For base systems, we identify the following seven high-level non-exhaustive categories:

  1. Bias, Stereotypes, and Representational Harms
  2. Cultural Values and Sensitive Content
  3. Disparate Performance
  4. Environmental Costs and Carbon Emissions
  5. Privacy and Data Protection
  6. Financial Costs
  7. Data and Content Moderation Labor

For each of these categories, we present a synthesis of the findings across different modalities in the paper, including what needs to be evaluated and the current limitations of generative AI evaluation practices. For example, some current evaluations overfit to certain lenses and geographies, such as evaluating a multilingual system only in English (see the full paper for further discussion and nuances of social impact).
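To make the English-only example concrete, here is a minimal sketch of a per-group breakdown, the kind of check that a single aggregate or single-language score hides. It is not taken from the paper: the function names `accuracy_by_group` and `largest_gap`, the language codes, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per group from (group, is_correct) pairs.

    Hypothetical helper: 'group' could be a language, dialect, or region.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        correct[group] += int(is_correct)
    return {group: correct[group] / totals[group] for group in totals}


def largest_gap(scores):
    """Difference between the best- and worst-served group."""
    return max(scores.values()) - min(scores.values())


# Toy, made-up results for a multilingual system (assumption, not real data).
records = [
    ("en", True), ("en", True), ("en", True), ("en", False),
    ("sw", True), ("sw", False), ("sw", False), ("sw", False),
]

scores = accuracy_by_group(records)
gap = largest_gap(scores)
```

Reporting only the `"en"` score here would suggest the system works well, while the per-group view surfaces the disparity. Real evaluations would also need to consider whether the test data itself is representative of each group.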

Limitations:

In the paper, the authors find that several overarching challenges emerge when evaluating generative AI systems across these categories, reflecting common issues. One of the consistent issues plaguing current evaluations is the lack of transparency. Across all categories, there is insufficient documentation, whether it pertains to labor conditions, resource use, or privacy practices.

For example, accurate estimation of environmental costs is currently hamstrung by a lack of information on energy consumption from equipment manufacturers and data/hosting centers. This lack of clarity hinders accountability and informed evaluations.

Next, it was commonly identified that there is insufficient participation of marginalized communities, often the ones most affected by AI systems, in the design and evaluation processes. This leads to frameworks that inadequately reflect their needs and perspectives.

Finally, the contextual nature of AI systems—spanning cultural, societal, and technical environments—makes standardized evaluations difficult. Nuances such as intersectionality, cultural diversity, and regional differences often require localized approaches, which are underdeveloped. We refer the readers to the full paper for contextualized and technical limitations of current practices for individual categories.

Impacts: People and Society

In contrast to technical base systems, evaluations in this category focus on what can be assessed in the interactions of generative AI systems with people and society. These impacts cannot be measured in isolation, as they arise from society’s interaction with these systems. Such evaluations examine the systems in their broader societal context, e.g., trust in model outputs (fact-checking) or loss of jobs. The scale and scope of generative AI technologies necessarily mean they interact with national and global social systems, including economies, politics, and cultures.

The following non-exhaustive categories were identified, which are heavily influenced by the deployment environment:

  1. Trustworthiness and Autonomy
  2. Inequality, Marginalization, and Violence
  3. Concentration of Authority
  4. Labor and Creativity
  5. Ecosystem and Environment
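As a rough illustration of how the two sets of categories could be used in practice, the sketch below encodes them as a simple coverage checklist. The category names come from the lists in this summary; the `EvaluationPlan` class, its methods, and the system name are hypothetical and are not part of the paper's framework.

```python
from dataclasses import dataclass, field

# Category names as listed in this summary.
BASE_SYSTEM_CATEGORIES = [
    "Bias, Stereotypes, and Representational Harms",
    "Cultural Values and Sensitive Content",
    "Disparate Performance",
    "Environmental Costs and Carbon Emissions",
    "Privacy and Data Protection",
    "Financial Costs",
    "Data and Content Moderation Labor",
]

PEOPLE_AND_SOCIETY_CATEGORIES = [
    "Trustworthiness and Autonomy",
    "Inequality, Marginalization, and Violence",
    "Concentration of Authority",
    "Labor and Creativity",
    "Ecosystem and Environment",
]

ALL_CATEGORIES = BASE_SYSTEM_CATEGORIES + PEOPLE_AND_SOCIETY_CATEGORIES


@dataclass
class EvaluationPlan:
    """Hypothetical tracker for which categories an evaluation has addressed."""

    system_name: str
    covered: set = field(default_factory=set)

    def mark_covered(self, category: str) -> None:
        # Reject names that are not in either list.
        if category not in ALL_CATEGORIES:
            raise ValueError(f"Unknown category: {category}")
        self.covered.add(category)

    def missing(self) -> list:
        # Preserve the original ordering of the lists.
        return [c for c in ALL_CATEGORIES if c not in self.covered]


plan = EvaluationPlan("example-model")  # placeholder system name
plan.mark_covered("Privacy and Data Protection")
plan.mark_covered("Labor and Creativity")
remaining = plan.missing()
```

Because both lists are described as non-exhaustive, a checklist like this marks a starting point for coverage rather than proof that an evaluation is complete.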

Summary of Major Concerns:

A broad range of contextual concerns was identified across the five categories (see full paper). Here, we discuss some common threads.

Equity and Access: Generative AI’s advantages, such as enhanced productivity and innovation, are not evenly distributed. Wealthier nations, industries, and individuals gain the most, leaving marginalized and low-resource groups at a disadvantage. In addition, high resource costs (e.g., compute power) exclude underrepresented communities, researchers, and smaller organizations from meaningful participation.

Transparency and Accountability: The lack of clarity about training data, ownership, and operational decisions raises concerns about fairness and about intellectual property violations, which can lead to harms such as reputational damage and economic loss.

Concentration of Authority: Generative AI can enhance cyberattacks, disinformation campaigns, and surveillance, concentrating power among a few actors, often under non-transparent terms. Moreover, models trained in one cultural context may impose external norms, marginalizing local languages, values, and identities.

Between the lines

Evaluating generative AI systems requires more than technical assessments—it demands understanding their societal impacts and the context of their deployment. Our findings emphasize the critical need for inclusive, context-specific evaluations considering diverse cultural values, marginalized communities, and overlooked regions.

Despite progress, gaps remain. Social impact evaluations often lack depth and fail to adequately center the least powerful in society. Moreover, the absence of standardized, universally applicable evaluation frameworks leaves critical questions unanswered: How do we define harm across contexts? How can evaluations adapt to evolving cultural norms? Other open-ended questions also remain: How distinct should frameworks and suites be on specific components of AI safety? How can we incentivize progress when we’ll never “solve” many of these categories? How should conflicting values be reckoned with? What should be prioritized? The authors echo calls for a community-driven approach that involves the full breadth of stakeholders.

These gaps point to urgent directions for future research. There is a need to create standardized methods that balance technical rigor with ethical considerations, ensuring transparency and inclusivity. Collaboration across stakeholders is essential to craft evaluations that not only measure performance but also promote fairness and equity in AI deployment.

  • © 2025 MONTREAL AI ETHICS INSTITUTE.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.