
Research summary: Learning to Diversify from Human Judgments – Research Directions and Open Challenges

May 14, 2020

Mini summary (full summary below):

Current algorithmic techniques frame diversity in terms of the presence of sensitive attributes in the result set, using it as a measure of whether there is sufficient representation. Yet such an approach often ends up stripping these sensitive attributes, typically gender and race, of their deep social, cultural, and context-specific meanings, bucketing them into discrete categories that are rigid, one-dimensional, and determined algorithmically in the process of clustering.

The paper (by Denton et al.) presents a research direction that uses determinantal point processes (DPPs) as a mechanism for capturing diversity in a more subjective and individualized manner, by asking individuals whether they feel well represented in the result set. In an embedding space, it clusters together the items an individual feels represent them well and pushes away those that don't. Relying on individuals' perceptions to tailor these representations moves applications a step closer to adequately capturing representation. The authors do identify challenges with this approach, namely reliably sourcing this information at scale, especially given the limitations of how crowdsourcing platforms are structured today, but the work still gives the research community food for thought on how to capture diversity better.


Full summary:

Ranking and retrieval systems for presenting content to consumers are geared towards enhancing user satisfaction as defined by the platform companies, which usually entails some form of profit-maximization motive. In the process, they end up reflecting and reinforcing societal biases, disproportionately harming the already marginalized.

In the fairness techniques applied today, the focus is on the distributions in the result set and on the categorical structures, while the process of associating values with those categories is usually de-centered. Instead, the authors advocate for a framework that does away with rigid, discrete, ascribed categories and looks at subjective ones derived from a large pool of diverse individuals. Focusing on visual media, this work takes aim at the underrepresentation of various groups in such result sets, which can harm those groups by deepening social inequities and oppressive world views. Given that much of the content people interact with online is governed by automated algorithmic systems, these systems end up significantly influencing people's cultural identities.

While there are some efforts to apply the notion of diversity to ranking and retrieval systems, they usually approach it from an algorithmic perspective and strip it of its deep cultural and contextual social meanings, instead settling for arbitrary heterogeneity. Demographic parity and equalized odds are examples of this approach, applying notions from social choice to score the diversity of data. Yet increasing diversity, say along gender lines, runs into the challenge of getting the question of representation right, especially when gender and race are reduced to discrete categories that are one-dimensional, third-party, and algorithmically ascribed.
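To make concrete the kind of result-set scoring the authors critique, here is a minimal sketch of a demographic-parity style diversity check over a result set. The attribute labels and the gap function are illustrative assumptions, not anything specified in the paper.

```python
from collections import Counter

# Hypothetical illustration: score "diversity" as evenness of a discrete,
# pre-assigned sensitive attribute across the result set -- exactly the
# rigid categorical framing the authors push back against.

def demographic_parity_gap(result_set_labels):
    """Max difference between group shares; 0.0 means perfectly even representation."""
    counts = Counter(result_set_labels)
    shares = [c / len(result_set_labels) for c in counts.values()]
    return max(shares) - min(shares)

print(demographic_parity_gap(["A", "A", "A", "B"]))  # 0.5 -> skewed toward group A
print(demographic_parity_gap(["A", "B", "A", "B"]))  # 0.0 -> parity across groups
```

Note that the metric only sees the labels "A" and "B"; who assigned them, and whether they mean anything to the people depicted, is outside its scope.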

The authors instead propose sourcing this information from the individuals themselves, giving them the flexibility to determine whether they feel sufficiently represented in the result set. This contrasts with prior approaches, which focused on the degree to which sensitive attributes are present in the result sets. From an algorithmic perspective, the authors advocate for the use of a technique called a determinantal point process (DPP), which assigns a higher probability score to sets whose items are more spread out under a predefined distance metric.
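As a rough illustration of the DPP intuition, the snippet below scores a subset by the determinant of a similarity kernel restricted to that subset, so that sets of mutually dissimilar items score higher. The embeddings and the RBF kernel are illustrative assumptions, not the authors' actual setup.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Similarity kernel: near 1 for nearby points, near 0 for distant ones."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def dpp_score(X, subset):
    """Unnormalized DPP probability of `subset`: determinant of the kernel submatrix."""
    L = rbf_kernel(X)
    return np.linalg.det(L[np.ix_(subset, subset)])

# Six items embedded in 2-D: three near-duplicates, three spread apart.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],   # clustered
              [0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])  # spread out

print(dpp_score(X, [0, 1, 2]))  # clustered set -> determinant near 0
print(dpp_score(X, [3, 4, 5]))  # spread-out set -> determinant near 1
```

Because the determinant collapses toward zero for near-duplicate rows, a DPP naturally favors result sets that cover the embedding space rather than repeat one region of it.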

The way this works is that items the individual feels represent them well are clustered closer together in the embedding space, while items they feel don't represent them well are pushed away from those that do. Optimizing a triplet loss achieves this separation.
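Here is a minimal sketch of the triplet loss mentioned above, assuming a rater has marked one item as representative (the positive) and another as unrepresentative (the negative); the embeddings, names, and margin are illustrative, not taken from the paper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Penalize the anchor for being closer (in squared distance) to the
    negative than to the positive by less than `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

anchor   = np.array([0.0, 0.0])  # embedding of a reference item
positive = np.array([0.2, 0.1])  # item the rater feels represented by
negative = np.array([0.5, 0.5])  # item the rater feels misrepresented by

print(triplet_loss(anchor, positive, negative))  # > 0, so training would push them apart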

But the proposed framework still leaves open the question of reliably sourcing these ratings from individuals about what does and does not represent them well, and then encoding them in a form amenable to being learned by an algorithmic system.

Large-scale crowdsourcing platforms are the norm for gathering such ratings in the machine learning world, but their current structure precludes raters' identities and perceptions from consideration, which makes it particularly challenging to specify the rater pool for this framing. Nonetheless, the presented framework provides an interesting research direction for achieving more representation and inclusion in the algorithmic systems that we build.


Original piece by Denton et al.: https://drive.google.com/file/d/1lPynepBWoldRH6TS_a2UgOLu3y_QBDWs/view

Want quick summaries of the latest research & reporting in AI ethics delivered to your inbox? Subscribe to the AI Ethics Brief. We publish bi-weekly.
