Montreal AI Ethics Institute


Achieving Fairness at No Utility Cost via Data Reweighing with Influence

May 28, 2023

🔬 Research Summary by Peizhao Li, a Ph.D. Candidate at Brandeis University working on Machine Learning, with a special interest in Trustworthy and Responsible AI, Algorithmic Fairness, and Deep Learning.

[Original paper by Peizhao Li and Hongfu Liu]


Overview: With the rapid development of algorithmic governance, fairness has become a required property of machine learning models in order to suppress unintentional discrimination. In this paper, we focus on the pre-processing approach to achieving fairness and propose a data reweighing method that adjusts only the weights of samples in the training phase. Our algorithm computes an individual weight for each training instance via the influence function and linear programming and, in most cases, demonstrates cost-free fairness with vanilla classifiers.


Introduction

For artificial intelligence technology deployed in high-stakes applications such as welfare distribution or school admission, it is essential to regulate algorithms and prevent inadvertent discrimination and unfairness in decision-making. Even though general data-driven algorithms are not designed to be unfair, their outcomes can unintentionally violate the AI principle of equality. Because such algorithms typically learn from historically biased data, they can retain or even amplify the inherent bias if there is no proper constraint on the data or the algorithm. As a consequence, the decisions they make may disadvantage users in certain sensitive groups (e.g., women and African Americans), thereby raising societal concerns.

To mitigate unfairness algorithmically, the authors focus on the pre-processing family of fair learning algorithms, i.e., methods that transform only the input training data so that the resulting machine learning model makes fair decisions. The pre-processing category directly diagnoses and corrects the source of bias and can be easily adapted to existing data-analytics pipelines. The proposed method computes a weight for every sample in the training set at a granular level; a vanilla model, without any additional constraints, can then be trained on this reweighed training set and deliver fair predictions, as sketched below.
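
To make the pre-processing idea concrete, here is a minimal sketch (not the authors' code) of how precomputed per-sample weights would be consumed by an ordinary classifier; the placeholder `weights` array stands in for the weights the proposed method would compute.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for the original (possibly biased) dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)

# Placeholder per-sample weights; the proposed method computes these individually.
weights = np.ones(len(X))

# A "vanilla" classifier with no fairness-specific constraints: the only change
# from standard training is that each instance contributes according to its weight.
clf = LogisticRegression().fit(X, y, sample_weight=weights)
```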

Key Insights

Characterizing Sample Influence

Consider a classifier trained on a given training set. We can assess the contribution of an individual training sample by training the classifier twice, once with and once without that sample. This counterfactual comparison tells us directly how the model changes with respect to measurements of interest, e.g., fairness or predictive utility.
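
As a rough illustration, the exact counterfactual described above can be written as a leave-one-out retraining loop; the helper below is only a sketch, with `evaluate` standing in for any measurement of interest (accuracy, a fairness gap, etc.).

```python
import numpy as np
from sklearn.base import clone

def leave_one_out_effect(model, X, y, i, evaluate):
    """Exact counterfactual: retrain without sample i and compare an
    evaluation score (utility, a fairness gap, ...) before and after."""
    full = clone(model).fit(X, y)
    mask = np.arange(len(X)) != i
    without_i = clone(model).fit(X[mask], y[mask])
    return evaluate(without_i) - evaluate(full)

# Example usage (X_val, y_val assumed to be a held-out evaluation set):
# effect = leave_one_out_effect(LogisticRegression(), X, y, i=0,
#                               evaluate=lambda m: m.score(X_val, y_val))
```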

However, retraining the model many times can be prohibitively expensive. The influence function, a tool from robust statistics, provides a first-order approximation of a training sample's contribution. It measures the effect of an infinitesimal change to a sample's weight and then linearly extrapolates to the complete removal of that sample. The influence function therefore lets us estimate how the model, evaluated by any function of interest, would change if some training samples were removed from the training set.
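
For a concrete picture, here is a small numpy sketch of the standard influence-function recipe for L2-regularized logistic regression; it is a first-order approximation under the usual assumptions (smooth loss, invertible Hessian) and is not the paper's exact implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def influence_of_removal(theta, X, y, grad_eval, reg=1e-2):
    """First-order estimate of how an evaluation function changes when each
    training sample is removed (L2-regularized logistic regression).

    theta     : fitted parameters, shape (d,)
    X, y      : training features (n, d) and 0/1 labels (n,)
    grad_eval : gradient of the evaluation function (utility or a
                differentiable fairness measure) w.r.t. theta, shape (d,)
    """
    n, d = X.shape
    p = sigmoid(X @ theta)

    # Per-sample gradients of the logistic loss: (p_i - y_i) * x_i.
    grads = (p - y)[:, None] * X

    # Hessian of the regularized training objective.
    H = (X * (p * (1 - p))[:, None]).T @ X / n + reg * np.eye(d)

    # Removing sample i shifts theta by roughly (1/n) * H^{-1} grad_i;
    # the chain rule then gives the change in the evaluation function.
    H_inv_grads = np.linalg.solve(H, grads.T)        # shape (d, n)
    return grad_eval @ H_inv_grads / n               # one estimate per sample
```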

Fairness at No Utility Cost

With the influence function at hand, and by expressing the fairness goal as a differentiable function, we can quantify the impact of a single training sample on two objectives: predictive utility and fairness. We use the influence function to calculate how the model, and with it the utility and fairness, will change when a training instance is reweighed. Theoretically, under the assumption of training-data sufficiency and diversity, and with a proper model dimension, the authors prove that some training-data reweighing strategy always exists that improves fairness while keeping utility from decreasing.
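
One way to express the fairness goal as a differentiable function is to use a smooth surrogate such as the demographic-parity gap between mean predicted probabilities of two groups. The sketch below is an illustrative choice, not necessarily the paper's exact measure; its gradient can be plugged into the influence estimate above as `grad_eval`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def demographic_parity_gap_grad(theta, X, group):
    """Gradient w.r.t. theta of a smooth demographic-parity gap:
    mean predicted probability of group 1 minus that of group 0."""
    p = sigmoid(X @ theta)
    dp = p * (1 - p)  # derivative of the sigmoid
    g1 = (dp[group == 1][:, None] * X[group == 1]).mean(axis=0)
    g0 = (dp[group == 0][:, None] * X[group == 0]).mean(axis=0)
    return g1 - g0
```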

Building on this theoretical finding, the authors solve for the reweighing through linear programming and compute an individual weight for each training instance. The linear programs include constraints from both utility and fairness, so the model's predictions become fair without harming predictive performance. Solving the linear program is very fast, i.e., it finishes within a few seconds for a tabular dataset with 30k+ training samples.
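
The reweighing step can then be posed as a small linear program. The sketch below uses scipy's `linprog` and assumes `infl_fair[i]` and `infl_util[i]` are influence estimates of sample i's weight on the fairness gap and on the utility loss, respectively; the actual program in the paper may include additional constraints.

```python
import numpy as np
from scipy.optimize import linprog

def solve_reweighing(infl_fair, infl_util, max_shift=1.0):
    """Find weight perturbations delta that most reduce the predicted fairness
    gap (objective) while not increasing the predicted utility loss (constraint)."""
    n = len(infl_fair)

    # Objective: minimize sum_i delta_i * infl_fair[i], the first-order
    # prediction of the change in the fairness gap.
    c = np.asarray(infl_fair)

    # Constraint: predicted change in utility loss must be <= 0.
    A_ub = np.asarray(infl_util).reshape(1, -1)
    b_ub = np.array([0.0])

    # Each weight may move within [1 - max_shift, 1 + max_shift].
    bounds = [(-max_shift, max_shift)] * n

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return 1.0 + res.x  # new per-sample weights around the default of 1
```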

The authors evaluate the proposed reweighing approach on multiple tabular datasets against several pre-processing and in-processing methods. Experimental results show that the reweighing method achieves cost-free fairness in most cases. In contrast, other competitive methods usually obtain fair results by introducing non-negligible harm to the model's utility.

Between the lines

In recent years, many algorithms have been developed to make ML models meet criteria from AI ethics. However, many of them introduce non-trivial changes to the original model and substantially alter its utility. Fairness at no utility cost is a desirable property, since it could help popularize fair algorithms in widely used utility-driven products and alleviate concerns about deploying them.
