
Fair and explainable machine learning under current legal frameworks

October 13, 2022

🔬 Research Summary by Przemyslaw Grabowicz, a Research Assistant Professor of Computer Science at the University of Massachusetts Amherst.

[Original paper by Przemyslaw Grabowicz, Nicholas Perello, Aarshee Mishra]


Overview: Powerful machine learning models can automate decisions in critical areas of human lives, such as criminal pre-trial detention and hiring. These models are often trained on large datasets of historical decisions, but past discriminatory human behavior may have tainted those decisions, and hence the datasets, with discrimination. How can we ensure that models trained on such datasets do not discriminate against a race, gender, or other protected group, in accordance with current legal frameworks? We provide an answer based on our research publication, which was recently accepted at ACM FAccT, a premier conference on fairness, accountability, and transparency.


Introduction

Disparate Impact v. Business Necessity Showdown

Legal systems prohibit discrimination in a number of contexts. For example, the U.S. Civil Rights Acts outlaw discrimination in employment and housing. The doctrine of disparate impact is typically operationalized via the well-known 80% rule of thumb, which says that the selection rate for a certain protected group, e.g., women, should be at least 80% of the selection rate for other groups, e.g., men.
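To make the rule of thumb concrete, here is a minimal sketch (our own illustration, not part of the paper) that checks the four-fifths rule on hypothetical selection rates; the function names and numbers are invented.

```python
# A minimal sketch of the 80% ("four-fifths") rule of thumb: compare selection
# rates across groups and flag a potential disparate impact. Illustrative only.

def selection_rate(hired: int, applicants: int) -> float:
    """Fraction of applicants from a group who were hired."""
    return hired / applicants

def passes_four_fifths_rule(rate_protected: float, rate_reference: float) -> bool:
    """True if the protected group's selection rate is at least 80% of the reference group's."""
    return rate_protected >= 0.8 * rate_reference

# Hypothetical numbers for illustration only.
rate_women = selection_rate(hired=30, applicants=100)   # 0.30
rate_men = selection_rate(hired=50, applicants=100)     # 0.50
print(passes_four_fifths_rule(rate_women, rate_men))    # False: 0.30 < 0.8 * 0.50
```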

If a particular employment practice has a disparate impact based on race, color, religion, sex, or national origin, then the employer must “demonstrate that the challenged practice is job related for the position in question and consistent with business necessity”. [1] For instance, a requirement of sufficient upper-body strength among firefighters may violate the 80% rule with respect to gender groups, but this requirement could be justified by business necessity. This reasoning was applied in the Supreme Court case Ricci v. DeStefano, which upheld the use of a test for promotion to firefighter management positions despite its disparate impact.

Unfortunately, the business necessity clause leaves room for a loophole. Redundant or irrelevant prerequisites can be introduced into decision-making to discriminate under a pretext of business necessity, as in the historical practice known as redlining, where banks denied services to zip codes inhabited predominantly by people of color. This loophole is particularly easy to exploit with machine learning algorithms. In a blind pursuit of model accuracy, learning algorithms can automatically find surprisingly accurate proxies of protected groups in training datasets and propagate discrimination by inflating the impact of those proxy features on model outcomes, unless we explicitly prevent them from doing so.

Next, using the following hypothetical example of hiring automation, we illustrate and compare three different learning methods that aim to prevent discrimination, including our new method (FaX AI), which closes the aforementioned loophole.

Key Insights

Hypothetical Case of Hiring Automation

Say we want to create a machine learning system to decide who our company should hire from a pool of applicants with college degrees. Using our company’s past, and possibly biased, hiring history, we put together training data with the following information about each applicant:

  • The legibility of their resume (assuming we can objectively score this) 🟦
  • Whether they have a 2-year or a 4-year degree 🟩
  • Whether they went to a public or a private college/university 🟨
  • Their race 🟥
  • The hiring outcome (Yes/No) 🟪

Let’s say that the following graphical model represents the biased decision-making process. The directed edges correspond to causal relations, while the dashed edges correspond to non-causal associations, e.g., correlations between the respective attributes.

Since there was a causal relation between race and our hiring outcomes in the above process, we say that our historical hiring process was directly discriminatory. The goal of standard learning is to obtain a model trained on a dataset that best predicts the target decisions. If the dataset is tainted by discrimination, as in our example, then models trained on it can perpetuate this discrimination. Say that we trained a standard supervised learning model on our hiring dataset and it produces the following measures of feature impact (various feature influence measures from the explainability literature could be used here, e.g., SHAP):
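For readers who want to reproduce the flavor of this experiment, the following sketch (not the authors' code or figures) generates a small synthetic dataset mimicking the biased process above, fits a standard classifier on all features, and reports mean absolute SHAP values as a feature-impact measure. All column names, coefficients, and the choice of GradientBoostingClassifier are our own illustrative assumptions.

```python
# A hedged sketch of the hypothetical hiring example: build a deliberately biased
# synthetic dataset, fit a standard model on *all* features, and measure
# per-feature impact with mean |SHAP| values. Everything here is invented.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
import shap

rng = np.random.default_rng(0)
n = 5_000

race = rng.integers(0, 2, n)                          # protected attribute (0/1)
# "public or private institution" is correlated with race (a proxy feature)
public_private = (rng.random(n) < 0.3 + 0.4 * race).astype(int)
degree = rng.integers(0, 2, n)                        # 2-year vs 4-year degree
resume = rng.normal(size=n)                           # resume legibility score

# Historical decisions depend on race directly -> direct discrimination in the data
logits = 1.0 * resume + 1.5 * degree - 2.0 * race
hired = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X = pd.DataFrame({"resume": resume, "degree": degree,
                  "public_private": public_private, "race": race})
model = GradientBoostingClassifier().fit(X, hired)

# Mean |SHAP| per feature as a simple "impact" measure
shap_values = shap.TreeExplainer(model).shap_values(X)
impact = pd.Series(np.abs(shap_values).mean(axis=0), index=X.columns)
print(impact.sort_values(ascending=False))            # race shows a non-zero impact
```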

As we can see above, race has an impact on our trained model’s output for hiring decisions. Therefore, we can say that it discriminates directly. Our company definitely wants to make sure that we do not perpetuate this discrimination by race. How can we accomplish this goal?

Standard Learning Without Race

To avoid direct discrimination, we can drop the protected feature, race, when training a model using standard machine learning. This approach, however, is too naive and results in the following graphical model and feature impact measures.

This approach removes the impact of race from the model; however, it introduces an impact from “public or private institution”, a proxy feature associated with race. We refer to this as the inducement of indirect discrimination. In the absence of the protected feature, the model uses related features as proxies, thereby inducing indirect discrimination, which is as illegal as redlining.
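Continuing the synthetic sketch above (and reusing X, hired, and the imports), dropping the race column and retraining shows how a correlated proxy can pick up part of race’s influence; again, this is an illustration under our invented setup, not the paper’s experiment.

```python
# Drop the protected feature and retrain: the proxy "public_private" absorbs
# much of race's influence, i.e., indirect discrimination is induced.
X_no_race = X.drop(columns=["race"])
model_no_race = GradientBoostingClassifier().fit(X_no_race, hired)

shap_no_race = shap.TreeExplainer(model_no_race).shap_values(X_no_race)
impact_no_race = pd.Series(np.abs(shap_no_race).mean(axis=0), index=X_no_race.columns)
print(impact_no_race.sort_values(ascending=False))    # public_private impact increases
```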

Learning Methods Preventing Disparate Impact

Alternatively, we can use methods based on well-known fairness objectives, such as those preventing disparate impact and disparate treatment [2], which lead to the following graphical model and feature impact:

While this method removes the impact of all features related to race, including the information about an applicant’s degree, it would significantly reduce model accuracy. However, if the information about an applicant’s degree is relevant to the job, then its use may be legally permitted for determining hiring outcomes, which would prevent the accuracy loss.
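As one concrete, hedged example of this family of methods, the sketch below uses fairlearn’s ExponentiatedGradient with a DemographicParity constraint, a close relative of the Exponentiated Gradient Reduction mentioned in [2]. It reuses the synthetic X, hired, and race from above, and the base estimator is an arbitrary choice of ours.

```python
# One instance of a learning method that enforces impact parity (a sketch, not
# the paper's method): fairlearn's exponentiated-gradient reduction with a
# demographic parity constraint. Reuses X, hired, race from the sketch above.
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from sklearn.tree import DecisionTreeClassifier

mitigator = ExponentiatedGradient(
    estimator=DecisionTreeClassifier(max_depth=4),
    constraints=DemographicParity(),
)
mitigator.fit(X.drop(columns=["race"]), hired, sensitive_features=race)
fair_preds = mitigator.predict(X.drop(columns=["race"]))

# Selection rates should now be approximately equal across race groups.
print(pd.Series(fair_preds).groupby(race).mean())
```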

Fair and eXplainable Artificial Intelligence (FaX AI)

In our publication, we introduce a new learning method that permits the use of features associated with protected groups for business necessity reasons and drops the protected attribute from the model without inducing indirect discrimination through proxy features. In our hiring example, this results in the following graphical model and feature impact measures:

This method permits the use of applicants’ undergraduate degree type and removes the impact of the protected feature, race, without introducing an impact from “public or private institution” on the model’s output. The method prevents the propagation of discrimination while preserving model accuracy. From a technical perspective, it is a simple and computationally efficient post-processing step applied to a model of your choice, e.g., taking an expectation over the marginal distribution of the protected attribute (see details in our publication).
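To convey the flavor of such a post-processing step, here is a minimal sketch that averages the full model’s predicted probability over the empirical marginal distribution of race, so that changing an individual’s race no longer changes their score. This is only our illustration of the idea under the synthetic setup above (reusing model and X); the actual FaX AI estimator is defined in the paper and implemented in the FaX AI code library, and the helper name below is hypothetical.

```python
# Sketch of a marginal-expectation post-processing step (illustrative only, not
# the FaX AI implementation): average the predictions of a model trained on all
# features over the empirical marginal distribution of the protected attribute.
import numpy as np

race_values, race_counts = np.unique(race, return_counts=True)
race_probs = race_counts / race_counts.sum()          # empirical marginal P(race)

def marginalized_score(model, X_full, race_column="race"):
    """Hypothetical helper: expected predicted probability over marginal P(race)."""
    score = np.zeros(len(X_full))
    for value, prob in zip(race_values, race_probs):
        X_counterfactual = X_full.copy()
        X_counterfactual[race_column] = value          # set race to a fixed value
        score += prob * model.predict_proba(X_counterfactual)[:, 1]
    return score

fair_scores = marginalized_score(model, X)             # reuses model, X from above
fair_hires = (fair_scores >= 0.5).astype(int)
```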

Between the lines

The learning methods described above can be ordered in terms of the strength of discrimination prevention, from the methods that do not have any discrimination prevention to the ones that aim to remove any association between outcomes and protected groups:

| Learning method | Discrimination prevention strength | Legal? [3] |
| --- | --- | --- |
| Standard learning | 0. No prevention | No |
| Standard learning dropping the protected feature | 1. Direct discrimination prevented, but inducement of indirect discrimination allowed | Arguably no |
| Fair and eXplainable artificial intelligence (FaX AI) | 2. Direct discrimination and inducement of indirect discrimination prevented | Arguably yes |
| Learning methods based on impact parity | 3. Affirmative action | Yes |

Businesses that are interested in using machine learning models for high-stakes decision-making have a couple of legally defensible choices. First, they may embrace voluntary affirmative action and use learning methods that combat disparate impact; these models, however, may be less accurate than models trained using other learning methods. The second option is to use the FaX AI method, since it permits the use of attributes associated with protected groups to a limited extent, which helps to develop more accurate models. We would argue that this method offers a minimum viable solution that complies with legal requirements. To use FaX AI with your favorite models, check out the FaX AI code library. If you are interested in the details of the FaX AI methods, please see our research paper.

References

[1] https://www.law.cornell.edu/uscode/text/42/2000e-2

[2] Various algorithms have been proposed to this end, e.g., the approach by Zafar et al. and the Exponentiated Gradient Reduction available in the AIF360 (AI Fairness 360) library.

[3] The information in this column is only for illustrative purposes. It assumes that the training data was tainted by discrimination and that all information necessary to accurately model decisions is available and correctly used for model development.
