• Skip to main content
  • Skip to secondary menu
  • Skip to primary sidebar
  • Skip to footer
Montreal AI Ethics Institute

Montreal AI Ethics Institute

Democratizing AI ethics literacy

  • Articles
    • Public Policy
    • Privacy & Security
    • Human Rights
      • Ethics
      • JEDI (Justice, Equity, Diversity, Inclusion
    • Climate
    • Design
      • Emerging Technology
    • Application & Adoption
      • Health
      • Education
      • Government
        • Military
        • Public Works
      • Labour
    • Arts & Culture
      • Film & TV
      • Music
      • Pop Culture
      • Digital Art
  • Columns
    • AI Policy Corner
    • Recess
    • Tech Futures
  • The AI Ethics Brief
  • AI Literacy
    • Research Summaries
    • AI Ethics Living Dictionary
    • Learning Community
  • The State of AI Ethics Report
    • State of AI Ethics Report Volume 8 (2026): Call for Contributors
    • Volume 7 (November 2025)
    • Volume 6 (February 2022)
    • Volume 5 (July 2021)
    • Volume 4 (April 2021)
    • Volume 3 (Jan 2021)
    • Volume 2 (Oct 2020)
    • Volume 1 (June 2020)
  • About
    • Our Contributions Policy
    • Our Open Access Policy
    • Contact
    • Donate

Fairness implications of encoding protected categorical attributes

June 17, 2022

šŸ”¬ Research Summary by Carlos Mougan, a Ph.D. candidate at the University of Southampton within the Marie Sklodowska-Curie ITN of NoBias.

[Original paper by Carlos Mougan, Jose M. Alvarez, Gourab K Patro, Salvatore Ruggieri and Steffen Staab]


Overview: Protected attributes (such as gender, religion, or race) are often presented as categorical features that need to be encoded before feeding them into an ML algorithm. Encoding these attributes is paramount as they determine the way the algorithm will learn from the data. Categorical feature encoding has a direct impact on the model performance and fairness. In this work, we investigate the accuracy and fairness implications of the two most well-known encoders: one-hot encoding and target encoding.Ā 


Introduction

Sensitive attributes are central to fairness and so are their handling throughout the machine learning pipeline. Many machine learning algorithms require categorical attributes to be suitably encoded as numerical data before being fed to algorithms.

What are the implications of encoding categorical protected attributes?

In previous fairness works, the presence of sensitive attributes is assumed, and so is their feature encoding. Given the range of available categorical encoding methods and the fact that they often must deal with sensitive attributes, we believe this first study on the subject to be highly relevant to the fair machine learning community. 

What encoding method is best in terms of fairness? Can we improve fairness with encoding hyperparameters? Does having a fair model imply having a less performant ML model?

Key Insights

Types of induced bias

Two types  bias are induced when encoding categorical protected attributes, to illustrate this types of induced bias we use the famous Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) dataset:

  • Irreducible bias: The primary problem of unfairness is acquired from the use of protected attributes.  It refers to (direct) group discrimination arising from the categorization of groups into labels: more data about the compared groups do not reduce this type of bias. In the COMPAS dataset, criminal ethnicity was paramount when determining recidivism scores; the numerical encoding of large ethnicity groups such as African-Americans or Caucasian may lead to discrimination, which is an unfair effect coming from the irreducible bias.
  • Reducible bias:  arises due to the variance when encoding groups that have a small statistical representation, sometimes even only very few instances of the group.  Reducible bias can be found and introduced when encoding the ethnicity category Arabic, which is rarely represented in the data, provoking a large sampling variance that ends in an almost random and unrealistic encoding.

Encoding methods

Handling categorical features is a common problem in machine learning, given that many algorithms needs to be fed with numerical data. We review two of the most well-known traditional methods:

An illustrative example of a feature that need to be encoded

  • One hot encoding: Is the most established encoding method for categorical features, is also the default method within the fairness literature. This encoding method constructs orthogonal and equidistant vectors for each category.

One Hot Encoding over illustrative example

  • Target Encoding: categorical features are replaced with the mean target value of each respective category. This technique handles high cardinality categorical data and categories are ordered. The main drawback of target encoding appears when categories with few samples are replaced by values close to the desired target. This introduces bias to the model as it over-trusts the target encoded feature and makes the model prone to overfitting and reducible bias.

Furthermore, this type of encoding allows a regularization hyperparameter of adding Gaussian noise to the category. 

Target Encoding over illustrative example.

Figure: Comparing one-hot encoding and target encoding regularization (Gaussian noise) for the Logistic Regression overĀ  COMPAS dataset. The protected group is African-American. The reference group is Caucasian. Red dots regard different regularization parameters: the darker the red the higher the regularization. Blue dot regards the one-hot encoding.

In the above experiment, we showed that the most used categorical encoding method in the fair machine learning literature, one-hot encoding, discriminates more in terms of equal opportunity fairness than target encoding. However, target encoding shows promising results. Target encoding using Gaussian regularization show improvements under the presence of both types of biases, with the risk of a noticeable loss of model performance in the case of over-parametrization.

Between the lines

In recent years we have seen algorithmic methods aiming to improve fairness in data-driven systems from many perspectives: data collection, pre-processing, in-processing, and post-processing steps. In this work, we have focused on how the encoding of categorical attributes (a common pre-processing step) can reconcile model quality and fairness. 

 A common underpinning of much of the work in fair ML is the assumption that trade-offs between equity and accuracy may necessitate complex methods or difficult policy choices [Rodolfa et al.]

Since target encoding with regularization is easy to perform and does not require significant changes to the machine learning models, it can be explored in the future as a suitable complementary for in-processing methods in fair machine learning.

Acknowledgments

This work has received funding from the European Union’s Horizon 2020 research and innovation program under Marie Sklodowska-Curie Actions (grant agreement number 860630) for the project ā€˜ā€™NoBIAS – Artificial Intelligence without Bias’’

Disclaimer

This work reflects only the authors’ views and the European Research Executive Agency (REA) is not responsible for any use that may be made of the information it contains.

Want quick summaries of the latest research & reporting in AI ethics delivered to your inbox? Subscribe to the AI Ethics Brief. We publish bi-weekly.

Primary Sidebar

SAIER Volume 8 (2026)

SAIER Volume 8 (2026) Call for Contributors

šŸ” SEARCH

Spotlight

Vertically- and horizontally-placed chess boards and chess pieces

Tech Futures: At the Frontier of Fear, Uncertainty and Doubt

Tech Futures: Introducing the Resist List

An abstract spiral of dark circles appears at the centre, resembling a tornado. Several vintage magazine covers and advertisements are being drawn toward the spiral. The artworks that have already been pulled into it are becoming distorted and replaced with clusters of numbers representing their numerical embeddings.

Tech Futures: Better Imagination for Better Tech Futures

This image is a collage with a colourful Japanese vintage landscape showing a mountain, hills, flowers and other plants and a small stream. There are 3 large black data servers placed in the bottom half of the image, with a cloud of black smoke emitting from them, partly obscuring the scenery.

Tech Futures: Crafting Participatory Tech Futures

A network diagram with lots of little emojis, organised in clusters.

Tech Futures: AI For and Against Knowledge

related posts

  • Ethics-based auditing of automated decision-making systems: intervention points and policy implicati...

    Ethics-based auditing of automated decision-making systems: intervention points and policy implicati...

  • Modeling Content Creator Incentives on Algorithm-Curated Platforms

    Modeling Content Creator Incentives on Algorithm-Curated Platforms

  • Do Large GPT Models Discover Moral Dimensions in Language Representations? A Topological Study Of Se...

    Do Large GPT Models Discover Moral Dimensions in Language Representations? A Topological Study Of Se...

  • Breaking Fair Binary Classification with Optimal Flipping Attacks

    Breaking Fair Binary Classification with Optimal Flipping Attacks

  • The Impact of Artificial Intelligence on Military Defence and Security

    The Impact of Artificial Intelligence on Military Defence and Security

  • Research summary: Algorithmic Injustices towards a Relational Ethics

    Research summary: Algorithmic Injustices towards a Relational Ethics

  • From the Gut? Questions on Artificial Intelligence and Music

    From the Gut? Questions on Artificial Intelligence and Music

  • On the sui generis value capture of new digital technologies: The case of AI

    On the sui generis value capture of new digital technologies: The case of AI

  • Tiny, Always-on and Fragile: Bias Propagation through Design Choices in On-device Machine Learning W...

    Tiny, Always-on and Fragile: Bias Propagation through Design Choices in On-device Machine Learning W...

  • NATO Artificial Intelligence Strategy

    NATO Artificial Intelligence Strategy

Partners

  • Ā 
    U.S. Artificial Intelligence Safety Institute Consortium (AISIC) at NIST

  • Partnership on AI

  • The LF AI & Data Foundation

  • The AI Alliance

Footer


Articles

Columns

AI Literacy

The State of AI Ethics Report


 

About Us


Founded in 2018, the Montreal AI Ethics Institute (MAIEI) is an international non-profit organization equipping citizens concerned about artificial intelligence and its impact on society to take action.

Contact

Donate


  • Ā© 2025 MONTREAL AI ETHICS INSTITUTE.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.
  • Learn more about our open access policy here.
  • Creative Commons License

    Save hours of work and stay on top of Responsible AI research and reporting with our bi-weekly email newsletter.