🔬 Research Summary by Rishi Balakrishnan, a student at UC Berkeley passionate about algorithmic fairness, privacy, and trustworthy AI more broadly.
[Original paper by Ira Globus-Harris, Michael Kearns, Aaron Roth]
Overview: In this paper, the authors propose a new framework for “bias bounties”, a method for auditing automated systems where system users are incentivized to identify and report instances of unfairness. The authors’ proposed framework takes bias bounties a step further, so that feedback not only highlights possible discrimination but also automatically improves the model.
In 2020, during the height of the pandemic, Zoom users discovered something troubling – Zoom’s virtual background often failed to recognize black faces, erasing them from the screen in the process. When users took to Twitter to post images of this differential treatment, they found that Twitter’s cropping algorithm wasn’t much better – it also failed to recognize black faces and cropped those parts out of the image.
In response, Twitter hosted a “bias bounty” for its image cropping algorithm – a contest in which Twitter users were monetarily incentivized to report instances of bias in the algorithm. Bias bounties have quickly become a framework for auditing algorithms for fairness concerns, and in this paper, the authors propose an improved framework for bias bounties that not only finds bias but also automatically improves the underlying model.
Before diving into the technical details of their framework, the authors first outline the fairness criteria they use in designing their system. Popular notions of fairness (like demographic parity and equalized odds) aim to standardize model outputs and performance across different groups. While a reasonable goal on its face, a common effect of these fairness criteria is to hurt accuracy for some groups in an effort to equalize performance across the board. As the authors rightly point out, this is far from ideal and pits the different groups in the training data against each other. The authors instead use a notion of fairness called “multigroup agnostic learnability,” under which fairness is achieved when the model performs as well as it possibly can for every group.
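As a rough sketch (this paraphrases the idea rather than reproducing the paper’s exact definition), a model \(h\) is optimal in this sense if, for every group \(g\) in some collection of groups \(\mathcal{G}\), and relative to a class of candidate models \(\mathcal{H}\):

\[
\operatorname{err}(h \mid g) \;\le\; \min_{h' \in \mathcal{H}} \operatorname{err}(h' \mid g) + \epsilon \quad \text{for all } g \in \mathcal{G}
\]

That is, no group could do meaningfully better under any alternative model, so improving performance for one group never requires degrading it for another.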
Having defined their notion of fairness, the authors introduce their framework for bias bounties. Within this template, stakeholders are empowered not only to find groups that the model underperforms for, but also to create models that perform better for those groups. They can then report their findings (the group and the improved model), and once those findings are verified, the new model is incorporated into the current one. The integration process is simple: if the system is classifying somebody within the identified group, it uses the newly proposed model; otherwise, it uses the original model. This approach has a number of benefits laid out by the authors: groups can be arbitrarily specified (and can even overlap), and unlike many other fairness approaches, where protected groups need to be specified before creating the model, this bias bounty template flexibly incorporates many different group definitions. Moreover, the authors prove that such a system quickly converges to the optimal classifier for the training data. In many ways, this framework is a significant improvement on current approaches to ensuring good model performance across subgroups.
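The integration step described above can be sketched in a few lines of code. This is an illustrative simplification of the mechanism, not the authors’ implementation; the function names and the simple error comparison used for verification are assumptions made for the example:

```python
def make_patched_model(base_model, patches):
    """Combine a base model with accepted (group, model) repairs.

    patches: list of (in_group, group_model) pairs, newest first, where
    in_group(x) -> bool tests group membership and group_model(x) predicts.
    """
    def predict(x):
        for in_group, group_model in patches:
            if in_group(x):          # member of a flagged group:
                return group_model(x)  # use the accepted replacement model
        return base_model(x)         # otherwise, fall back to the original
    return predict


def accept_if_improved(base_model, in_group, group_model, data):
    """Verify a bounty submission: accept the proposed model only if it
    has strictly lower error than the current model on the flagged group."""
    group = [(x, y) for x, y in data if in_group(x)]
    if not group:
        return None
    err = lambda m: sum(m(x) != y for x, y in group) / len(group)
    if err(group_model) < err(base_model):
        return make_patched_model(base_model, [(in_group, group_model)])
    return None  # submission rejected: no verified improvement
```

On a toy dataset, a submission that genuinely improves accuracy on its group is folded into the deployed predictor, while one that doesn’t is rejected – mirroring the verify-then-incorporate loop the authors describe.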
Bias bounties and participatory research
The bias bounty is a version of algorithmic auditing that is far more transparent and accountable to the general public. Many proposed algorithmic audits are carried out by experts appointed to examine an algorithm; participation in bias bounties, by contrast, is broadly open to the public. This framework for bias bounties also offers increased control to affected stakeholders: not only can they identify biases in models that affect them, they can also propose improvements that are guaranteed to be incorporated. In this way, this version of a bias bounty is much more aligned with the framework of participatory research, in which affected stakeholders provide input on how a model should function and ensure that the machine learning systems that affect them are responsive to their needs.
Areas for improvement
While the paper provides a great new template for a bias bounty model, room for future improvements still exists. The authors explicitly mention several limitations of this model: primarily, it can only detect biases that are within the dataset. Concretely, this means that people who want to submit better models can only evaluate these models on the training data released by the model creators – they cannot use data collected from other sources. This makes it impossible to fix biases that emerge as a result of a biased data collection process. However, the authors argue that while their template cannot fix issues with data collection, it can provide indicators of where better data collection is needed. If several auditors propose models for a given subgroup but model performance on that subgroup still remains low, that provides an indication that better data collection is needed for that group.
Another area for improvement is the notion of fairness used in the paper. Simply aiming to increase accuracy for all groups is a rather weak notion of fairness; when the data itself is biased, improving accuracy means reproducing harm. An example: some years ago, Amazon aimed to make its hiring process more efficient by introducing an algorithm to automatically sort through candidate applications. The easiest (and most sensible) training data to gather was applications from previous candidates who were eventually hired. However, Amazon quickly found that its hiring algorithm was discriminatory – most of its engineers were men, and the machine learning algorithm very quickly picked up on (and replicated) the bias against women in the hiring process. In situations like these – and in even higher-stakes contexts like criminal justice – maximizing accuracy can mean denying opportunities to minority groups. The authors criticize notions of fairness that aim to equalize outcomes across subgroups because they reduce performance for all subgroups, but these are not the only two options. More recent notions of fairness aim to move past statistical estimates and adopt a causal view, in which the effects of sensitive attributes are traced through specific pathways that are deemed fair or unfair. For example, a job applicant’s gender may affect their choice of major in college, but their educational background is still relevant to their merit as a candidate; their gender may also affect their marital status, which should not affect their merit as a candidate.
Finally, while well intended, explicitly using group membership to determine which model to use may run into several problems. In many cases, group membership is not easy to determine: the authors study a setup where group membership is one of the features of the data, but in cases like the Twitter image cropping bounty, a picture of a person doesn’t come annotated with their race. Moreover, routing individuals to different models based on group membership may raise discrimination concerns under the “disparate treatment” doctrine, a legal framework that forbids the use of protected attributes in decision-making. Many solutions that aim to improve model fairness risk running into this problem by using sensitive attributes as model inputs, but that risk is heightened with this bias bounty framework because group membership (like race) is not only an input but also determines which model is used.
Between the lines
In many ways, this work sets the stage for increased stakeholder participation in machine learning, something that is becoming necessary as algorithms affect broader swathes of society. The authors’ effort to conduct technical research that elicits and incorporates stakeholder feedback into model development is incredibly important. Research like this helps make machine learning less insular and represents an important step in integrating pure machine learning research with broader socio-technical efforts to understand the effects of machine learning on our lives.