🔬 Research Summary by Vincent Grari, a Qualified Actuary and Research Scientist at Sorbonne University and Axa Group, specializing in fair machine learning.
[Original paper by Vincent Grari, Arthur Charpentier, Sylvain Lamprier, Marcin Detyniecki]
Overview: Sacrificing predictive performance is often viewed as an unacceptable option in machine learning. However, we note that satisfying a fairness objective can reduce predictive performance too much, especially with generic fair algorithms. We have therefore developed a more suitable and practical framework using autoencoder techniques.
Over the past few years, machine learning algorithms have emerged in many different fields of application. However, this development is accompanied by a growing concern about their potential threats, such as their ability to reproduce discrimination against a particular group of people based on sensitive characteristics (e.g., religion, race, gender). Standard machine learning models only optimize accuracy and are prone to learn all information relevant to the task, whether it is sensitive or not. In particular, algorithms trained on biased data have been shown to learn, perpetuate, or even reinforce these biases. Many incidents of discrimination have been documented in recent years. For example, an algorithmic model used to generate predictions of criminal recidivism in the United States (COMPAS) discriminated against black defendants, and discrimination based on gender and race has been demonstrated in targeted, automated online advertising of employment opportunities. A new field of research, "fair machine learning," has emerged to find solutions to this problem, delivering standard approaches to debias any task. However, while this is effective in a large majority of cases, we argue that generic fairness algorithms can be counterproductive for specific applications, and in particular for insurance pricing.
Recently, there has been a dramatic rise of interest in academia and society in enforcing fairness in the predictions of machine learning models. Fairness with respect to gender, for example, has a long history. In 1978, the U.S. Supreme Court stated that a gender-based differential was discriminatory in its "treatment of a person in a manner which, but for that person's sex, would be different." The statute, which focuses on fairness to individuals rather than fairness to classes, precludes treating individuals simply as components of a group, such as the sexual class here. Even though it is true that women as a class outlive men, that generalization cannot justify disqualifying an individual to whom it does not apply. Following that decision, theoretical discussions about the fairness of gender-based pricing began. In Europe, the so-called "gender directive" (Council Directive 2004/113/EC) was introduced on 13 December 2004, even if it took almost ten years to provide legal guidelines on its application to insurance activities. The goal was to enforce the principle of equal treatment between men and women in the access to and supply of goods and services, including insurance. As a direct consequence, it prohibited the use of gender as a rating variable in insurance pricing. Gender equality in the European Union (EU) was supposed to be ensured from 21 December 2012 onward. However, even if the sensitive variable is excluded from the training data, complex correlations with other features may leak unexpected sensitive information. For example, a policyholder's height, car type, and occupation can be strongly correlated with their gender. For this reason, many fairness definitions have appeared in recent years, as equity is subjective and can be seen very differently by everyone; some authors have listed more than 20 different fairness criteria.
To satisfy some of them, especially the most famous one, "demographic parity," which requires independence between the prediction and the sensitive variable, it is essential to completely rethink the way we train the algorithm.
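As a concrete illustration (our own toy example, not taken from the paper), the demographic parity criterion for binary predictions is often measured as the gap between the positive-prediction rates of the two groups:

```python
import numpy as np

def demographic_parity_gap(y_pred, sensitive):
    """Absolute difference in positive-prediction rates between two groups.

    y_pred: binary predictions (0/1); sensitive: binary group labels (0/1).
    A gap of 0 means the binary prediction is independent of the
    sensitive attribute, i.e. demographic parity holds.
    """
    y_pred = np.asarray(y_pred)
    sensitive = np.asarray(sensitive)
    rate_a = y_pred[sensitive == 0].mean()
    rate_b = y_pred[sensitive == 1].mean()
    return abs(rate_a - rate_b)

# Hypothetical predictor that approves 75% of group 0 but only 25% of group 1:
preds = np.array([1, 1, 1, 0, 1, 0, 0, 0])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_gap(preds, groups))  # 0.5
```

A fair training procedure aims to drive this gap toward zero without collapsing predictive accuracy.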
Significant work has been done to include fairness constraints in the training objective of machine learning algorithms. In particular, the emergence of Generative Adversarial Networks (GANs), known for generating very realistic images of people or objects that are often indistinguishable from reality, has provided the required underpinning for fair predictors. Although their objectives are totally different, they have one thing in common: both require learning two neural networks that contest with each other in a game. To establish a fair predictor model, a second model is optimized simultaneously in a min-max game in order to find a trade-off between prediction accuracy and fairness. This method has proven to be the most powerful framework for settings where acting on the training process is an option, and many approaches in the literature have used it to address a wide range of biases in the prediction. However, we claim that mitigating undesired biases with a generic fair algorithm can be counterproductive for specific applications. For example, mitigating unwanted biases in insurance pricing with a traditional fair algorithm may be insufficient to maintain adequate accuracy. Indeed, the traditional pricing model is currently built in a two-stage structure that considers many potentially biased components, such as car or geographic risks. We have shown that this traditional structure has significant limitations in achieving fairness: there is a risk of not acting correctly on all of the components (e.g., some of them can be neutralized more than necessary, hurting the predictive task). Therefore, we have developed a more suitable and effective framework that satisfies a fairness objective while maintaining a sufficient level of predictor accuracy, extending the use of autoencoders to generate multiple aggregated pricing factors in a fairness context.
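To make the min-max game concrete, here is a minimal, hypothetical sketch (a simplification, not the paper's model) of the predictor-side objective: the predictor minimizes its own task loss while maximizing the adversary's loss at recovering the sensitive attribute from the prediction.

```python
import numpy as np

def binary_cross_entropy(target, prob, eps=1e-9):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    target, prob = np.asarray(target), np.asarray(prob)
    return -np.mean(target * np.log(prob + eps)
                    + (1 - target) * np.log(1 - prob + eps))

def predictor_objective(y_true, y_hat, s_true, s_hat, lam=1.0):
    """Predictor-side objective of the min-max game: L_pred - lam * L_adv.

    The adversary predicts the sensitive attribute s from y_hat; the
    predictor's objective DECREASES when the adversary fails, so gradient
    descent steers y_hat away from leaking sensitive information.
    """
    return binary_cross_entropy(y_true, y_hat) - lam * binary_cross_entropy(s_true, s_hat)

y = np.array([1, 0, 1, 0])                  # task labels
y_hat = np.array([0.9, 0.1, 0.8, 0.2])      # predictor probabilities
s = np.array([1, 1, 0, 0])                  # sensitive attribute
confused = np.full(4, 0.5)                  # adversary cannot recover s
accurate = np.array([0.9, 0.9, 0.1, 0.1])   # adversary recovers s well
# The objective is lower when the adversary is confused:
print(predictor_objective(y, y_hat, s, confused)
      < predictor_objective(y, y_hat, s, accurate))  # True
```

In practice both networks are trained alternately, and the hyperparameter `lam` (an assumption here) controls the accuracy-fairness trade-off.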
Between the lines
At the moment, we note that although the fairness community is making significant advances, there is a lack of effort on specific applications. It does not seem feasible to focus only on generic algorithms, and we argue that the next step in fairness research will be to investigate fair algorithms adapted to specific applications. Some practical, real-world applications do not permit sacrificing the performance of algorithms, such as medical applications (e.g., brain tumor detection); we must seek to maximize their performance while remaining fair. Aggregated components are commonly used for predictive machine learning tasks, but they are often not suitable for fairness. We believe that autoencoder techniques will improve this.
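As a minimal sketch of the aggregation idea (our own toy example, not the paper's architecture), a linear autoencoder can compress correlated pricing inputs into a few aggregated factors; a fair variant would additionally penalize dependence of these factors on the sensitive attribute.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_linear_autoencoder(X, k, lr=0.05, epochs=1000):
    """Compress d features into k aggregated factors via reconstruction MSE.

    Minimal linear autoencoder trained with plain gradient descent;
    the encoder output Z plays the role of the aggregated pricing factors.
    """
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, k))
    W_dec = rng.normal(scale=0.1, size=(k, d))
    for _ in range(epochs):
        Z = X @ W_enc            # aggregated factors (the code)
        err = Z @ W_dec - X      # reconstruction error
        W_dec -= lr * (Z.T @ err) / n
        W_enc -= lr * (X.T @ (err @ W_dec.T)) / n
    return W_enc, W_dec

# Hypothetical pricing inputs: 5 features, two pairs strongly correlated,
# so 3 factors suffice to reconstruct the data well.
X = rng.normal(size=(200, 5))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=200)
X[:, 4] = X[:, 1] + 0.1 * rng.normal(size=200)
W_enc, W_dec = train_linear_autoencoder(X, k=3)
mse = np.mean((X @ W_enc @ W_dec - X) ** 2)
```

Replacing raw, potentially biased components with such learned factors is what lets a fairness penalty act on the representation itself rather than on each component separately.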