Melting contestation: insurance fairness and machine learning

🔬 Research Summary by Laurence Barry and Arthur Charpentier.

Laurence Barry is an independent actuary and a researcher at PARI (Programme de Recherche sur l’Appréhension des Risques et des Incertitudes, ENSAE/Sciences Po).

Arthur Charpentier is a professor in Montréal, and is the former director of the Data Science for Actuaries program of the French Institute of Actuaries.

[Original paper by Laurence Barry and Arthur Charpentier]

Overview: Machine learning tends to replace the actuary in the selection of features and the building of pricing models. However, avoiding subjective judgments thanks to automation does not necessarily mean that biases are removed. Nor does the absence of bias warrant fairness. This paper critically analyzes discrimination and insurance fairness with machine learning.

Introduction

Insurers have often been confronted with data-related issues of fairness and discrimination. This paper provides a comparative review of the discrimination issues raised by traditional statistics versus machine learning in the context of insurance. We first examine historical contestations of insurance classification, showing that they were organized around three types of bias: pure stereotypes, non-causal correlations, and causal effects that a society chooses to protect against. These three types are the main sources of dispute. The lens of this typology then allows us to look anew at the potential biases in insurance pricing implied by big data and machine learning, showing that despite utopian claims, social stereotypes continue to plague data and thus threaten to reproduce these discriminations unconsciously in insurance. To counter these effects, algorithmic fairness attempts to define mathematical indicators of non-bias. This may prove insufficient, since it assumes that specific protected groups exist, and such groups can only be made visible through public debate and contestation. Debate and contestation are less likely if the right to explanation is realized through personalized algorithms, which could reinforce an individualized perception of society that blocks rather than encourages collective mobilization.

Key Insights

Insurance fairness is a dynamic concept that depends on historical, cultural, and technical contexts. At the height of the industrial era, the veil of ignorance explained the equality of the greatest number in the face of an unknown adversity, justifying a very broad coverage in terms of solidarity (Ewald, 1986). During the twentieth century, more segmented models were put in place with the growing capacities of data collection and calculation. Still, insurance remained based on risk classes, perceived as homogeneous groups of similar people (Barry, 2020). From the 1980s onwards, controversies arose over the use of particular variables, and these controversies feed current criticisms of the biases and discriminations associated with machine learning. Examining this history allows us to identify a few main families of bias in traditional classification practices and their recent displacement with machine-learning algorithms.

Type 1 bias: pure prejudice

Some critics pointed out the prejudices that informed the statistician’s choice of variables and warned against the “myth of the actuary,” who would build supposedly objective models based on his own conception of what is risky, moral, or legitimate behavior. In principle, this type of bias should have disappeared with big data, as most data are now natively digital and thus allow us to bypass the earlier manual work of quantification. However, over the last twenty years, the embeddedness of social prejudices in data has been amply established; a blind use of machine learning would therefore reproduce these biases in the models.

Type 2 bias: non-causal correlation

Another criticism pointed out the use of variables that are correlated with the risk but not truly causal. For example, the use of the man/woman parameter and of credit scores has provoked controversies built on this kind of argument. The solution, surely unworkable in practice, would be to limit models to purely causal variables. Interestingly, some data scientists today advocate a shift to a new episteme, in which a model would derive its legitimacy from its accuracy rather than from its interpretability or its causal form. Current algorithms indeed capture correlations without making these links explicit. This magnifies type 2 biases and introduces a further bias due to the opacity of the algorithms, even if that opacity is the counterpart of greater precision.

Type 3 bias: reinforcing social injustices through classification

Another family of critics rejected the idea of a “true” classification altogether. Type 3 biases in insurance come from truly causal variables that reflect a hazard that society has chosen to mutualize. The use of genetic data in health insurance, for example, is prohibited in most countries. In this case, insurance is seen as a way not to reflect the risk but to have it borne by the entire insured population, by eliminating the variable from the models. Eliminating protected variables was very effective in traditional models, but it is much more difficult to implement with big data and machine learning: with big data, protected variables are captured via their correlation with other variables, and with machine learning, the opacity of the algorithms makes this effect harder to detect.
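The proxy effect described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the paper: the four-row portfolio, the proxy variable `z`, the cost figures, and the helper names `premiums_by` and `average_premium_by_group`. The tariff never sees the protected group `g`, yet a proxy that matches `g` for 90% of policyholders carries most of the cost gap back into the premiums.

```python
from collections import defaultdict

# Hypothetical portfolio (all figures invented for illustration):
# (protected group g, proxy z, expected cost, head count)
PORTFOLIO = [
    (1, 1, 150.0, 450),
    (1, 0, 150.0, 50),
    (0, 1, 100.0, 50),
    (0, 0, 100.0, 450),
]

def premiums_by(key_index, portfolio):
    """Pure premium: average expected cost within each class of one variable."""
    cost = defaultdict(float)
    count = defaultdict(int)
    for row in portfolio:
        key = row[key_index]
        cost[key] += row[2] * row[3]
        count[key] += row[3]
    return {k: cost[k] / count[k] for k in cost}

def average_premium_by_group(tariff, tariff_index, portfolio):
    """Average premium actually paid by each protected group under a tariff."""
    paid = defaultdict(float)
    count = defaultdict(int)
    for row in portfolio:
        g, n = row[0], row[3]
        paid[g] += tariff[row[tariff_index]] * n
        count[g] += n
    return {g: paid[g] / count[g] for g in paid}

# Tariff "blind" to g: classes are built on the proxy z only.
tariff_on_proxy = premiums_by(1, PORTFOLIO)
paid_by_group = average_premium_by_group(tariff_on_proxy, 1, PORTFOLIO)

print(tariff_on_proxy)   # premium per proxy class
print(paid_by_group)     # average premium per protected group
```

In this toy portfolio, a fully mutualized premium would be 125 for everyone. Pricing on the proxy instead yields average premiums of 141 and 109 for the two protected groups, so 32 of the original 50-unit cost gap survives the removal of the protected variable.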

Fairness through contestation

Beyond the difficulties of eliminating traditional biases with new technologies, historical studies also highlight the importance of debate and contestation in defining models that can be perceived as fair. In this respect, the attempt to ensure algorithmic fairness using mathematical indicators of the “absence of bias” might prove insufficient. Protected groups came to be recognized as such only through discussion and sometimes legal action, whereas algorithmic fairness takes these groups as given.
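To make the point about indicators concrete, here is a minimal sketch of one common “absence of bias” indicator, a demographic-parity-style gap in average premiums. The function name `demographic_parity_gap`, the premiums, and the group labels are hypothetical and are not taken from the paper. The thing to notice is the signature: the indicator cannot be computed unless someone has already decided which groups to compare.

```python
def demographic_parity_gap(premiums, groups):
    """Largest difference in average premium between any two labeled groups.

    The group labels are an input: the indicator presupposes that the
    protected groups have already been identified.
    """
    totals, counts = {}, {}
    for premium, group in zip(premiums, groups):
        totals[group] = totals.get(group, 0.0) + premium
        counts[group] = counts.get(group, 0) + 1
    means = {g: totals[g] / counts[g] for g in totals}
    return max(means.values()) - min(means.values()), means

# Hypothetical premiums and group labels, invented for illustration.
premiums = [120.0, 130.0, 110.0, 100.0, 90.0, 110.0]
groups = ["A", "A", "A", "B", "B", "B"]

gap, means = demographic_parity_gap(premiums, groups)
print(means, gap)
```

A gap of zero under one labeling says nothing about groups that were never labeled, which is why such an indicator may complement, but not replace, public debate about which groups deserve protection.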

While understanding the model is certainly necessary, explainability does not warrant contestability. The local nature of current explanatory algorithms is inevitable, given the non-linearity of the models they explain. But it might reinforce the individualized perception of the social, which blocks rather than fosters collective mobilization. Should insurance then stick to the good old pricing tables, for which all the parameters are explicit, known in advance, and therefore open to challenge? Just as any scientific theory must be falsifiable, a pricing system should be transparent to be contestable. In any case, one should be wary of replacing the myth of the actuary with the myth of the algorithm.

Between the lines

Current research on actuarial fairness focuses on mathematical indices of the absence of bias. But the capacity to debate the use of specific features that might otherwise remain unnoticed is crucial to ensure the basic premise of insurance: that the most vulnerable get protection. Honoring it sometimes means accepting cross-subsidies between groups (hence, statistical biases) for the sake of the common good.