🔬 Research Summary by Conrad Sanderson, Senior Research Scientist, Data61/CSIRO, Australia.
[Original paper by Conrad Sanderson, David Douglas, and Qinghua Lu]
Overview: Many ethical principles for responsible AI have been proposed to allay concerns about the misuse and abuse of AI/ML systems, covering aspects such as privacy, accuracy, fairness, robustness, explainability, and transparency. However, tensions between these aspects pose difficulties for AI/ML developers seeking to follow these principles. As part of the ongoing effort to operationalize the principles into practice, in this paper we compile and discuss a catalog of 10 notable tensions, trade-offs, and other interactions between the underlying aspects. This catalog can help raise awareness of the possible interactions between aspects of ethics principles and facilitate well-supported judgments by the designers and developers of AI/ML systems.
Introduction
While AI ethics principles are well-intended, a major issue is that they are high-level and do not readily provide guidance on implementing them concretely within AI/ML systems. Furthermore, attempts at operationalizing these principles reveal that many can be in tension with each other. This, in turn, can lead to suboptimal outcomes: the designers and developers of AI/ML systems may haphazardly resolve a tension by simply selecting one principle to be dominant, rather than devising a well-balanced trade-off between the principles in tension, in which the associated risks and benefits are gauged more thoroughly.
There are many observed interactions (positive, negative, and context-dependent) between various two- and three-sided combinations of the common aspects of AI ethics principles. However, descriptions of such interactions are spread across diverse and disparate literature. This may contribute to a lack of awareness among the designers and developers of AI/ML systems about the wide range and nature of the possible interactions between the underlying aspects.
Key Insights
The following interactions between AI ethics aspects are overviewed:
- Accuracy vs. Robustness
- Accuracy vs. Fairness
- Accuracy vs. Privacy
- Accuracy vs. Explainability
- Fairness vs. Robustness
- Fairness vs. Privacy
- Fairness vs. Transparency
- Privacy vs. Robustness
- Transparency vs. Explainability
- Transparency vs. Privacy and Robustness
A. Accuracy vs. Robustness.
ML models that achieve high accuracy under the conditions defined by the training dataset can degrade quickly when the conditions encountered during deployment differ significantly. The accuracy of AI/ML systems can also degrade over time due to natural data drift, so periodic recalibration and/or retraining may be necessary. However, blindly updating the training data without thoroughly testing the retrained system may result in instabilities and/or unintended changes in operation.
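As one illustration, drift can be flagged by comparing the distributions of live inputs against a reference sample from training, before any recalibration is triggered. Below is a minimal sketch, assuming a two-sample Kolmogorov-Smirnov test per feature; the threshold and synthetic data are illustrative choices, not from the paper.

```python
# Minimal sketch: flagging data drift with a two-sample Kolmogorov-Smirnov test.
# The alpha threshold and synthetic data are illustrative assumptions.
import numpy as np
from scipy.stats import ks_2samp

def drifted_features(reference: np.ndarray, live: np.ndarray, alpha: float = 0.01):
    """Return indices of features whose live distribution differs from the reference."""
    drifted = []
    for j in range(reference.shape[1]):
        _, p_value = ks_2samp(reference[:, j], live[:, j])
        if p_value < alpha:  # reject "same distribution" for this feature
            drifted.append(j)
    return drifted

rng = np.random.default_rng(0)
reference = rng.normal(size=(5000, 3))  # conditions captured by the training data
live = rng.normal(size=(5000, 3))
live[:, 2] += 0.5                       # simulated natural drift in one feature
if drifted_features(reference, live):
    print("Drift detected: consider recalibration/retraining, then re-test thoroughly.")
```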
Adversarial attacks may use data designed to fall outside the normal conditions an AI/ML model was trained to handle. Increasing robustness against adversarial attacks may involve augmenting the training dataset, for example, with perturbed versions of the original data.
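A minimal sketch of this kind of augmentation is shown below, assuming a logistic-regression model and an FGSM-style perturbation (shifting each sample along the sign of the loss gradient); the epsilon value and the model are illustrative, as the paper does not prescribe a specific attack.

```python
# Minimal sketch: augmenting training data with adversarially perturbed copies
# (FGSM-style, for a logistic-regression model). Epsilon and the model are
# illustrative assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(X, y, w, epsilon=0.1):
    """Shift each sample in the direction that increases the logistic loss."""
    grad_x = (sigmoid(X @ w) - y)[:, None] * w[None, :]  # dLoss/dX per sample
    return X + epsilon * np.sign(grad_x)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
w_true = np.array([1.0, -2.0, 0.5, 0.0])
y = (X @ w_true > 0).astype(float)

w_model = w_true + rng.normal(scale=0.1, size=4)  # stand-in for a trained model
X_adv = fgsm_perturb(X, y, w_model)
X_aug = np.vstack([X, X_adv])                     # augmented training set
y_aug = np.concatenate([y, y])
```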
When robustness is defined as resilience to label noise, a positive interaction between accuracy and robustness is likely. When robustness is defined as resistance to adversarial attacks, the interaction depends on the amount of training data: for small training datasets, robustness provisions can act as a form of regularization and may increase accuracy, whereas for larger training datasets, considering robustness is likely to decrease accuracy.
B. Accuracy vs. Fairness.
Incorporating fairness provisions typically hinders accuracy. With fairness defined as group fairness, overall accuracy is expected to decrease as fairness increases. Accuracy can be group-dependent even if the training dataset is balanced: accuracy for a given group may be inherently lower than for other groups (e.g., the relatively low accuracy of face recognition technology for women with darker skin tones).
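Group-dependent accuracy can be made visible by computing accuracy separately per protected group. Below is a minimal sketch, assuming binary labels and two synthetic groups; the group labels, error rates, and data are illustrative, not from the paper.

```python
# Minimal sketch: measuring per-group accuracy and a group-fairness gap.
# Groups, error rates, and data are synthetic illustrations.
import numpy as np

def per_group_accuracy(y_true, y_pred, groups):
    """Accuracy computed separately for each protected group."""
    return {g: float(np.mean(y_pred[groups == g] == y_true[groups == g]))
            for g in np.unique(groups)}

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
groups = rng.choice(["A", "B"], size=1000, p=[0.8, 0.2])  # group B under-represented
y_pred = y_true.copy()
flip = (groups == "B") & (rng.random(1000) < 0.2)         # model is worse on group B
y_pred[flip] = 1 - y_pred[flip]

acc = per_group_accuracy(y_true, y_pred, groups)
print(acc, "accuracy gap:", max(acc.values()) - min(acc.values()))
```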
In medical applications, the applicability of an AI/ML system may need to be restricted to specific ethnic groups due to limitations in the availability of sufficient high-quality training data. Without such restrictions, the system is likely to provide inaccurate outputs.
C. Accuracy vs. Privacy.
There are typically minor decreases in accuracy due to the use of differential privacy. The decrease in accuracy can be considerably greater for under-represented groups, which can affect group fairness. Accuracy can also decrease when privacy considerations drive minimizing the acquisition of personally identifiable information (PII), whose exposure can result from security lapses or adversarial attacks. Where the use of federated learning is appropriate for addressing privacy concerns, accuracy may decrease if there are significant differences in the nature of the separate datasets held in the separate computing nodes.
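The accuracy cost of differential privacy stems from deliberately perturbing the learning process. Below is a minimal sketch of one DP-SGD-style update, assuming per-example gradient clipping followed by Gaussian noise; the clip norm and noise multiplier are illustrative assumptions, and a calibrated implementation would derive them from a target (epsilon, delta) budget.

```python
# Minimal sketch of one DP-SGD-style update: clip each per-example gradient,
# then add Gaussian noise to the sum. clip_norm and noise_multiplier are
# illustrative assumptions, not calibrated to a specific privacy budget.
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1,
                rng=np.random.default_rng(0)):
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)  # bound each gradient
    summed = clipped.sum(axis=0)
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)  # noisy average gradient

grads = np.random.default_rng(1).normal(size=(64, 10))  # 64 per-example gradients
update = dp_sgd_step(grads)
```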
D. Accuracy vs. Explainability.
Designers of AI/ML systems may choose to employ methods/models that are easier to interpret but less accurate, rather than more accurate approaches whose underlying reasoning is more difficult to convey. However, accuracy and explainability are not necessarily mutually exclusive, as it is possible to devise a trade-off between them by varying model complexity. Simplified models can also be used purely to drive explanations rather than as replacements for the original models. Explanations obtained via simplified models do not affect the accuracy of the original model but may inaccurately represent its underlying processing.
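A common form of this idea is a surrogate model: a simple model fitted to the black-box model's predictions and used only for explanation. Below is a minimal sketch, assuming a random forest as the black box and a shallow decision tree as the surrogate; the models and dataset are illustrative. The fidelity score quantifies how faithfully the surrogate mimics the original, echoing the caveat that its explanations may misrepresent the underlying processing.

```python
# Minimal sketch: a simplified surrogate model (shallow decision tree) trained
# to mimic a black-box model, purely to drive explanations. Model choices and
# data are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=6, random_state=0)
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Fit the surrogate to the black box's *predictions*, not the true labels:
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

fidelity = np.mean(surrogate.predict(X) == black_box.predict(X))
print(f"surrogate fidelity to black box: {fidelity:.2%}")
print(export_text(surrogate))  # human-readable rules; may misrepresent the original
```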
E. Fairness vs. Robustness.
For robust functionality, the range of allowable conditions may need to be constrained, which is in tension with fairness. Using an AI/ML system beyond what the training dataset covers may result in unreliable operation.
Group fairness can be reduced when considering robustness against adversarial attacks, even when the training dataset is balanced across the classes. Inherent differences in accuracy across groups can be exacerbated.
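Such effects can be checked empirically by comparing per-group accuracy on clean and perturbed inputs. The following is a minimal sketch, assuming a logistic-regression model, synthetic groups, and random Gaussian perturbations as a stand-in for an actual attack; all of these are illustrative assumptions.

```python
# Minimal sketch: checking whether robustness pressures affect groups unevenly,
# by comparing per-group accuracy on clean vs. perturbed inputs. The model,
# groups, and perturbation are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=5, random_state=0)
groups = np.random.default_rng(0).choice(["A", "B"], size=len(y))
model = LogisticRegression().fit(X, y)

noise = np.random.default_rng(1).normal(scale=0.5, size=X.shape)
for g in ("A", "B"):
    m = groups == g
    clean = model.score(X[m], y[m])
    perturbed = model.score(X[m] + noise[m], y[m])  # degradation under perturbation
    print(f"group {g}: clean={clean:.3f} perturbed={perturbed:.3f}")
```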
F. Fairness vs. Privacy.
Fairness tends to be reduced as a byproduct of employing differential privacy. Unfairness can be exacerbated if the original AI/ML model produces unfair outcomes, with the decrease in fairness more pronounced for groups with inadequate representation. It is possible to achieve approximate fairness under the constraints of differential privacy with specifically constructed machine learning models.
G. Fairness vs. Transparency.
Increasing transparency may reveal that an AI/ML system is unfair, thereby creating pressure to increase its fairness. Conversely, increasing transparency can be an effective means of demonstrating that the system is fair.
The appropriate definition of fairness for an AI/ML system will depend on the context of its use. Disclosing how designers have understood and implemented fairness will assist end users in evaluating whether it is likely to produce fair outcomes. This is particularly important where end users are accountable for the decisions made using the AI/ML system.
Malicious actors may exploit an AI/ML system based on the information disclosed about the system. Transparency may need to be limited to protect the fairness of the system. Designers should consider to whom the system should be transparent and what information should be provided.
H. Privacy vs. Robustness.
Privacy can be linked with robustness through the overarching aim of preventing the misuse of AI/ML systems. Both differential privacy and robustness to adversarial attacks can be incorporated into the same AI/ML approach.
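As a minimal sketch of this combination, the training step below pairs FGSM-style adversarial inputs with a DP-SGD-style clipped and noised update, again assuming a logistic-regression model; all hyperparameters are illustrative, and the paper does not prescribe this particular construction.

```python
# Minimal sketch: one training step combining adversarial training with a
# DP-SGD-style clipped, noised update for logistic regression. All
# hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def combined_step(X, y, w, eps=0.1, clip=1.0, sigma=1.1, lr=0.05):
    p = sigmoid(X @ w)
    X_adv = X + eps * np.sign((p - y)[:, None] * w[None, :])     # FGSM-style inputs
    per_example_grads = (sigmoid(X_adv @ w) - y)[:, None] * X_adv
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip)  # DP clipping
    noisy = clipped.sum(axis=0) + rng.normal(scale=sigma * clip, size=w.shape)
    return w - lr * noisy / len(X)

X = rng.normal(size=(256, 4))
y = (X @ np.array([1.0, -1.0, 0.5, 0.0]) > 0).astype(float)
w = np.zeros(4)
for _ in range(100):
    w = combined_step(X, y, w)
```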
I. Transparency vs. Explainability.
Explainability can help with transparency, as a documented understanding of the AI/ML pipeline can be adapted to provide abridged details suitable for disclosure. Explainability should be targeted toward a particular audience, which may be end users, those affected by decisions made by the system, or regulators. The details of the system should be presented in formats that the target audience can easily understand.
J. Transparency vs. Privacy and Robustness.
Increasing transparency can negatively affect robustness and privacy, as revealing details about an AI/ML system can facilitate the potential re-identification of individuals and targeted adversarial attacks. Opaque systems may appear to be more difficult to analyze and hence more difficult to exploit. However, this parallels the traditional security approach in proprietary (closed-source) software, referred to as “security through obscurity,” and its efficacy is questioned in contrast to open-source software (OSS). It may be preferable to seek a balance between the two extremes of transparency, such as providing partial transparency to trusted parties, with the degree of transparency dependent on the degree of trust.
Transparency can also be hampered through the use of proprietary datasets for training AI/ML systems, where the owner of the dataset may prevent disclosure that the dataset is used or prevent sharing it with third parties.
Between the lines
The paper primarily focuses on two-sided interactions for which there is support in the literature. While there is one three-sided interaction (transparency vs. privacy and robustness), future avenues of research include the exploration of other possible three-sided interactions, such as accuracy vs. robustness vs. fairness, and accuracy vs. privacy vs. fairness.
Recognizing the tensions and other interactions between common aspects of high-level AI ethics principles is an important step towards operationalizing these principles. The catalog presented in this paper can be helpful in raising awareness of the possible interactions between these aspects, as well as facilitating well-supported judgments by the designers and developers of AI/ML systems in addressing them.