🔬 Research Summary by Judy Hanwen Shen, a Computer Science Ph.D. student at Stanford University broadly working on algorithmic fairness, differential privacy, and explainability through the lens of data composition.
[Original paper by Leonard Berrada*, Soham De*, Judy Hanwen Shen*, Jamie Hayes, Robert Stanforth, David Stutz, Pushmeet Kohli, Samuel L. Smith, and Borja Balle]
Overview: In high-stakes settings such as health care, machine learning models should uphold both privacy protections for data contributors and fairness across the subgroups on which they will be deployed. Although prior works have suggested that tradeoffs may exist between accuracy, privacy, and fairness, this paper demonstrates that models fine-tuned with differential privacy can achieve accuracy comparable to that of non-private classifiers. Consequently, we show that privacy-preserving models in this regime do not display greater performance disparities across demographic groups than non-private models.
Introduction
When seeking medical advice, whether online or in a clinic, individuals outside the majority group may be uncertain whether the information they receive is valid for people like them. The ongoing digitalization of health care presents an opportunity to develop algorithms that yield improved outcomes for marginalized subpopulations. In this context, preserving the confidentiality of one’s health records becomes a critical goal, alongside leveraging the predictive capabilities of models trained on population-level records. Ideally, any machine learning model deployed in a health care setting should be accurate, privacy-preserving, and fair.
The holy grail of trustworthy machine learning is achieving societally aligned outcomes alongside excellent model performance. In our work, we challenge received notions about the accuracy and fairness shortcomings of models trained with differential privacy (DP). We introduce a reliable and accurate method for DP fine-tuning of large vision models and show that it can reach the practical performance of previously deployed non-private models. Furthermore, these highly accurate private models exhibit disparities across subpopulations that are no larger than those we observe in non-private models of comparable accuracy.
Key Insights
Training highly accurate models with differential privacy
Differential privacy (DP) is the gold standard for training neural networks while preserving the privacy of individual data points. The technique guarantees that the influence of any single training example on the trained model remains limited and obfuscated. However, because the obfuscation relies on injecting noise, this privacy protection can come at the cost of model accuracy, particularly in modern settings where model parameters are high dimensional. This raises the question of whether privacy protections can be justified at the cost of accuracy in safety-critical domains such as health care.
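For readers unfamiliar with the formal guarantee, the standard (ε, δ) definition of differential privacy (a textbook statement, not specific to this paper) is:

```latex
% A randomized mechanism M is (\varepsilon, \delta)-differentially private if,
% for all pairs of datasets D, D' differing in a single record and all
% measurable sets of outputs S:
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta
```

Smaller ε and δ mean a stronger guarantee; in DP-SGD this is achieved by clipping each example's gradient and adding calibrated Gaussian noise, which is precisely the source of the accuracy cost discussed above.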
Our work introduces practical techniques to close the accuracy gap between private and non-private models on image classification tasks. These techniques include parameter averaging to improve model convergence and using model families without batch normalization, which mixes information across examples in a batch and is therefore ill-suited to per-example gradient clipping. Our results demonstrate that pre-training on publicly available datasets such as ImageNet and then fine-tuning with privacy-preserving methods yields private chest X-ray classifiers whose AUC closely matches that of non-private models.
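The paper's implementation details are not reproduced here, but a minimal PyTorch-style sketch can convey the two ingredients: a DP-SGD update (per-example gradient clipping plus Gaussian noise) applied to a batch-normalization-free backbone, and an averaged copy of the weights used for evaluation. Names such as `dp_sgd_step` and `AveragedWeights` are illustrative, not taken from the authors' codebase.

```python
import copy
import torch

def dp_sgd_step(model, loss_fn, batch, optimizer, clip_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD update: clip each example's gradient, add Gaussian noise, then step."""
    xs, ys = batch
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xs, ys):  # reference per-example loop; vectorized in practice
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = min(1.0, clip_norm / (norm.item() + 1e-6))  # per-example clipping factor
        for s, g in zip(summed, grads):
            s.add_(g, alpha=scale)
    optimizer.zero_grad()
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * noise_multiplier * clip_norm  # Gaussian mechanism
        p.grad = (s + noise) / len(xs)
    optimizer.step()


class AveragedWeights:
    """Exponential moving average of the weights; evaluate this copy, not the raw iterate."""
    def __init__(self, model, decay=0.999):
        self.decay = decay
        self.ema = copy.deepcopy(model).eval()
        for p in self.ema.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        for e, p in zip(self.ema.parameters(), model.parameters()):
            e.mul_(self.decay).add_(p, alpha=1.0 - self.decay)
```

Evaluating the averaged copy rather than the latest noisy iterate smooths out the injected noise and gives a more faithful picture of what the private model can achieve.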
When differential privacy does not necessarily imply worsened disparities
Another challenge of deploying differentially private models is the potential for subgroup disparities introduced by private training. For example, some subgroups defined by class labels or sensitive attributes may experience greater accuracy deterioration than others under private training. In contrast, our work finds that models trained with differential privacy, whether fine-tuned or trained from scratch, exhibit group accuracy disparities similar to those of non-private models at the same accuracy. We first highlight the necessity of evaluating disparities on averaged weights to overcome the higher noise level in models trained with DP-SGD. Second, AUC on chest X-ray classification is not systematically worse for private models than for non-private ones. For the important datasets we examine, tradeoffs between subgroup outcomes and differential privacy can thus be mitigated by training more accurate private models.
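To make the disparity measurement concrete, the sketch below computes AUC per demographic subgroup and the largest gap between any two groups; the function and variable names are hypothetical and do not reproduce the paper's evaluation pipeline.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def subgroup_auc_gap(y_true, y_score, group_ids):
    """Per-subgroup AUC and the largest gap between any two groups (illustrative)."""
    aucs = {g: roc_auc_score(y_true[group_ids == g], y_score[group_ids == g])
            for g in np.unique(group_ids)}
    return aucs, max(aucs.values()) - min(aucs.values())

# Scores should come from the parameter-averaged model; compare, for example:
# aucs_dp, gap_dp = subgroup_auc_gap(labels, scores_private, groups)
# aucs_nonpriv, gap_nonpriv = subgroup_auc_gap(labels, scores_nonprivate, groups)
```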
Between the lines
Differential privacy is often dismissed as impractical for model training due to its perceived impact on accuracy and fairness. Our findings show that it is sometimes possible to achieve very good accuracy, fairness, and privacy simultaneously. While the repercussions of overlooking fairness and privacy may not be immediately evident on common academic benchmarks, such considerations are absolutely essential when training and deploying models on real-world data.
The creation of AI assistive technology that is aligned with human values necessitates a thorough examination of the diverse and often intricate desiderata specific to each use case. While our work specifically investigates the alignment of X-ray classification with privacy and fairness, identifying which values to prioritize across various other practical problems is a ripe area for future research.