Target specification bias, counterfactual prediction, and algorithmic fairness in healthcare

September 16, 2023

🔬 Research Summary by Eran Tal, Canada Research Chair in Data Ethics and Associate Professor of Philosophy at McGill University. He studies the epistemology and ethics of data collection and data use in scientific research, healthcare, and policy.

[Original paper by Eran Tal]


Overview: This paper exposes a hidden and widespread type of bias in healthcare decision-support tools based on supervised ML: target specification bias. This bias stems from the fact that decision-makers, e.g., physicians, typically specify their desired target of prediction differently from the way algorithm designers operationalize it. It cannot be resolved by improvements to the data or the model alone. Instead, tackling target specification bias requires a fundamental shift in how model accuracy is evaluated and reported to decision-makers.


Introduction

Sometimes, machine learning models become good at predicting a variable, but that variable differs from what users care about predicting. A well-known example dates back to the mid-1990s. A neural net trained on health records from Pittsburgh-area hospitals learned to associate asthma with a low risk of death from pneumonia. The association was real: asthmatics who presented with pneumonia received aggressive care that lowered their mortality risk below the general population’s. And yet, for physicians who need to allocate hospital beds, the association reflected a confounder and would have dangerously de-prioritized asthmatics had it not been caught in time. Physicians needed the model to predict mortality risk with all other things being equal, not mortality risk in the real, messy world.
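To make the mismatch concrete, here is a minimal, hypothetical simulation (invented for this summary, not taken from the paper) in which confounded treatment assignment makes the observed asthma–mortality association point in the opposite direction from the "all else equal" risk physicians need:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (illustrative numbers only).
asthma = rng.binomial(1, 0.1, n)            # 10% of patients have asthma
# All else equal, asthmatics face a *higher* baseline mortality risk.
base_risk = 0.05 + 0.03 * asthma
# In the actual world, asthmatics receive aggressive care, which cuts their risk sharply.
aggressive_care = asthma                     # confounded treatment assignment
observed_risk = base_risk * np.where(aggressive_care == 1, 0.3, 1.0)
died = rng.binomial(1, observed_risk)

# What a model trained on actual-world labels would learn:
print("Observed mortality | asthma=1:", died[asthma == 1].mean())   # ~0.024
print("Observed mortality | asthma=0:", died[asthma == 0].mean())   # ~0.050

# The counterfactual quantity physicians care about (equal care for everyone):
print("Risk under equal care | asthma=1:", base_risk[asthma == 1].mean())  # 0.08
print("Risk under equal care | asthma=0:", base_risk[asthma == 0].mean())  # 0.05
```

In this toy setup the observed labels genuinely show asthmatics dying less often, yet the counterfactual risk that should drive bed allocation runs the other way.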

This is an example of target specification bias: a mismatch between the specification of a target variable and its operationalization by an ML model. Commonly mistaken for a transparency problem, target specification bias is an accuracy problem that affects opaque and transparent models alike. It is still widely overlooked in evaluations of model accuracy, largely because of the overly simplistic, ‘label-matching’ conception of accuracy currently prevalent in the ML community. This paper characterizes target specification bias, distinguishes it from other prevalent types of bias in ML, explains how it contributes to inaccuracy, and offers ways of mitigating it.

Key Insights

What is target specification bias? 

Target specification bias is a mismatch between the way decision-makers specify the variable they need to predict and the way this variable is operationalized by the designers and developers of a decision-support tool. The mismatch is often subtle and stems from the fact that decision-makers are typically interested in predicting outcomes under counterfactual rather than actual scenarios.

For example, physicians who make treatment decisions are interested in predicting patients’ health outcomes under a counterfactual scenario where all patients receive the same treatment. By contrast, the model is necessarily trained on data that reflect actual scenarios, in which different populations receive different treatments based on need and availability, as the pneumonia example above shows.

The same holds true for physicians who decide which patients to refer for a diagnostic test. Whether and when a condition is diagnosed partially depends on the judgment of clinicians and on the availability and cost of diagnostic services. For the referring physician, these factors are all confounders. The physician is interested in predicting a patient’s diagnosis in a counterfactual world, where all patients have access to timely and accurate diagnostic tests. 
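As a hypothetical sketch of the referral case (numbers invented for this summary), the recorded diagnosis label can understate the counterfactually specified diagnosis rate whenever access to testing varies across patients:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

has_condition = rng.binomial(1, 0.2, n)   # true disease status (counterfactual target)
# Access to timely, accurate testing varies across patients (cost, availability, judgment).
tested = rng.binomial(1, 0.5, n)
# Observed label: a diagnosis is only recorded if the patient was actually tested.
diagnosed = has_condition * tested

print("Counterfactual target (all patients tested):", has_condition.mean())  # ~0.20
print("Observed label rate (actual world):", diagnosed.mean())               # ~0.10
```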

Although the conceptual distinction between actual and counterfactual variable specification is subtle, the practical consequences of ignoring it can be severe. When left uncorrected, target specification bias leads to overestimating predictive accuracy, inefficient utilization of medical resources, and suboptimal decisions that can harm patients.

How does target specification bias arise? 

There are several misconceptions about how target specification bias arises. Cases of target specification bias are sometimes mistakenly classified as transparency problems. While increased transparency can reveal the presence of target specification bias, this bias affects opaque and intelligible (or ‘explainable’) models alike. Target specification bias also does not result from insufficient, unreliable, incomplete, or unrepresentative data. On the contrary, this type of bias tends to become more pronounced as data quality is improved. This is because the confounding effects that make up this bias are real and not merely data artifacts. For example, the reduced risk for asthmatics of dying from pneumonia in Pittsburgh hospitals is a real effect and not an artifact of data acquisition or data analysis.

The source of target specification bias is the fact that labels in datasets acquired from the actual world are, at best, imperfect operationalizations of the counterfactually specified variables that decision-makers care about. Because they are counterfactual, the values of these variables are not directly accessible from datasets obtained from the actual world. Rather, they must be inferred from data using domain-specific background knowledge.

Target specification bias persists undetected largely due to an overly technical and simplistic conception of accuracy currently prevalent in supervised ML. This ‘label-matching’ conception takes accuracy to strictly track matches (or distances) between predictions and labels. It underlies all commonly used accuracy measures in supervised ML, such as precision, recall, area under the curve, F1 score, and mean squared error. Such metrics neglect the fact that labels – even reliable and representative ones – can be poor benchmarks for assessing the accuracy of counterfactual predictions. Yet counterfactual predictions are the kind decision-makers typically care about. The upshot is that model accuracy is often overestimated and reported as being higher than the model’s performance in the use cases for which decision-makers will actually employ it.
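A hedged illustration of the point, reusing the invented diagnosis setup from the sketch above: a model that reproduces the observed labels perfectly looks flawless under label-matching metrics, yet it recovers only about half of the counterfactually specified target.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical setting (invented numbers): the counterfactual target is the diagnosis
# every patient would receive with full access to testing; the recorded label only
# reflects patients who were actually tested.
has_condition = rng.binomial(1, 0.2, n)      # counterfactually specified target
tested = rng.binomial(1, 0.5, n)             # unequal access to diagnostic tests
observed_label = has_condition * tested      # label available in the dataset

# Best case under label-matching evaluation: the model reproduces the labels exactly.
predictions = observed_label

# Label-matching metrics look perfect...
print(accuracy_score(observed_label, predictions))   # 1.0
print(recall_score(observed_label, predictions))     # 1.0

# ...but against the counterfactually specified target, half the true cases are missed.
print(accuracy_score(has_condition, predictions))    # ~0.90
print(recall_score(has_condition, predictions))      # ~0.50
```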

How can target specification bias be mitigated? 

There is good news: much can be done to mitigate target specification bias. On a conceptual level, alternative conceptions of accuracy are available. Specifically, metrology, the science of measurement, has a long-standing tradition of thinking about accuracy in counterfactual terms. Metrologists construct idealized models of instruments, such as clocks and thermometers, and use these models to evaluate their accuracy. Such models appeal to background causal knowledge to predict an instrument’s indications in the absence of extrinsic influences. The successful standardization and reproducible measurement of physical quantities such as time, length, and mass are largely due to this counterfactual way of thinking about accuracy.
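A toy sketch of the metrological idea (all numbers invented, and greatly simplified relative to real measurement models): an idealized model of a thermometer uses background causal knowledge about a known extrinsic influence to state what the instrument would have indicated in that influence's absence, and accuracy is judged against that counterfactual indication rather than the raw reading.

```python
# Toy idealized model of a thermometer (invented numbers, greatly simplified).
true_temperature = 37.0       # quantity of interest, in degrees Celsius
self_heating_effect = 0.4     # known extrinsic influence from the sensor's own circuitry
noise = 0.05                  # residual random error of this particular reading

# What the instrument actually indicates in the messy, actual world.
raw_indication = true_temperature + self_heating_effect + noise

# Idealized (counterfactual) indication: what the instrument would show in the
# absence of the extrinsic influence, predicted from background causal knowledge.
idealized_indication = raw_indication - self_heating_effect

# Accuracy is assessed against the counterfactual indication, not the raw reading.
error = idealized_indication - true_temperature
print(f"Raw indication:       {raw_indication:.2f} C")
print(f"Idealized indication: {idealized_indication:.2f} C")
print(f"Error vs. true value: {error:+.2f} C")
```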

On a practical level, the paper identifies several lessons supervised ML can learn from metrology about evaluating model accuracy and mitigating target specification bias. These insights can be combined with existing methods of causal modeling that reveal counterfactual probabilities in the data and with methods for extracting counterfactual information from ML models and presenting this information to users. 
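As an illustration of the kind of causal-modeling method alluded to here (a generic, textbook technique chosen for this summary, not the paper's specific proposal), inverse probability weighting reweights observed outcomes to approximate a counterfactual world in which treatment assignment does not depend on the confounder. A minimal sketch with invented numbers:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Hypothetical data-generating process (invented numbers for illustration).
severity = rng.binomial(1, 0.3, n)                   # confounder
p_treat = np.where(severity == 1, 0.8, 0.2)          # sicker patients are treated more often
treated = rng.binomial(1, p_treat)
risk = 0.05 + 0.10 * severity - 0.04 * treated       # treatment lowers risk by 4 points
outcome = rng.binomial(1, risk)

# Naive, actual-world comparison: confounded by severity, so treatment looks harmful.
naive = outcome[treated == 1].mean() - outcome[treated == 0].mean()

# Inverse probability weighting: reweight to mimic a world in which everyone
# had the same chance of receiving treatment.
weights = np.where(treated == 1, 1 / p_treat, 1 / (1 - p_treat))
ipw = (np.average(outcome[treated == 1], weights=weights[treated == 1])
       - np.average(outcome[treated == 0], weights=weights[treated == 0]))

print(f"Naive difference: {naive:+.3f}")   # roughly +0.01 (sign flipped by confounding)
print(f"IPW estimate:     {ipw:+.3f}")     # roughly -0.04 (the counterfactual contrast)
```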

Between the lines

Thinking about model accuracy in more sophisticated and user-oriented ways can be helpful beyond the realm of medicine. Such a shift would mark an important step in the maturity of the ML discipline as a whole, from its current exploratory stage toward producing a body of reliable, reproducible evidence for science and policy. A precondition for this sort of shift is a clearer distinction between the internal and external validity of supervised ML models. Internal validation procedures, such as testing models for under- or over-fitting the training data, are currently treated as sufficient for evaluating a model’s performance: the degree of fit between predictions and data is called ‘accuracy’ and is reported to users as such. This practice neglects external validation procedures, which are required to test model performance in the ‘wild,’ in light of the tool’s intended purpose, typical use cases, typical input data, and its reception by stakeholders.

Accuracy, as reported to users as a measure of overall model performance, is an external validity criterion. It needs to be evaluated relative to users’ specifications and meet reproducibility requirements. Further work is needed to develop a framework for evaluating ML model accuracy in the ‘wild’ and reporting it to users in a manner most relevant to their values and goals.

