Montreal AI Ethics Institute

Democratizing AI ethics literacy


Robustness and Usefulness in AI Explanation Methods

March 15, 2022

🔬 Research summary by Connor Wright, our Partnerships Manager.

[Original paper by Erick Galinkin]


Overview: Given the hype around explainability methods and their promise of insight into black-box algorithms, many have opted in to their use. Yet this work aims to show that their implementation is not always appropriate, as the methods at hand possess some notable downfalls.


Introduction

Explainability methods within the machine learning (ML) space have received much attention, given their potential to provide insight into black-box algorithms. Explanation methods can influence a consumer’s trust in a model and increase the speed at which auditors carry out their work. However, some practitioners rely too heavily on these tools, jeopardising their usefulness. This work focuses specifically on post-hoc explanation methods applied to trained models, reflecting on LIME, SmoothGrad and SHAP. Having explained what these methods entail, the paper evaluates what they offer and their potential downfalls.

Key Insights

Local Interpretable Model-agnostic Explanations (LIME)

LIME aims to explain the outcomes of any classifier, no matter its “complexity” or “linearity” (p. 2). It does this by deriving an interpretable representation that behaves similarly to the actual model, selected from a set of candidate interpretable representations that LIME generates. LIME then perturbs (adds noise to) samples around each instance of interest and labels them with the model’s predictions. The complexity of the model limits the number of representations generated.
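The core idea can be sketched in a few lines. The `lime_sketch` function below is a hypothetical, from-scratch illustration (not the actual LIME library): perturb around an instance, label the perturbations with the black-box model, then fit a proximity-weighted linear surrogate whose coefficients act as local feature attributions. The kernel width and noise scale are illustrative choices.

```python
import numpy as np

def lime_sketch(black_box, x, n_samples=500, kernel_width=0.75, seed=0):
    """Toy illustration of LIME's core idea: fit a proximity-weighted
    linear surrogate to the black-box model around the instance x."""
    rng = np.random.default_rng(seed)
    # Perturb (add Gaussian noise to) samples around the instance of interest.
    Z = x + rng.normal(scale=0.1, size=(n_samples, x.size))
    y = np.array([black_box(z) for z in Z])        # label each perturbation
    # Weight samples by proximity to x (exponential kernel).
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / kernel_width ** 2)
    # Weighted least squares for the local linear coefficients (+ intercept).
    A = np.hstack([Z, np.ones((n_samples, 1))])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(sw * A, np.sqrt(w) * y, rcond=None)
    return coef[:-1]                               # per-feature attributions

# Example: a nonlinear black box. Near x = (1, 1) the local slope of
# feature 0 is ~6 and of feature 1 is 0.5, so feature 0 should dominate.
f = lambda z: 3 * z[0] ** 2 + 0.5 * z[1]
attributions = lime_sketch(f, np.array([1.0, 1.0]))
```

The surrogate never sees the model’s internals, only its predictions on perturbed samples — which is what makes the approach model-agnostic.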

SmoothGrad

SmoothGrad is an extension to any gradient-based explainability method, rather than being one itself. Mainly used in convolutional neural networks in image recognition, it was introduced to solve previous problems associated with gradient methods. It seeks to generate multiple versions of an image through perturbation before averaging them together. This helps smooth over any gradient fluctuations between images.
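As a rough illustration of that averaging step (a sketch, not the original implementation), the wrapper below applies SmoothGrad to a toy one-dimensional gradient function whose raw values fluctuate; the underlying model and noise level are invented for the example:

```python
import numpy as np

def smoothgrad(grad_fn, x, n=200, sigma=0.15, seed=0):
    """SmoothGrad wraps ANY gradient-based attribution method: average
    the gradient over n noisy copies of the input to smooth fluctuations."""
    rng = np.random.default_rng(seed)
    grads = [grad_fn(x + rng.normal(scale=sigma, size=x.shape))
             for _ in range(n)]
    return np.mean(grads, axis=0)

# Toy "model" gradient: the underlying trend 2*x plus a rapid ripple
# that makes the raw gradient jittery from point to point.
grad_f = lambda x: 2 * x + 0.5 * np.cos(40 * x)

x0 = np.array([1.0])
raw = grad_f(x0)                 # distorted by the ripple term
smooth = smoothgrad(grad_f, x0)  # close to the underlying trend, 2*x0
```

Averaging over noisy copies cancels the high-frequency ripple while preserving the underlying slope — the same effect SmoothGrad has on noisy saliency maps.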

Shapley Additive Explanations (SHAP)

SHAP uses game theory to understand model predictions. It employs Shapley values to assign each input feature a contribution to a given prediction.
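A minimal sketch of the underlying Shapley computation, assuming a toy additive value function (the real SHAP package uses efficient approximations such as KernelSHAP and TreeSHAP rather than this exponential enumeration over coalitions):

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values for a small game: for each feature i, average
    its marginal contribution over all coalitions S of the other features."""
    N = range(n_features)
    phi = [0.0] * n_features
    for i in N:
        others = [j for j in N if j != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                # Shapley weight: |S|! * (n - |S| - 1)! / n!
                weight = (factorial(len(S)) * factorial(n_features - len(S) - 1)
                          / factorial(n_features))
                phi[i] += weight * (value_fn(set(S) | {i}) - value_fn(set(S)))
    return phi

# Toy additive value function: each feature contributes independently,
# so the Shapley values recover each feature's own contribution exactly.
contrib = {0: 2.0, 1: -1.0, 2: 0.5}
v = lambda S: sum(contrib[j] for j in S)
phi = shapley_values(v, 3)   # → [2.0, -1.0, 0.5]
```

By construction the values are efficient: they sum to the total payout of the full coalition, which is what lets SHAP decompose a prediction into per-feature contributions.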

The three methods fall under two different methodologies: perturbation (add noise to inputs and assess small changes in the model space) and gradient (use the gradients produced by different inputs to determine the model outcome). Their evaluation was as follows (p. 4):

Criterion 1: Local explanations

The local-explanations criterion captures an explanation method’s ability to decipher and explain individual examples from the data. Its importance lies in the fact that most of the adverse effects felt by individuals come from individual predictions made by the model. All three explanation methods offer this feature.

Criterion 2: Visualize

Explanation methods that produce accessible, quickly comprehensible visualisations are hugely valuable. SmoothGrad creates heatmaps in conjunction with the gradient methods used in convolutional neural networks. SHAP comes with a Python package that generates force plots showing how each feature affects the prediction; applied to the whole dataset, these display how a particular feature affects the model overall.

However, LIME does not offer this feature. Only the attributes of the more interpretable surrogate model can be visualised, offering no direct insight into the larger model at hand.

Criterion 3: Model-agnostic

Model-agnostic methods aim to handle the different formats in which input data can come (whether video, text or image) regardless of the underlying model. SHAP and LIME both offer this, acting through their interpretable surrogate models. SmoothGrad, however, requires differentiable models rather than surrogate representations.

Weakness of explainability methods

Galinkin points out the core question that explanation methods ought to answer: to whom, and under what circumstances, is the model interpretable? Two types of factors affect a model’s interpretability:

Human factors

Reaching for fairness in ML often involves a practitioner visualising an ideal world, which, given its subjectivity, can be to the detriment of specific populations. Should they use general notions of fairness, this could leave everyone worse off. For example, an ideal world in which a facial recognition system doesn’t account for people’s skin colour may sound fair in the abstract, but in practice it results in recurrent misidentifications.

The robustness of explainer methods

Perturbation methods are worse than gradient methods at explaining small changes to the data, meaning they are less robust. Nevertheless, practitioners are sometimes hooked by the pleasing visualisations these methods produce rather than by other forms of representation that provide genuine insight into the model. In this sense, practitioners can be guilty of over-relying on and over-using these explanation methods when other methods offer more insightful representations. Sometimes, a linear model fits just as well.

Between the lines

I think this paper does well to highlight the extent to which practitioners can be blinded by the hype surrounding different methodologies. At times, despite a specific method offering pleasant representations, its employment may not be necessary. Instead of focusing on what an explainer method can offer, perhaps we should first weigh it against other methods to see what work it is actually doing for our models. There is a difference between being useful and being appropriate in this sense: while explainer methods may be useful, they may not always be the appropriate choice.

Want quick summaries of the latest research & reporting in AI ethics delivered to your inbox? Subscribe to the AI Ethics Brief. We publish bi-weekly.



© 2025 Montreal AI Ethics Institute. This work is licensed under a Creative Commons Attribution 4.0 International License.