Robustness and Usefulness in AI Explanation Methods

March 15, 2022

🔬 Research summary by Connor Wright, our Partnerships Manager.

[Original paper by Erick Galinkin]


Overview: Explainability methods have received considerable hype for their promise to offer insights into black-box algorithms, and many practitioners have opted in to their use. Yet this work shows that their implementation is not always appropriate, and that the methods at hand have some notable downfalls.


Introduction

Explainability methods within the machine learning (ML) space have received much attention, given their potential to provide insight into black-box algorithms. Explanation methods can influence a consumer’s trust in a model and increase the speed at which auditors carry out their work. However, some practitioners rely too heavily on these tools, jeopardising their usefulness. This work focuses specifically on post-hoc explanation methods applied to trained models, reflecting on LIME, SmoothGrad and SHAP. After explaining what these methods entail, the paper evaluates what they offer and where they fall short.

Key Insights

Local Interpretable Model-agnostic Explanations (LIME)

LIME aims to explain the outcomes of any classifier, no matter its “complexity” or “linearity” (p. 2). It does this by deriving an interpretable representation that behaves similarly to the actual model around the prediction of interest. The representation is chosen from a selection of candidate interpretable representations generated by LIME, which perturbs (adds noise around) each instance of interest and labels the perturbed samples with the original model. The complexity of the model limits the number of representations that can be generated.
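To make the procedure concrete, here is a minimal sketch of the LIME idea for tabular data: perturb around an instance, label the perturbed points with the black-box model, weight them by proximity, and fit an interpretable linear surrogate. The function name, noise scale and kernel width below are illustrative assumptions rather than the official lime package, which is what one would use in practice.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_like_explanation(predict_fn, x, n_samples=1000, noise_scale=0.1, kernel_width=0.75):
    """Sketch of a LIME-style local explanation for one instance x (1-D array)."""
    rng = np.random.default_rng(0)
    # 1. Perturb: sample points in a neighbourhood of the instance of interest.
    Z = x + rng.normal(scale=noise_scale, size=(n_samples, x.shape[0]))
    # 2. Label: query the black-box model on the perturbed samples.
    y = predict_fn(Z)
    # 3. Weight: samples closer to x matter more for the local fit.
    distances = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # 4. Fit an interpretable (linear) surrogate; its coefficients act as the
    #    per-feature local explanation.
    surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)
    return surrogate.coef_
```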

SmoothGrad

SmoothGrad is an extension to any gradient-based explainability method, rather than being one itself. Mainly used with convolutional neural networks in image recognition, it was introduced to solve previous problems associated with gradient methods. It generates multiple noisy versions of an image through perturbation and averages the resulting gradient maps, smoothing over sharp fluctuations in the gradient.
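A minimal PyTorch sketch of the SmoothGrad idea follows: take several noisy copies of an input image, compute the gradient of the target class score for each, and average the resulting maps. The model, image tensor, sample count and noise level are placeholder assumptions; in practice the noise level is usually chosen relative to the input’s value range.

```python
import torch

def smoothgrad_saliency(model, image, target_class, n_samples=25, noise_std=0.15):
    """Average the input gradient over noisy copies of `image` (shape CxHxW)."""
    model.eval()
    accumulated = torch.zeros_like(image)
    for _ in range(n_samples):
        # Perturb the input and track gradients with respect to it.
        noisy = (image + noise_std * torch.randn_like(image)).requires_grad_(True)
        score = model(noisy.unsqueeze(0))[0, target_class]  # class score for this copy
        score.backward()
        accumulated += noisy.grad
    # The averaged (absolute) gradient map is the smoothed saliency heatmap.
    return (accumulated / n_samples).abs()
```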

Shapley Additive Explanations (SHAP)

SHAP uses game theory to understand model predictions. It employs Shapley values to assign a contribution to each feature of a prediction.
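The Shapley values underlying SHAP can be computed exactly on toy problems by enumerating feature coalitions, as in the sketch below. Here `value_fn` is a hypothetical coalition value function (for example, the model’s expected prediction when only the features in the coalition are known); real SHAP implementations approximate these values efficiently rather than enumerating all 2^n coalitions.

```python
from itertools import combinations
from math import factorial

def exact_shapley_values(value_fn, n_features):
    """Each feature's Shapley value is its marginal contribution to the coalition
    'game', averaged with the classic Shapley weights over all coalitions of the
    remaining features."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            for coalition in combinations(others, size):
                S = frozenset(coalition)
                weight = factorial(len(S)) * factorial(n_features - len(S) - 1) / factorial(n_features)
                # Marginal contribution of feature i when added to coalition S.
                phi[i] += weight * (value_fn(S | {i}) - value_fn(S))
    return phi
```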

The three methods fall under two different methodologies: perturbation (adding noise to inputs and assessing small changes in the model space) and gradient (using the gradients produced by different inputs to determine the model outcome). They were evaluated against the following criteria (p. 4):

Criterion 1: Local explanations

The local explanation criterion captures the ability of explanation methods to decipher and understand individual examples from the data. Its importance lies in the fact that most of the adverse effects felt by individuals come from individual predictions made by the model. All three explanation methods offer this feature.

Criterion 2: Visualize

Explanation methods that produce accessible and quickly comprehensible visualisations are hugely valuable. SmoothGrad creates heatmaps in conjunction with the gradient methods used in convolutional neural networks. SHAP comes with a Python package that generates force plots showing how each feature affects a prediction; applied to the whole dataset, it displays how a particular feature affects predictions across the entire dataset.

However, LIME does not offer this feature. Only the attributes of the more interpretable model generated can be visualized, rather than any insight into the larger model at hand.
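For illustration, the force-plot workflow described above might look like the sketch below, assuming the shap and xgboost packages are installed and using a recent shap release that bundles the California housing demo dataset.

```python
import shap
import xgboost

# Fit any supported model; a gradient-boosted tree regressor is used here.
X, y = shap.datasets.california()   # demo data bundled with recent shap releases
model = xgboost.XGBRegressor().fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Force plot for one prediction: each feature pushes the output above or
# below the model's baseline (expected) value.
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0])

# Stacked over many rows, the same plot shows how a feature behaves
# across the whole dataset.
shap.force_plot(explainer.expected_value, shap_values[:500], X.iloc[:500])
```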

Criterion 3: Model-agnostic

Model-agnostic methods aim to handle the different formats in which input data can come (whether video, text or image), regardless of the underlying model. SHAP and LIME both offer this, operating through their interpretable representations. SmoothGrad, however, requires differentiable models rather than interpretable representations, so it is not fully model-agnostic.

Weaknesses of explainability methods

Galinkin points out that the core question which explanation methods ought to solve is the following: to whom and under what circumstances is the model interpretable? Two types of factors affect a model’s interpretability:

Human factors

Reaching for fairness in ML often involves a practitioner envisioning an ideal world, which, given its subjectivity, can be to the detriment of specific populations. Should they use overly general notions of fairness, this could leave everyone worse off. For example, an ideal world in which a facial recognition system does not account for people’s skin colour can result in recurrent misidentifications.

The robustness of explainer methods

Perturbation methods are worse than gradient methods at explaining small changes to the data output, meaning they are less robust. Nevertheless, practitioners are sometimes drawn in by the pleasing representations these methods produce rather than by other forms of representation that provide insight into the model. In this sense, practitioners can be guilty of over-relying on and over-using these explanation methods when other approaches offer more insightful representations. Sometimes, a linear model fits just as well.
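As a quick illustration of that last point, the sketch below (using scikit-learn and illustrative model choices) compares a directly interpretable linear model against a black-box ensemble; when the performance gap is negligible, the simpler model may make a post-hoc explainer unnecessary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A directly interpretable linear model...
linear = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)
# ...versus a black box that would need a post-hoc explainer.
forest = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

print("linear model accuracy:", linear.score(X_te, y_te))
print("random forest accuracy:", forest.score(X_te, y_te))
# If the gap is negligible, the interpretable model may be the more appropriate choice.
```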

Between the lines

I think this paper does really well to highlight the extent to which practitioners can be blinded by the hype surrounding different methodologies. At times, despite a specific method offering pleasant representations, its employment may not be necessary. Instead of focusing on what the explainer method can offer, perhaps we should first weigh it against other methods to see what work it is actually doing for our models. There is a difference between being useful and being appropriate in this sense: while explainer methods may be useful, they may not always be the appropriate choice.

