Montreal AI Ethics Institute

Research Summary: Explaining and Harnessing Adversarial Examples

June 28, 2020

Summary contributed by Shannon Egan, Research Fellow at Building 21 and pursuing a master’s in physics at UBC.

*Author & link to original paper at the bottom.


Click here for the FULL summary in PDF form

(Short-form summary below)

A bemusing weakness of many supervised machine learning (ML) models, including neural networks (NNs), is their vulnerability to adversarial examples (AEs).  AEs are inputs generated by adding a small perturbation to a correctly classified input, causing the model to misclassify the resulting AE with high confidence.  Goodfellow et al. propose a linear explanation of AEs, in which the vulnerability of ML models to AEs is considered a by-product of their linear behaviour and high-dimensional feature space.  In other words, a small perturbation of an input can alter its classification because the change in NN activation (as a result of the perturbation) scales with the dimensionality of the input vector: many tiny per-feature changes add up to one large change.
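The scaling argument can be illustrated with a minimal sketch for a single linear unit with weights w. It assumes the worst-case perturbation eta = eps * sign(w), so no input feature changes by more than eps. The function name `activation_shift`, the uniform random weights, and the value of `eps` are illustrative choices, not taken from the paper.

```python
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def activation_shift(w, eps):
    """Change in the activation w.x when x is perturbed by eta = eps * sign(w).

    Each feature moves by at most eps, yet the shift equals
    eps * sum(|w_i|), which grows with the number of dimensions.
    """
    eta = [eps * sign(wi) for wi in w]
    return sum(wi * ei for wi, ei in zip(w, eta))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    eps = 0.01              # per-feature perturbation budget (assumed value)
    for n in (10, 100, 1_000, 10_000):
        w = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        print(n, activation_shift(w, eps))
```

With weights of similar average magnitude, the printed shift should grow roughly in proportion to n even though the per-feature change stays fixed at eps.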

Identifying ways to effectively handle AEs is of interest for problems like image classification, where the input consists of intensity data for many thousands of pixels.  A method of generating AEs called the “fast gradient sign method” badly fools a maxout network, leading to an 89.4% error rate on a perturbed MNIST test set.  The authors propose an “adversarial training” scheme for NNs, in which an adversarial term is added to the loss function during training. 
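As a rough illustration of the fast gradient sign method, the sketch below applies the update x_adv = x + eps * sign(grad_x J) to a toy logistic-regression classifier rather than to the maxout network used in the paper. The weights, the input, and the helper names (`predict`, `fgsm`) are hypothetical and chosen only for this example. For logistic regression with cross-entropy loss, the gradient of the loss with respect to the input is (p - y) * w, which the code uses directly.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(w, b, x, y, eps):
    """Fast gradient sign perturbation of x for a logistic-regression model.

    The input gradient of the cross-entropy loss is (p - y) * w, so each
    feature is moved by eps in the direction that increases the loss.
    """
    p = predict(w, b, x)
    grad_x = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]


# Toy 100-dimensional model and an input it classifies as class 1.
w = [0.3, -0.2] * 50
b = 0.0
x = [0.1 if wi > 0 else -0.1 for wi in w]
y = 1

x_adv = fgsm(w, b, x, y, eps=0.25)

if __name__ == "__main__":
    print("clean prediction:      ", predict(w, b, x))
    print("adversarial prediction:", predict(w, b, x_adv))
```

In this toy setting no feature moves by more than 0.25, yet the predicted probability of the true class drops from above 0.5 to below it, so the perturbed input is misclassified.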

Adversarial training dramatically reduces the error rate of the same maxout network on AEs generated by the fast gradient sign method, to 17.9%. The linear interpretation of adversarial examples thus suggests an approach to adversarial training that improves a model’s ability to classify AEs, and it helps explain properties of AE classification that the previously proposed nonlinearity and overfitting hypotheses do not. 
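The adversarial term in the loss can be sketched as a weighted sum of the ordinary loss and the loss on a fast-gradient-sign perturbation of the same input: alpha * J(x, y) + (1 - alpha) * J(x_adv, y). The example below again assumes a toy logistic-regression model in place of the maxout network. The default weighting `alpha=0.5`, the model parameters, and the function names are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def loss(w, b, x, y):
    """Cross-entropy loss of a logistic-regression model on one example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def fgsm(w, b, x, y, eps):
    """Perturb x by eps in the sign of the input gradient, (p - y) * w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def adversarial_loss(w, b, x, y, eps, alpha=0.5):
    """Mix the clean loss with the loss on the perturbed input."""
    x_adv = fgsm(w, b, x, y, eps)
    return alpha * loss(w, b, x, y) + (1 - alpha) * loss(w, b, x_adv, y)


# Hypothetical model parameters and a single training example.
w = [1.0, -2.0, 0.5]
b = 0.1
x = [0.2, -0.4, 0.6]
y = 1

if __name__ == "__main__":
    print("clean loss:      ", loss(w, b, x, y))
    print("adversarial loss:", adversarial_loss(w, b, x, y, eps=0.1))
```

Minimising this mixed objective penalises the model for being wrong on the perturbed input as well as on the original one, which is the intuition behind the improved robustness reported above.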


Original paper by Ian J. Goodfellow, Jonathan Shlens and Christian Szegedy: https://arxiv.org/abs/1412.6572

  • © 2025 MONTREAL AI ETHICS INSTITUTE.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.