
Research summary: Evasion Attacks Against Machine Learning at Test Time

June 8, 2020

Summary contributed by Erick Galinkin (@ErickGalinkin), Principal AI Researcher at Rapid7

(Authors of full paper & link at the bottom)


Machine learning adoption is widespread, and in the field of security, applications such as spam filtering, malware detection, and intrusion detection increasingly rely on machine learning techniques. Since these environments are naturally adversarial, defenders cannot assume that the underlying data distributions are stationary. Instead, machine learning practitioners in the security domain must adopt paradigms from cryptography and security engineering to deal with these systems in adversarial settings.

Previously, game-theoretic approaches such as min-max formulations and Nash equilibria have been used to model attack scenarios. However, realistic constraints are far more complex than these frameworks allow, so this work instead examines how classification performance degrades under attack. That, in turn, helps practitioners design algorithms that detect what they want even when attackers actively try to reduce the ability to classify examples correctly. Specifically, this work considers attacks on classifiers whose discriminant functions are not necessarily linear or convex.

To simulate attacks, two strategies are undertaken:

  1. “Perfect Knowledge” – This is a conventional “white box” attack in which attackers have perfect knowledge of the feature space, the type of classifier, the trained model itself (including its parameters), and the training data, and can transform attack points in the test data within a distance of dₘₐₓ.
  2. “Limited Knowledge” – In this “grey box” attack, the adversary still knows the classifier type and feature space but cannot directly compute the discriminant function g(x). Instead, they must learn a surrogate function from data drawn not from the training set but from the same underlying distribution; a rough sketch of this surrogate setup follows the list.
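
As an illustration of the limited-knowledge setting, the sketch below fits a surrogate SVM on an independent sample drawn from the same (synthetic) distribution as the defender’s training data and uses its decision function as a stand-in for g(x). The synthetic data, classifier choice, and sample sizes are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

def sample_distribution(n):
    # Stand-in for the (unknown) underlying data distribution; purely illustrative.
    X = rng.normal(size=(n, 2))
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # +1 = malicious, -1 = benign
    return X, y

# The defender's classifier, trained on data the attacker never sees.
X_train, y_train = sample_distribution(500)
target = SVC(kernel="linear").fit(X_train, y_train)

# The attacker's surrogate: same classifier type and feature space,
# but fit on an independent sample from the same distribution.
X_surr, y_surr = sample_distribution(100)
surrogate = SVC(kernel="linear").fit(X_surr, y_surr)

# The surrogate's decision function stands in for the unknown g(x).
x = X_surr[:1]
print("surrogate g_hat(x):", surrogate.decision_function(x)[0])
print("true g(x):         ", target.decision_function(x)[0])
```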

The attacker’s strategy is to minimize the discriminant function g(x), or the corresponding surrogate function in the limited-knowledge case. To overcome failure cases of gradient descent-based approaches, a kernel density estimator is introduced that penalizes attack points lying in regions where the target (legitimate) class has low density. This component is known as “mimicry” and is weighted by λ, a trade-off parameter. When λ is 0, no mimicry is used; as λ increases, the attack sample becomes more and more similar to the target class. In the case of images, this can make the attack sample unrecognizable to humans.
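
To make the attack loop concrete, here is a minimal sketch of gradient descent on the combined objective g(x) − λ·p̂(x), where p̂ is a kernel density estimate of the legitimate class, with the modified point projected back into a ball of radius dₘₐₓ around the original sample. The synthetic data, the finite-difference gradient, and the parameter values are assumptions for illustration and are not taken from the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KernelDensity

# Synthetic stand-in data: +1 = malicious, -1 = benign (assumption, not from the paper).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
y_train = np.where(X_train[:, 0] + X_train[:, 1] > 0, 1, -1)

# Classifier whose discriminant g(x) the attacker descends (true model or surrogate).
clf = SVC(kernel="rbf", gamma=1.0).fit(X_train, y_train)

# Kernel density estimate of the benign (target) class: the "mimicry" term.
kde = KernelDensity(bandwidth=0.5).fit(X_train[y_train == -1])

def attack_objective(x, lam):
    """g(x) minus lambda times the estimated benign-class density at x."""
    g = clf.decision_function(x.reshape(1, -1))[0]
    density = np.exp(kde.score_samples(x.reshape(1, -1))[0])
    return g - lam * density

def numerical_gradient(f, x, eps=1e-4):
    # Finite-difference gradient; the paper derives analytic gradients instead.
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return grad

def evade(x0, lam=0.5, step=0.1, d_max=2.0, n_iter=100):
    """Descend the attack objective while keeping ||x - x0|| <= d_max."""
    x = x0.copy()
    for _ in range(n_iter):
        x = x - step * numerical_gradient(lambda z: attack_objective(z, lam), x)
        # Project back into the feasible ball of radius d_max around x0.
        delta = x - x0
        norm = np.linalg.norm(delta)
        if norm > d_max:
            x = x0 + delta * (d_max / norm)
    return x

x0 = X_train[y_train == 1][0]
x_attack = evade(x0)
print("g(x) before:", clf.decision_function(x0.reshape(1, -1))[0])
print("g(x) after: ", clf.decision_function(x_attack.reshape(1, -1))[0])
```

The finite-difference gradient here is only a stand-in to keep the sketch self-contained; for the classifiers studied in the paper, the gradients of both the discriminant function and the density term can be computed analytically.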

The first “toy” example is MNIST, where an image that is obviously a “3” to human observers is reliably misclassified as the target class “7” by a support vector machine.

The task of discriminating between malicious and benign PDF files was also addressed, relying on the ease of inserting new objects into a PDF file as a way of controlling dₘₐₓ. For the limited-knowledge case, a surrogate dataset 20% of the size of the training data was used. For SVMs with both linear and RBF kernels, both perfect-knowledge and limited-knowledge attacks were highly successful with and without mimicry, in as few as 5 modifications. For the neural network classifiers, the attacks without mimicry were not very successful, though the perfect-knowledge attacks with mimicry were highly successful.

The authors suggest many avenues for further research, including using the mimicry term as a search heuristic; building small but representative sets of surrogate data; and using ensemble techniques such as bagging or random subspace methods to train several classifiers.


Original paper by Battista Biggio, Igino Corona, Davide Maiorca, Blaine Nelson, Nedim Srndic, Pavel Laskov, Giorgio Giacinto, and Fabio Roli: https://arxiv.org/abs/1708.06131 
