Breaking Your Neural Network with Adversarial Examples

Written by Kenny Song (@helloksong), co-founder of Citadel AI.

Fundamentally, a machine learning model is just a software program: it takes an input, steps through a series of computations, and produces an output. And like all software, it has bugs and vulnerabilities.

One prominent bug – and security vulnerability – in current machine learning systems is the existence of adversarial examples. An attacker can carefully craft an input to the system to make it predict anything the attacker wants.

For example, by tweaking a few pixels in the source image, we can make a neural network think a “Stop” sign is a “120 km/hr” sign, with 99.9% confidence.
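To make the idea concrete, here is a minimal sketch of one well-known attack from the research literature, the fast gradient sign method (FGSM), applied to a toy logistic regression classifier. It is not necessarily the method behind the stop sign example, and the model weights, input values, and epsilon are all made up for illustration. The attack nudges every input feature by a small, fixed amount in the direction that increases the model's loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability that input x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_perturb(weights, bias, x, true_label, epsilon):
    """Shift each feature by +/- epsilon in the direction that raises the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict(weights, bias, x)
    grad = [(p - true_label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (assumed values, not from a real system).
weights = [2.0, -3.0, 1.5, -1.0]
bias = 0.1
x = [0.6, 0.2, 0.5, 0.3]   # think of these as four pixel intensities
epsilon = 0.2              # maximum change allowed per feature

x_adv = fgsm_perturb(weights, bias, x, true_label=1, epsilon=epsilon)

print("original prediction:   ", predict(weights, bias, x))
print("adversarial prediction:", predict(weights, bias, x_adv))
```

In this toy setup, the original input is classified as class 1 and the perturbed input as class 0, even though no feature moved by more than epsilon. Real attacks on image classifiers follow the same principle, with the gradient computed through the full network.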

Beyond misclassifying street signs, attackers could use this to:

Impersonate others in facial recognition systems
Bypass content moderation and spam filters in social networks
Inject adversarial bytes into malware to bypass antivirus systems

This problem is well-known in the academic community, with thousands of published papers. Yet few practitioners invest resources to defend their ML systems against these attacks. This is partially a visibility problem – most of this knowledge is locked inside research literature.

To increase awareness of these risks, I created adversarial.js, a library of adversarial attacks in JavaScript. It has an interactive demo that generates adversarial examples in your browser. No installation, no manual, just open the webpage and start playing.

Hopefully, by showcasing these attacks in an easy-to-understand way, we can help others discover this failure mode of machine learning. In particular, I hope it motivates practitioners and real-world system owners to consider these risks and defenses.

What are the defenses? There are several proposals, such as adversarial training or admission control. Some are implemented in open-source libraries such as CleverHans, Foolbox, and ART (the Adversarial Robustness Toolbox). However, no method is universal and many have proven ineffective, so work with an expert to invest in your defenses appropriately.
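As a rough illustration of adversarial training, the sketch below trains a toy logistic regression model on each clean example together with an adversarially perturbed copy of it. The perturbation uses the fast gradient sign method, and the dataset, epsilon, learning rate, and epoch count are all assumptions chosen for illustration. It does not use the API of any of the libraries named above, and it is a teaching example rather than a production defense.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability that input x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Perturb x by eps per feature in the direction that raises the loss."""
    p = predict(w, b, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]

def sgd_step(w, b, x, y, lr):
    """One gradient descent step on the cross-entropy loss for (x, y)."""
    err = predict(w, b, x) - y
    return [wi - lr * err * xi for wi, xi in zip(w, x)], b - lr * err

def adversarial_train(data, eps, lr=0.1, epochs=50):
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            # Craft the attack against the current model, then train on
            # both the clean example and its adversarial copy.
            x_adv = fgsm(w, b, x, y, eps)
            w, b = sgd_step(w, b, x, y, lr)
            w, b = sgd_step(w, b, x_adv, y, lr)
    return w, b

def accuracy(w, b, data, eps=0.0):
    """Accuracy on data, optionally after attacking each input with FGSM."""
    correct = 0
    for x, y in data:
        x_eval = fgsm(w, b, x, y, eps) if eps > 0 else x
        correct += int((predict(w, b, x_eval) > 0.5) == (y == 1))
    return correct / len(data)

# Hypothetical toy dataset: class 1 near (1, 1), class 0 near (-1, -1).
offsets = [-0.2, 0.0, 0.2]
data = []
for dx in offsets:
    for dy in offsets:
        data.append(([1 + dx, 1 + dy], 1))
        data.append(([-1 - dx, -1 - dy], 0))

w, b = adversarial_train(data, eps=0.3)
clean_acc = accuracy(w, b, data)
robust_acc = accuracy(w, b, data, eps=0.3)
print("clean accuracy:      ", clean_acc)
print("adversarial accuracy:", robust_acc)
```

Measuring accuracy under attack alongside clean accuracy is the important habit here. A defense that looks fine on clean inputs can still fail against perturbed ones, and robustness to one attack at one epsilon does not imply robustness to others.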

To learn more about adversarial examples, check out the library FAQ, or get in touch with the author.
