Research summary: What does it mean for ML to be trustworthy?

July 27, 2020

Summary contributed by Connor Wright, a third-year Philosophy student at the University of Exeter.

Link to original source + author at the bottom.


Mini-summary: With the world increasingly governed by machine learning algorithms, how can we make sure this process can be trusted? Nicolas Papernot’s 33-minute video presents his group’s findings on how to do just that. Ranging over robustness, Lp norms, differential privacy, and deepfakes, the video focuses on two approaches to making ML trustworthy: admission control at test time, and model governance. Both are considered at length and proposed as ways to make ML more trustworthy through the privacy improvements they bring, though neither is foolproof. The talk ends with a thought-provoking conclusion on aligning ML with human norms.


Full summary:

In this video, Nicolas Papernot presents his group’s work on making ML more trustworthy, touching on Lp norms, differential privacy, admission control at test time, model governance, and deepfakes. These topics are treated as sections of this summary, ending with the conclusions Papernot draws from the research.

Trustworthy ML:

To determine how to make ML more trustworthy, the group first needed to establish what trustworthy ML would look like. One ingredient is making a model robust to the threat of adversarial examples. Here, Lp norms were used: the model is constrained to behave as a constant predictor inside an Lp ball around each input, making it less sensitive to small perturbations. The Lp norms also suggested a new way of detecting adversarial examples, since such inputs exploit the excessive invariance that this constraint builds into the model, and that exploitation can itself be detected more clearly. The question, then, is whether this method can be used to better detect threats and to train models that are robust against them in the future.
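
To make the Lp-norm threat model concrete, here is a minimal sketch, in PyTorch, of crafting a perturbation confined to an L-infinity ball around an input: a generic FGSM-style step, not the specific construction or detection method from the presentation, with `model`, `x`, `y`, and `epsilon` as placeholder names.

```python
# Minimal sketch: perturbing an input within an L-infinity ball of radius epsilon
# (an FGSM-style step). Illustrates the Lp-norm threat model only, not the
# detection method described in the talk.
import torch
import torch.nn.functional as F

def linf_adversarial_example(model, x, y, epsilon=0.03):
    """Return a copy of x perturbed to increase the model's loss on label y."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the sign of the gradient; the perturbation's L-infinity norm is
    # exactly epsilon, so the result stays inside the Lp (p = infinity) ball.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()  # assumes inputs scaled to [0, 1]
```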

This is answered by framing the question as an arms race. Traditional computer security has treated the problem like locking up a house: you lock the door to keep intruders out, then weigh up also locking the windows in case a bear breaks through, then worry about a hawk descending the chimney and consider sealing that too, and so on, creating an arms race against each new threat. Instead, using Lp norms to increase the robustness of the model, so that the intruders, bears, and hawks are detected in the first place, could be a way forward that saves time and money and increases trust in ML.

Privacy:

One way to increase trust in a model is through privacy. ML lives on data, and a data subject may want their data kept private even once it has helped train the model. The group therefore adopted “differential privacy” as their working definition, realised through their Private Aggregation of Teacher Ensembles (PATE) method. Instead of training one model on the whole dataset, the data is split into disjoint partitions and a “teacher” model is trained on each. Because the partitions do not overlap, each teacher is trained independently to solve the same task; the teachers’ predictions are then aggregated, with noise added, so that the final prediction is more private. Any individual’s data appears in only one partition and influences only one teacher, so it is very unlikely to sway any given prediction. The data is thus protected more effectively, because it has less impact on the output, helping to align ML’s version of privacy with the human norms that go by the same name.
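
As a rough illustration of the aggregation step, the sketch below adds Laplace noise to the teachers’ vote counts before taking the most-voted label. It is a simplification of PATE rather than the authors’ implementation, and the noise scale `1/gamma` is illustrative.

```python
# Minimal sketch of PATE-style noisy aggregation: each teacher, trained on a
# disjoint partition of the data, votes for a label; Laplace noise is added to
# the vote counts before taking the argmax, so no single training point (which
# influences only one teacher) can easily swing the answer.
import numpy as np

def noisy_aggregate(teacher_predictions, num_classes, gamma=0.1, rng=None):
    """teacher_predictions: integer class labels, one per teacher."""
    if rng is None:
        rng = np.random.default_rng()
    votes = np.bincount(teacher_predictions, minlength=num_classes)
    noisy_votes = votes + rng.laplace(loc=0.0, scale=1.0 / gamma, size=num_classes)
    return int(np.argmax(noisy_votes))

# Example: 250 teachers voting over 10 classes.
preds = np.random.default_rng(0).integers(0, 10, size=250)
label = noisy_aggregate(preds, num_classes=10)
```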

The video then splits into two main topic areas: admission control at test time, and model governance, which I shall now introduce in turn.

Admission control at test time:

One way for a model to abstain from making a prediction, and so support admission control, is to equip it with a measure of uncertainty, which the group tied back to the training data. They implemented the Deep K-Nearest Neighbours method, which let them open up the “black box” of the model and see what is happening in each layer (focusing on individual inputs rather than the model as a whole). For each layer, they look at how that layer represents the test input and perform a nearest-neighbour search in that layer’s representation space. Once this has been done for every layer, inconsistency among the labels of the nearest neighbours points to a problematic input.
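
The sketch below shows a simplified version of that per-layer nearest-neighbour check. It assumes the per-layer representations of the training data and of a single test input have already been extracted from the model, and the agreement score it returns is a stand-in for the credibility measure the method actually uses.

```python
# Minimal sketch of the Deep K-Nearest Neighbours idea: for each layer's
# representation of a test input, find the k nearest training points and check
# how consistent their labels are across layers.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def dknn_label_agreement(train_reps_per_layer, train_labels, test_reps_per_layer, k=5):
    """train_reps_per_layer / test_reps_per_layer: one representation array per layer."""
    neighbour_labels = []
    for train_reps, test_rep in zip(train_reps_per_layer, test_reps_per_layer):
        index = NearestNeighbors(n_neighbors=k).fit(train_reps)
        _, idx = index.kneighbors(test_rep.reshape(1, -1))
        neighbour_labels.append(train_labels[idx[0]])
    labels = np.concatenate(neighbour_labels)
    _, counts = np.unique(labels, return_counts=True)
    # Low agreement across layers signals an input the model should abstain on.
    return counts.max() / counts.sum()
```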

A question then arises: can objectives be defined at the level of each layer to make models more amenable to this kind of test-time check? The group examined this with the “soft nearest neighbour loss” presented at ICML 2019. They asked whether it is better for the model to learn representation spaces that separate the classes with a large margin (as a support vector machine would), or to entangle the different classes within the layers’ representations.

They found that the latter works better for the Deep K-Nearest Neighbours method’s uncertainty estimates, so they introduced the soft nearest neighbour loss to encourage the model to entangle points from different classes in its layers of representations. The loss encourages the model to share features between different classes in the lower layers (where inputs can be recognised using the same low-level features). This helped them identify uncertainty: previously, a test point that fell in none of the clusters forced a guess at its nearest neighbours, whereas with the classes entangled there is now support for these searches.
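
A minimal sketch of the soft nearest neighbour loss over a batch of hidden representations is given below. The temperature handling is simplified relative to the ICML 2019 formulation; maximising this quantity during training is what pushes representations of different classes to entangle.

```python
# Minimal sketch of the soft nearest neighbour loss over a batch of hidden
# representations. For each point, it compares similarity to same-class points
# against similarity to all other points; maximising it entangles the classes.
import torch

def soft_nearest_neighbour_loss(features, labels, temperature=1.0, eps=1e-8):
    """features: (b, d) hidden representations; labels: (b,) class labels."""
    b = features.shape[0]
    dists = torch.cdist(features, features).pow(2)             # pairwise squared distances
    sims = torch.exp(-dists / temperature)
    sims = sims * (1 - torch.eye(b, device=features.device))   # exclude self-similarity
    same_class = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    numerator = (sims * same_class).sum(dim=1)                  # same-class neighbours
    denominator = sims.sum(dim=1)                               # all other points
    return -torch.log(numerator / (denominator + eps) + eps).mean()
```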

Model governance:

The group explored the problem of removing a data subject’s data from a model when they no longer want it used, even though it has already helped train the model, which requires a form of “machine unlearning”. The question then becomes: is the differential privacy described above enough to make such time-consuming machine unlearning unnecessary? That is, does the fact that a data point affects only one teacher already achieve what machine unlearning would?

The group decided that it probably does not. The model’s parameters (updated, for example, by stochastic gradient descent) would still be influenced by the data subject’s original point, making it hard to remove their data entirely. Sharded, Isolated, Sliced, and Aggregated (SISA) training is therefore explored. The method splits the training data into shards, and each shard into slices, so that a given data point sits in only one slice of one shard; when that point must be removed, only the model for that shard needs retraining rather than the whole ensemble. Retraining is quicker still because checkpoints are saved as training moves through a shard’s slices, providing a launching pad for retraining the affected model.
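
The sketch below illustrates only the bookkeeping side of this idea: training indices are split into shards and slices, and an unlearning request locates the single shard and slice affected, so only that shard’s model would be retrained from the checkpoint preceding the affected slice. The shard and slice counts and the function names are illustrative, not taken from the work.

```python
# Minimal sketch of SISA-style bookkeeping: indices are split into shards, and
# each shard into slices, so any single point sits in exactly one slice of one
# shard. Unlearning a point then only requires retraining that one shard.
import numpy as np

def make_shards(indices, num_shards, num_slices, rng):
    """Shuffle the indices, split them into shards, and each shard into slices."""
    rng.shuffle(indices)
    return [np.array_split(shard, num_slices) for shard in np.array_split(indices, num_shards)]

def unlearn(shards, point_index):
    """Remove one point and report which shard/slice must be retrained."""
    for s, slices in enumerate(shards):
        for t, sl in enumerate(slices):
            if point_index in sl:
                # Drop the point; only shard s needs retraining, restarting from
                # the checkpoint saved before slice t rather than from scratch.
                shards[s][t] = sl[sl != point_index]
                return s, t
    raise ValueError("point not found")

rng = np.random.default_rng(0)
shards = make_shards(np.arange(10_000), num_shards=5, num_slices=4, rng=rng)
shard_id, slice_id = unlearn(shards, point_index=1234)
```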

Deepfakes:

The role of ML in deepfakes is then considered, with progress in ML accelerating progress in digital alteration. The group considered three approaches to combating this:

  1. Detect artifacts within the altered image (such as imperfect body movements).
  2. Reveal content provenance (secure record of all entities and systems that manipulate a particular piece of content).
  3. Advocate a notion of total accountability (record every minute of your life).

The group believed that none of these three methods would cover all of the problematic areas of deepfakes on its own, so they need to be supplemented by policy on areas such as predictive policing and feedback loops.

Conclusion:

The group concluded that research is needed to align ML with human norms. Once this is done, trustworthy ML becomes an opportunity to make ML better, and a cause that offers much food for thought for the future.


Original presentation by Nicolas Papernot et al.: https://youtu.be/UpGgIqLhaqo 

