Deployment corrections: An incident response framework for frontier AI models

January 25, 2024

🔬 Research Summary by Joe O’Brien, an Associate Researcher at the Institute for AI Policy and Strategy, focusing on corporate governance and accountability in the development and deployment of frontier AI models.

[Original paper by Joe O’Brien, Shaun Ee, and Zoe Williams]


Overview: This report describes a toolkit that frontier AI developers can use to respond to risks discovered after the deployment of a model. It also provides a framework for AI developers to prepare and implement this toolkit.


Introduction

Recent history features plenty of cases where AI models have behaved or been used in unintended ways after model deployment. As AI capabilities progress and the scale of adoption of AI systems grows, the impacts of model deployments may become increasingly significant–and this may especially be the case for leading AI developers, such as OpenAI, Google DeepMind, Anthropic, Microsoft, Google, Amazon, and Meta. While AI developers can adopt several safety practices before deployment (such as red-teaming, risk assessment, and fine-tuning) to reduce the likelihood of incidents, these practices are unlikely to pre-empt all potential issues.

To manage this gap, this paper recommends that leading AI developers establish the capacity for “deployment corrections”–a set of tools to rapidly restrict access to a deployed model for all or part of its functionality and/or users. This would facilitate appropriate and fast responses to a) dangerous capabilities or behaviors identified in post-deployment risk assessment and monitoring and b) serious incidents. The paper also describes practices that can lower the barrier to making decisive, appropriate decisions on deployment corrections. 

Key Insights

As a toolkit

Frontier AI developers that make their models available to downstream users via an interface (e.g., an API) rather than via open-sourcing have many tools at their disposal to limit access to the model. At a high level, this toolkit includes:

  • User-based restrictions (such as blocklisting or allowlisting)
  • Access frequency restrictions (such as throttling the number of prompts that can be submitted to a model in a time period)
  • Capability restrictions (such as filtering harmful model outputs)
  • Use case restrictions (such as prohibiting a model’s use in high-stakes applications)
  • Full shutdown (such as decommissioning a model)

These tools can be used in a broad range of scenarios, from cases where risks from the model are fairly limited to scenarios where the harms are potentially severe and can arise even from proper use by a trusted user.

Restricting model access may be difficult in practice, as downstream users may become dependent on the capabilities of newly deployed models. To minimize these harms and to lower the barrier for developers to institute deployment corrections as a precaution, we outline a space of deployment correction options that allows a scalable and targeted approach: AI developers can opt for combinations of restrictions and tailor these choices to respond effectively to specific incidents while minimizing downstream harms.
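To illustrate how these restrictions might be combined in practice, the minimal Python sketch below applies a correction policy to each incoming request of an API-served model. The class, function, and parameter names are illustrative assumptions, not the paper’s implementation or any provider’s actual interface.

```python
# Hypothetical sketch of a deployment-corrections toolkit for an API-served model.
# All names here are illustrative assumptions, not the paper's implementation.
from dataclasses import dataclass, field
import time


@dataclass
class CorrectionPolicy:
    """Current set of deployment corrections applied to one deployed model."""
    blocked_users: set = field(default_factory=set)      # user-based restriction
    max_requests_per_minute: int = 60                     # access-frequency restriction
    banned_use_cases: set = field(default_factory=set)    # use-case restriction
    output_filter_enabled: bool = True                    # capability restriction
    fully_shut_down: bool = False                         # full shutdown


def handle_request(policy, user_id, declared_use_case, prompt,
                   request_log, model_call, output_filter):
    """Apply corrections in rough order of severity before and after the model call."""
    if policy.fully_shut_down:
        return "Service unavailable: the model has been withdrawn."
    if user_id in policy.blocked_users:
        return "Access denied for this account."
    if declared_use_case in policy.banned_use_cases:
        return f"Use case '{declared_use_case}' is not permitted for this model."

    # Access-frequency restriction: count this user's requests in the last minute.
    now = time.time()
    recent = [t for t in request_log.get(user_id, []) if now - t < 60]
    if len(recent) >= policy.max_requests_per_minute:
        return "Rate limit exceeded; please retry later."
    request_log[user_id] = recent + [now]

    response = model_call(prompt)
    if policy.output_filter_enabled:
        response = output_filter(response)  # e.g., redact or refuse harmful outputs
    return response
```

Because each restriction is an independent setting, a developer could, for example, tighten only the rate limit and output filter in response to a narrow misuse pattern, while reserving full shutdown for the most severe incidents.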

Building organizational capacity

Tools alone are insufficient for action–AI developers will also need procedures, roles, and responsibilities for managing decisions around deployment corrections in order to respond most effectively to incidents with their deployed models. The paper recommends that AI developers focus on four stages of implementation: preparation, monitoring, execution, and post-incident follow-up.

Preparation refers to the act of building and adopting the tools and procedures that will allow an AI developer to act swiftly and effectively in response to an incident. It includes identifying and understanding possible threats, establishing triggers for deployment corrections, developing tools and procedures for incident response, and establishing decision-making authorities. Externally, it includes sharing insights on best practices with regulators and industry partners and defining fallback options for downstream users in the case of service interruption.

Monitoring refers to the process of continuously gathering data on a model’s capabilities, behavior, and use (via a diverse range of sources), analyzing this data for anomalies, and escalating cases of concern to relevant decision-makers. AI developers should also feed relevant data back into the threat modeling process.

Execution refers to the decision to apply a deployment correction to a model and the procedures that follow this decision. This stage also includes alerting and coordinating with relevant regulatory authorities, implementing fallback systems for downstream users, and notifying customers of the situation.

Post-incident follow-up refers to the set of actions relevant to recovery, restoration, learning, and ongoing risk management in the wake of an incident. This stage involves the process of repairing a model and restoring service, after-action reviews, and feeding lessons back into the previous stages. In some cases, this stage may require significant involvement from external parties (such as when the incident is particularly severe and likely to occur in models developed by other companies). 
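As a rough illustration of how these stages might fit together, the sketch below connects pre-agreed triggers (defined during preparation) to a monitoring pass that escalates to a designated decision-maker before a correction is executed and recorded for post-incident review. The signal names, thresholds, and corrections are hypothetical assumptions, not values proposed in the paper.

```python
# Hypothetical sketch linking preparation, monitoring, execution, and follow-up.
# Signal names, thresholds, and corrections are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Trigger:
    signal: str        # monitored quantity, e.g. "flagged_outputs_per_10k_requests"
    threshold: float   # escalation threshold agreed on during preparation
    correction: str    # pre-approved correction, pending human sign-off


# Preparation: triggers and decision-making authority are defined in advance.
TRIGGERS = [
    Trigger("flagged_outputs_per_10k_requests", 50.0, "enable_strict_output_filter"),
    Trigger("suspected_misuse_accounts", 10.0, "blocklist_and_throttle"),
    Trigger("evidence_of_dangerous_capability", 1.0, "full_shutdown"),
]


def monitoring_cycle(signals, escalate, apply_correction, record_for_review):
    """One pass of monitoring -> escalation -> execution -> post-incident follow-up."""
    for trigger in TRIGGERS:
        value = signals.get(trigger.signal, 0.0)
        if value >= trigger.threshold:
            # Escalate to the designated decision-maker (a human in the loop).
            if escalate(trigger, value):
                apply_correction(trigger.correction)   # execution
                record_for_review(trigger, value)      # feeds the after-action review
```

In practice, escalation and execution would involve human judgment and coordination with regulators, industry partners, and downstream users rather than a single automated call.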

Between the lines

While some recently published standards and guidance have called out the need for AI developers to monitor deployed models for risks–and be prepared to withdraw them when necessary–there is more work to be done. Policymakers and AI companies will need to coordinate on several capacity-building measures, including (but not limited to):

  • Defining and sharing threat models and developing tools to parse data for signs of misuse or undesired model behavior.
  • Developing a standardized framework for frontier AI incident response and sharing best practices (a schema sketch follows this list).
  • Establishing secure reporting lines for quickly communicating across industry and government in the case of an incident or discovered vulnerability.
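As one possible starting point for the standardized framework mentioned above, the sketch below outlines a hypothetical incident report that could be shared across industry and government through such reporting lines; the field names are assumptions rather than any established schema.

```python
# Hypothetical standardized incident report; field names are assumptions,
# not an established or proposed schema.
from dataclasses import dataclass, field
from typing import List


@dataclass
class FrontierAIIncidentReport:
    developer: str                       # organization reporting the incident
    model_identifier: str                # affected model or deployment
    severity: str                        # e.g. "low", "moderate", "severe"
    description: str                     # what happened and how it was detected
    corrections_applied: List[str] = field(default_factory=list)   # e.g. throttling, shutdown
    downstream_impact: str = ""          # effect on users and fallback measures taken
    likely_to_affect_other_developers: bool = False                # cross-industry relevance
    follow_up_actions: List[str] = field(default_factory=list)     # planned remediation steps
```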

Policymakers could also consider requiring frontier AI developers to take certain critical steps, such as maintaining control over model access or maintaining incident response plans and making such plans available to relevant agencies.

Finally, it is worth noting that the deployment corrections framework is not a silver bullet for managing AI risks. It is one small part of a larger conversation about building stronger governance mechanisms around frontier AI model development and deployment. That conversation has recently seen major advances in the form of a US Executive Order and a flurry of published safety policies from AI firms. While we look forward to seeing work that expands on our framework, we also look forward to work that fills important gaps in the broader project of governing frontier AI development and deployment.

