Montreal AI Ethics Institute

Democratizing AI ethics literacy

AI Policy Corner: Layered Governance in AI Labs: Defining Boundaries Across the Policy Stack

March 30, 2026

A person sits in an armchair and writes in a notebook, with speech bubbles showing indefinite strokes and then a lightbulb. Nearby, a table with a laptop showing an LLM chatbot interface and a cup.

✍️ By Tejasvi Nallagundla.

Tejasvi is an Undergraduate Student in Computer Science, Artificial Intelligence and Global Studies and an Undergraduate Affiliate at the Governance and Responsible AI Lab (GRAIL), Purdue University.


📌 Editor’s Note: This article is part of our AI Policy Corner series, a collaboration between the Montreal AI Ethics Institute (MAIEI) and the Governance and Responsible AI Lab (GRAIL) at Purdue University. The series provides concise insights into critical AI policy developments from the local to international levels, helping our readers stay informed about the evolving landscape of AI governance. This piece uses Anthropic’s Claude Constitution, Responsible Scaling Policy, and Claude Sonnet 4.6 System Card as a representative example to look at how layered corporate AI policy documents come together to define and shape the boundaries of model behavior across a governance stack.


AI Governance as a Stack

We often think of AI governance as a set of rules or clear guidelines that spell out what a given implementation or tool can or cannot do. In practice, however, the companies building these tools have to “govern” AI not only by complying with rules set by governing bodies, but also in a corporate sense. This corporate governance does not come as a single “AI policy document,” but as a range of materials, from high-level principles to in-depth technical assessments. We can think about these materials as a stack: a layered set of documents, each serving a different purpose within the company’s broader governance goals.

One way to unpack this stack and see it in practice is to read several of a company’s documents with a specific question in mind. Here, I do so with the AI lab Anthropic and the question of how boundaries for model behavior are defined. Looking across three layers of Anthropic’s governance stack, we see that the boundary is not defined in one place but constructed across all of them, with each layer serving a different purpose in shaping and presenting that boundary while sharing the same overarching goal.

The Value Layer

At the top of the stack is Claude’s Constitution, which Anthropic defines as “a detailed description of Anthropic’s intentions for Claude’s values and behavior.” From the perspective of our policy stack, we can think of this as the value layer. In this document, the company does not treat the boundaries of model behavior as fixed. It explains its approach of favoring the cultivation of “good values and judgment over strict rules and decision procedures,” with Claude playing a role in determining its behavior through holistic prioritization, balancing considerations such as helpfulness with guidelines and safety. Thus, at the value layer, the boundary is defined normatively: it describes what the model ought to do rather than being fixed or operationalized.

The Risk Layer

The next layer down is Anthropic’s Responsible Scaling Policy (RSP), which Anthropic defines as a framework establishing how they “identify and evaluate risks” as well as make “decisions about AI development and deployment.” We can think of this as the risk layer. In the RSP, the boundaries of model behavior are presented in a more structured way, tied to factors such as capability thresholds, risk analyses, and internal governance decisions, while still highlighting that there is “flexibility in how risk thresholds are evaluated.” Thus, at the risk layer, the boundary is shaped and presented with a focus on evaluation rather than as a completely fixed line, just as it was not fixed in the Constitution.

The Evaluation Layer

Now we turn to Anthropic’s System Cards, which we can think of as the evaluation layer of the governance stack. In the recent System Card for Claude Sonnet 4.6, they describe the document as outlining the model’s “characteristics, capabilities, and safety profile that [they] carried out before its public deployment.” At this layer, boundaries are presented through testing against thresholds and the associated evaluation results. Notably, the card observes that “confidently ruling out these thresholds is becoming increasingly difficult” as model capabilities evolve, which highlights that at this layer too the boundary is not fixed but shaped through testing and a degree of uncertainty.

Beyond the evaluation layer, a range of other materials add to Anthropic’s governance stack, from the Transparency Hub to government consultations, usage policies, and research publications. Taken together, the focus, content, and framing of these materials on different topics, including the boundaries of model behavior, shape how corporate governance of these issues is constructed and distributed across the system.

Further Reading

  • In Claude We Trust? Evaluating the New Constitution
  • Claude’s New Constitution: AI Alignment, Ethics, and the Future of Model Governance
  • Exclusive: Anthropic Drops Flagship Safety Pledge
  • Responsible Scaling: Comparing Government Guidance and Company Policy

Image credit: Distant Writing by Fabrizio Matarese / Better Images of AI / CC BY 4.0

Founded in 2018, the Montreal AI Ethics Institute (MAIEI) is an international non-profit organization equipping citizens concerned about artificial intelligence and its impact on society to take action.

  • © 2025 MONTREAL AI ETHICS INSTITUTE.
  • This work is licensed under a Creative Commons Attribution 4.0 International License.