🔬 Research Summary by Harry Law and Sebastien A. Krier.
Harry Law is an ethics and policy researcher at Google DeepMind, a PhD candidate at the University of Cambridge, and postgraduate fellow at the Leverhulme Centre for the Future of Intelligence.
Sebastien A. Krier is a policy manager and researcher at Google DeepMind. He previously worked at Stanford University’s Cyber Policy Centre, the UK Government’s Office for Artificial Intelligence, and trained as a lawyer at Freshfields Bruckhaus Deringer LLP.
[Original paper by Harry Law and Sebastien A. Krier]
Overview: We argue for a new settlement for open-source foundation models in the context of the EU AI Act. The paper suggests that open-source models should not be subject to the same provisions as commercial models, but should be subject to evaluations to limit risks. To minimize the burden of regulation and prevent the concentration of capabilities in the hands of a few powerful actors, this approach would apply only to the release of open-source foundation models that are demonstrably more complex, capable, and dangerous than those we have seen to date.
Introduction
The benefits of AI are seldom discussed without its risks. The potential for large models and their successors to do enormous good—from boosting economic growth to helping us live longer, richer lives—is often contrasted with the risks that the development and deployment of such models present. The debate becomes especially interesting when it turns to access, which determines how and where the benefits of AI will be realized, and to whom they will accrue.
Widening access allows more people to use AI, build with it, and improve their lives and the lives of others. However, it is the usefulness of large models that can also make them dangerous. If you increase the number of people who can use highly capable AI systems, you are also increasing the number of possible vectors for harm. One position within this debate, which focuses on maximalist interpretations of access, is often described as ‘open-source’ (though that label isn’t always appropriate due to issues related to transparency and licensing). Regardless, the term is often used as a shorthand for a style of release that sees full models (sometimes including their weights) made available to anyone who wants them.
This debate has been simmering for quite some time, with some favoring approaches that minimize restrictions to drive growth, prevent the formalization of highly asymmetrical markets, and deliver maximum benefit to as many people as possible. Others, meanwhile, contend that this position will enable AI to cause harm by putting powerful systems in the hands of bad actors. We believe, however, that framing this as ‘access vs. no access’ oversimplifies the true nature of the problem. Many in the latter camp argue for a more refined distinction between democratizing use and proliferating core system weights, which suggests that broader access does not necessarily imply unrestricted sharing of the underlying technology.
This was the lens through which we viewed recent provisions of the European Union’s AI Act, which brought this question into sharp relief with efforts to regulate the deployment of open-source foundation models.
Key Insights
The European Union proposed the Artificial Intelligence Act in 2021 to regulate AI systems based on the level of risk they pose: unacceptable risk, high risk, limited risk, and minimal or no risk. Recent amendments discuss provisions focused on ‘foundation models,’ which are defined as an “AI system model that is trained on broad data at scale, is designed for generality of output, and can be adapted to a wide range of distinctive tasks.”
Moves to label foundation models as high-risk have proven controversial because they see the Act target a specific model type, contravening its risk-based approach. Although the original scope of the AI Act was intended only to cover AI systems deployed in high-risk settings, such as tools designed for cancer detection, subsequent amendments regarding general-purpose or foundation models have also targeted their development. Open-source systems were included at some points in the Act’s development and excluded at others, underscoring the question’s complexity and the challenge it poses for policy development. As of June 2023, certain open-source systems are in scope:
“A provider of a foundation model shall, prior to making it available on the market or putting it into service, ensure that it is compliant with the requirements set out in this Article, regardless of whether it is provided as a standalone model or embedded in an AI system or a product, or provided under free and open-source licences, as a service, as well as other distribution channels.”
This provision will apply to models made available in the course of commercial activity or supplied for first use to a deployer or put into service for their own use. As we explain in the paper, recital 12b states that a commercial activity can be characterized as “charging a price, with the exception of transactions between micro enterprises, for a free and open-source AI component but also by charging a price for technical support services, by providing a software platform through which the provider monetises other services, or by the use of personal data for reasons other than exclusively for improving the security, compatibility or interoperability of the software.” As a result, the AI Act is likely to place significant liabilities on some providers of open-source foundation models.
Our proposal
We suggest a new way forward for open-source systems under the EU’s AI Act. On the one hand, the Act’s provisions for open-source foundation models may mean hefty liabilities for providers, which has the potential to stifle growth, restrict competition, and consolidate power. This is an outcome we would like to avoid. On the other hand, open-source foundation models can potentially cause harm. While we aren’t quite there yet regarding extreme dangers, we suspect that calculus will change in the coming years (for example, through models posing biosecurity or cybersecurity risks).
Because the time it will take to move from existing capabilities to future capabilities is unknown, we propose that regulators should build the infrastructure to conduct safety evaluations for open-source foundation models sooner rather than later. We argue that while the release of today’s open-source models should be supported and unrestricted, regulators ought to design provisions concerned with identifying clearly defined dangerous capabilities and preventing the proliferation of dangerous models released on an open-source basis in the future.
In practice, the release of future open-source foundation models should be accompanied by testing and safety evaluations for dangerous capabilities, so that policymakers and users of such models understand the risks associated with their release. One way to do that is through new benchmarks, metrology, and evaluations to help manage powerful foundation models; if models show alarming capabilities, their release should be reconsidered. Another is through new transparency and access requirements, such as structured access mechanisms and rich documentation, that would apply to all foundation models. (With respect to transparency requirements, however, we understand that while they may help increase accountability, they likely cannot prevent the proliferation of dangerous models. In some instances, they may plausibly be counterproductive, too.)
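To make the shape of such an evaluation gate more concrete, below is a minimal illustrative sketch in Python. The capability categories, scores, and thresholds are hypothetical placeholders rather than benchmarks proposed in the paper or defined in the Act; the sketch only shows how pre-release evaluation results might be compared against clearly defined thresholds to inform a release decision.

```python
# Hypothetical sketch of a pre-release evaluation gate for an open-source
# foundation model. The capability categories, scores, and thresholds below
# are invented placeholders for illustration only; they are not benchmarks
# defined in the paper or in the EU AI Act.

from dataclasses import dataclass


@dataclass
class EvalResult:
    capability: str   # e.g. "biosecurity_uplift", "cyber_offense"
    score: float      # normalized 0.0-1.0 score from a capability evaluation
    threshold: float  # level above which the capability is deemed dangerous


def release_recommendation(results: list[EvalResult]) -> str:
    """Return a coarse release recommendation from evaluation results."""
    flagged = [r for r in results if r.score >= r.threshold]
    if not flagged:
        # No clearly defined dangerous capability was triggered.
        return "open-source release supported"
    # One or more thresholds crossed: reconsider open release and fall back
    # to structured access while the findings are reviewed.
    names = ", ".join(r.capability for r in flagged)
    return f"reconsider open release; flagged capabilities: {names}"


if __name__ == "__main__":
    # Illustrative numbers only.
    results = [
        EvalResult("biosecurity_uplift", score=0.12, threshold=0.50),
        EvalResult("cyber_offense", score=0.61, threshold=0.50),
    ]
    print(release_recommendation(results))
```

The hard part, of course, is defining the capability categories, evaluations, and thresholds themselves, which is precisely the infrastructure we argue regulators should begin building now.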
Striking the right balance is crucial: overly broad requirements, as advocated for in recent drafts, risk entrenching power and dampening growth, whereas the overly lax requirements advocated by skeptics might cause serious harm. For this reason, we favor the structured access paradigm. Such programs enable researchers to verify developer claims, ensuring that only reliable foundation models are widely used.
As we write in the paper, “Ultimately, we contend that for open-source models, the Act’s broad scope risks encompassing too much. Instead, we suggest the development of better tools, evaluations, definitions, and thresholds to limit risks and negative externalities, and propose only restricting the release of open source foundation models that are demonstrably more complex, capable, and dangerous.”
Between the lines
Ensuring the benefits of AI are widely spread while mitigating the proliferation of dangerous models is a defining challenge for our industry. Striking the right balance will always prove difficult, and parties will always disagree about whether a settlement is too restrictive or too generous.
Open-source models are an important part of this puzzle. Not only do they act as a mechanism for diffusing the benefits of AI, but they also contribute to the growth of the AI ecosystem writ large. While we support measures that favor minimal restrictions on open-source models today, such a position may unfortunately prove dangerous as models become more sophisticated.