Structured access to AI capabilities: an emerging paradigm for safe AI deployment

🔬 Research Summary by Max Krueger, a consultant at Accenture with an interest in both the long and short-term implications of AI on society.

[Original paper by Toby Shevlane]

Overview: As AI models grow more powerful, how to provide safe access to them becomes an important question. Structured capability access (SCA) is one paradigm that addresses it. SCA aims to curb model misuse by limiting end-users’ access to various parts of a given model.

Introduction

Access control is a crucial component of an effective, global AI governance strategy. Structured capability access is a framework in which developers control how AI systems are used, reproduced, and modified. SCA treats AI systems as both information and a tool. Viewing access through this lens broadens the control mechanisms available to developers.

Key Insights

Fundamentally, structured capability access (SCA) is a safety mechanism in which the model developer provides, and takes responsibility for, access to their models. It can be thought of as model-as-a-service: you pay a company to use its model without seeing how it is built or maintained, while the developer controls how you use it. Crucially, for SCA to be effective, companies must have a reliable means of tracking and understanding how their models are being used. SCA aims to answer the question: how can AI systems be deployed safely so that users are prevented from causing harm, intentionally or not? The author, Toby Shevlane, states:

The developer offers a controlled interaction with the AI system’s capabilities, using technical and sometimes bureaucratic methods to limit how the software can be used, modified, and reproduced.

There are two broad types of control a developer can place on a model: 1) use controls and 2) modification and reproduction controls.

Use Controls

Use controls take two forms: software-level controls and access-level controls. As the name suggests, use controls limit how an end-user can leverage the AI model. For example, the careful design of an AI system could reduce bias and misuse with no need to vet the end-use case. In conjunction with software-level controls, the developer can use an application programming interface (API) or user interface to limit who accesses a model and how frequently. This allows the controlling party to grant or revoke access to the model.
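To make the access-level idea concrete, here is a minimal sketch of a gateway that grants and revokes access and caps how often each key may call the model. It is illustrative only and not taken from the paper: the AccessGateway class, its method names, the toy_model stand-in, and the per-minute limit are all assumptions.

```python
import time


class AccessGateway:
    """Hypothetical gateway sitting between end-users and a model."""

    def __init__(self, model, max_calls_per_minute):
        self.model = model
        self.max_calls = max_calls_per_minute
        self.calls = {}  # api_key -> timestamps of that key's recent calls

    def grant(self, api_key):
        # Granting access registers the key with an empty call history.
        self.calls.setdefault(api_key, [])

    def revoke(self, api_key):
        # Revoking access removes the key; later queries are refused.
        self.calls.pop(api_key, None)

    def query(self, api_key, prompt, now=None):
        if api_key not in self.calls:
            raise PermissionError("access not granted, or revoked")
        now = time.time() if now is None else now
        # Keep only calls made within the last 60 seconds.
        recent = [t for t in self.calls[api_key] if now - t < 60]
        if len(recent) >= self.max_calls:
            self.calls[api_key] = recent
            raise RuntimeError("rate limit exceeded")
        recent.append(now)
        self.calls[api_key] = recent
        return self.model(prompt)


def toy_model(prompt):
    # Stand-in for a real model (an assumption for this sketch).
    return prompt.upper()
```

Because the model is only reachable through query, the developer decides who may call it and how often, and can withdraw access at any time.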

Modification and Reproduction Controls

Modification and reproduction controls aim to limit how much of the model the end-user can change or reproduce. For example, developers may opt to keep the source code proprietary (instead of open-source). This makes the model harder to modify but may have other implications within the AI ethics domain. Additionally, a developer may implement sophisticated cybersecurity defenses to limit modification and reproduction. However, as demonstrated in my previous research summary, black-box models are still highly susceptible to adversarial attacks, especially reproduction attacks.

Selective Disclosure

Selective disclosure in SCA builds on the concept of structured transparency. According to Shevlane, structured transparency “involves finding mechanisms, both technical and social, for granting access to certain pieces of information while keeping others private.” SCA takes this concept further by “governing what somebody can and cannot do with an AI system.” As the author points out, selective disclosure works well for the governance of data such as personally identifiable information, but it does not work well for dual-use technologies such as model code, which is both information and a tool.

Microsoft’s DialoGPT illustrates this issue. Researchers were concerned about inappropriate use of the model and therefore withheld an essential piece of code from the open-sourced codebase. This does not solve the issue: either the model is rendered useless, or end-users find a substitute for the missing code and gain full access to the model. In sum, “The lesson is that the developer cannot selectively filter the informational content of the software in a way that neatly discriminates between uses.” The researchers aimed to limit the use of the tool while still providing access to the information. The strategy was unsuccessful because the model and its code are dual-use: they are both information and a tool. Selective disclosure therefore does not seem to be a viable control mechanism.

Implementation of SCA

The author distinguishes SCA mechanisms by the deployment method of the model: local or cloud-based. Controls on local deployment are difficult to enforce. For example, a developer could use a licensing system to control who uses the product, though this is easily circumvented and hard to implement at scale. Restrictions on modification and reproduction are likewise very difficult to police. Developers could build in piracy controls, use deep-learning-specific encryption, or embed the software in particular hardware. While each of these might make reproduction and modification harder, a well-motivated adversary may be able to break these controls.

Cloud-based deployment is far more secure. Software-level use controls excel with cloud-based models. Developers can give end-users various levels of access and easily restrict that access on a case-by-case basis. Cloud-based environments also make it easier to monitor how the model is used. Modification and reproduction controls are more easily implemented in a cloud environment as well. On one end of the spectrum, developers could wholly restrict access to the model code and parameters while implementing access quotas to curb model stealing. On the other end, developers could give complete access to end-users. Cloud-based deployment thus provides greater granularity and flexibility in controlling access to a given model.
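As a rough sketch of what tiered cloud access with quotas might look like, the example below gives each tier a daily query quota and decides per tier whether confidence scores are returned alongside labels. Everything here is hypothetical: the tier names, the quota numbers, the CloudModel class, and the toy classifier are assumptions, and withholding scores from the public tier is just one illustrative way of exposing less of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    daily_quota: int      # maximum queries per user per day
    returns_scores: bool  # whether confidence scores are exposed


# Hypothetical access tiers; names and numbers are illustrative only.
TIERS = {
    "public": Tier(daily_quota=100, returns_scores=False),
    "vetted_researcher": Tier(daily_quota=10_000, returns_scores=True),
}


class CloudModel:
    """Toy hosted model; code and parameters never leave the server."""

    def __init__(self):
        self.used = {}  # user -> number of queries made today

    def _predict(self, text):
        # Toy classifier standing in for a real model.
        label = "long" if len(text) > 10 else "short"
        score = min(len(text) / 20, 1.0)
        return label, score

    def query(self, user, tier_name, text):
        tier = TIERS[tier_name]
        used = self.used.get(user, 0)
        if used >= tier.daily_quota:
            raise RuntimeError("daily quota exhausted")
        self.used[user] = used + 1
        label, score = self._predict(text)
        if tier.returns_scores:
            return {"label": label, "score": score}
        return {"label": label}
```

Moving a user between tiers, or changing a tier's quota, is a server-side change, which is the kind of case-by-case flexibility that is hard to achieve once software has been deployed locally.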

Other Considerations

The author points to an apparent weakness of SCA: it inherently pushes power into the hands of AI developers. It is therefore imperative to consider SCA as part of a larger governance strategy. Organizations could use SCA to comply with future government regulations. When SCA is combined with effective policy, centralization of power is of less concern.

Between the lines

Access control remains an essential question in AI safety. SCA is one potential method for ensuring the appropriate use of AI technologies. Ultimately, access control will likely take many forms, with SCA being one part of the overarching solution. Cloud-based SCA seems like a promising control method given its flexibility to support several access regimes. A crucial part of this control mechanism is the ability of developers to collect and analyze usage data. This may be a significant challenge in a scenario where there are potentially hundreds of thousands of end-users. Developers must understand how end-users are using their platform and be able to detect inappropriate behaviors (think fraud detection for AI models). If developers can accurately and quickly identify inappropriate behavior, SCA has significant potential, especially when paired with other effective governance interventions (e.g., policy). At present, access control appears to come at the cost of transparency; it is vital to implement a mechanism that treats these as complementary.
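As a very simple illustration of the monitoring idea, the sketch below scans a query log and flags users whose query volume is far above the median. It is only a toy heuristic under stated assumptions: the (user, prompt) log format, the flag_heavy_users name, and the tenfold threshold are all hypothetical, and real misuse detection would need to look at far more than volume.

```python
from collections import Counter
from statistics import median


def flag_heavy_users(query_log, multiple=10):
    """Return users whose query count exceeds `multiple` times the median.

    `query_log` is assumed to be an iterable of (user, prompt) pairs.
    """
    counts = Counter(user for user, _prompt in query_log)
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(u for u, c in counts.items() if c > multiple * typical)


# Toy usage: two ordinary users and one unusually heavy user.
log = [("a", "x")] * 3 + [("b", "x")] * 4 + [("c", "x")] * 500
flagged = flag_heavy_users(log)
```

A flag like this would only be a starting point for review, not a decision to revoke access.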