🔬 Research summary by Dr. Andrea Pedeferri, instructional designer and leader in higher ed (Faculty at Union College), and founder at Logica, helping learners become more efficient thinkers.
[Original paper by Ibo van de Poel]
Overview: Though there are numerous high-level normative frameworks, it is still quite unclear how or whether values can be implemented in AI systems. Van de Poel and Kroes (2014) have provided an account of how to embed values in technology. The current article proposes to expand that view to complex AI systems and explain how values can be embedded in technological systems that are “autonomous, interactive, and adaptive”.
Though there are numerous high-level normative frameworks, it is still quite unclear how or whether those frameworks can be implemented in AI systems. Van de Poel and Kroes (2014) have provided an account of how to embed values in technology in general. The current article proposes to expand that view to AI systems which, according to the author, have five building blocks: “technical artifacts, institutions, human agents, artificial agents, and technical norms”. This paper is a very useful guide to understanding how values can be embedded in a complex system comprised of multiple parts that interact in different ways.
- Embedding Values
Organizations such as the EU High-Level Expert Group on AI and the IEEE have provided lists of high-level ethical values and principles to implement in AI systems. Whatever your views on values might be, the paper points out that we need an account of what it means for those values to be embedded. To start, a set of values is said to be ‘embedded’ only if it is integrated into the system by design. That is, those who design the system should intentionally build that system with a specific set of values in mind. More is needed, though, because even if a system is designed to comply with certain values, that does not mean it will really realize those values.
So the paper proposes the following definition of “embodied values”: “The embodied value is the value that is both intended (by the designers) and realized if the artifact or system is properly used.”
Drawing both from the current paper and Van de Poel and Kroes’s (2014), we have the following set of useful definitions:
Designed value: any value that is intentionally part of the design of a technological system
Realized value: any value that the (appropriate) use of the system is prone to bring about
Embedded value: any value that is both designed and realized. Thus, a value-embedded system is a system that, because of the way it was designed, will bring about certain values (when it is properly used).
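The relationship among the three definitions above is essentially set-theoretic, which a minimal sketch can make concrete. The variable names and example values below are my own illustration, not from the paper:

```python
# Illustrative model of the three definitions: a value is "embedded" only
# when it is both designed into the system and realized through proper use.
designed_values = {"fairness", "privacy", "transparency"}  # intended by designers
realized_values = {"fairness", "efficiency"}               # brought about by proper use

embedded_values = designed_values & realized_values    # both designed and realized
unrealized_values = designed_values - realized_values  # intended but not brought about
unintended_values = realized_values - designed_values  # brought about without design intent

print(embedded_values)    # {'fairness'}
print(unintended_values)  # {'efficiency'}
```

The two "leftover" sets are exactly the mismatches that, on the paper's account, call for a change in design or use.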
As the paper explains, this opens the door to the idea of a feedback loop: when an intended value is not realized, there has to be some change in the way the system is used and/or designed. Similarly, if a system is used in a way that is contrary to intended values, a re-design might be in order. As the author points out, the practice of re-designing systems to avoid unintended consequences “is particularly important in the case of AI systems, which due to the adaptive abilities of AI, may acquire system properties that were never intended or foreseen by the original designers.”
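The feedback loop the paper describes can be sketched as a simple monitoring routine that compares intended with realized values and flags the corrective action each mismatch calls for. The function name and action labels are hypothetical, chosen only to mirror the two cases in the paragraph above:

```python
# Hedged sketch of the paper's feedback loop: each monitoring cycle compares
# the values the designers intended with the values the system in fact
# realizes, and flags divergences for redesign or review.
def monitoring_cycle(intended, realized):
    actions = []
    for value in intended - realized:
        # an intended value is not being realized: change design and/or use
        actions.append(("redesign_or_change_use", value))
    for value in realized - intended:
        # the system realizes a value never intended; with adaptive AI,
        # such emergent properties may appear well after deployment
        actions.append(("review_emergent_value", value))
    return actions

print(monitoring_cycle({"fairness", "privacy"}, {"fairness", "profiling"}))
# [('redesign_or_change_use', 'privacy'), ('review_emergent_value', 'profiling')]
```

Running such a cycle continuously, rather than once at design time, is what the author's point about adaptive AI systems amounts to in practice.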
- Embedding Values in AI systems
This account provides a way to understand how values can be embedded in AI by looking both at the component and the system level. More specifically, the paper understands AI systems as socio-technical systems composed not only of “technical artifacts, human agents, and institutions” but also “artificial agents and certain technical norms that regulate interactions between artificial agents and other elements of the system.” To clarify, socio-technical systems are systems that depend “on not only technical hardware but also human behavior and social institutions for their proper functioning (cf. Kroes et al. 2006).”
To start, the paper clarifies that an AI system will be the result of both social institutions and human agents interacting to design technological artifacts in accordance with certain values. Importantly, the paper points out that those social institutions will also be embedded with values. As such, the role of humans is key: they need to monitor and evaluate the outcomes and use of both the technological artifacts and the social institutions that influence the production and design of those technological artifacts. In addition, because of how AI systems work, there will also be technical norms that regulate how artificial agents interact with humans and social institutions. As such, these norms will embed and promote certain values.
Therefore, in conclusion, an AI system promotes a set of values V if and only if all five of its main components (i.e. technical artifacts, institutions, human agents, artificial agents, and technical norms) either embody or intentionally promote V. As the author rightly points out, “AI systems offer unique value-embedding opportunities and constraints because they contain additional building blocks compared to traditional sociotechnical systems. While these allow new possibilities for value embedding, they also impose constraints and risks, e.g., the risk that an AI system disembodies certain values due to how it evolves. This means that for AI systems, it is crucial to monitor their realized values and to undertake continuous redesign activities.”
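The "if and only if all five components" condition above is a universally quantified check, which a toy sketch can make explicit. The component data below is invented purely for illustration; only the five building-block names come from the paper:

```python
# Toy check of the paper's condition: an AI system promotes a set of
# values V iff every one of its five building blocks embodies or
# intentionally promotes V.
COMPONENTS = ["technical_artifacts", "institutions", "human_agents",
              "artificial_agents", "technical_norms"]

def system_promotes(values_by_component, V):
    # V must be a subset of each component's embodied/promoted values
    return all(V <= values_by_component.get(c, set()) for c in COMPONENTS)

example = {c: {"fairness", "privacy"} for c in COMPONENTS}
print(system_promotes(example, {"fairness"}))             # True
example["technical_norms"] = {"fairness"}                 # one component drops privacy
print(system_promotes(example, {"fairness", "privacy"}))  # False
```

The second call illustrates the disembodiment risk in the quote: if even one component drifts away from a value as the system evolves, the system as a whole no longer promotes it.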
- Between the lines
The paper offers a valuable framework for analyzing how values are embedded across the interacting parts of a complex system. The next step is to figure out how this analysis connects to the debate on trust and trustworthy AI: given the current way we understand value-embedded AI, is it possible to build an AI we can actually trust?