🔬 Research summary by Dr. Andrea Pedeferri, instructional designer and leader in higher ed (Faculty at Union College), and founder at Logica, helping learners become more efficient thinkers.
[Original paper by Mark Ryan]
Overview: The European Commission’s High-level Expert Group on AI (HLEG) has developed guidelines for trustworthy AI, assuming that AI is something that has the capacity to be trusted. But should we make that assumption? Apparently not, according to this paper, in which the author argues that AI is not the kind of thing that has the capacity to be trustworthy or untrustworthy: the category of ‘trust’ simply does not apply to AI, so we should stop talking about ‘trustworthy AI’ altogether.
Trust is an essential feature of our social life. Trust is an attitude that we adopt when we engage in interpersonal relations, when we believe and are confident that an agent will do what we ask them to do. Trust is not just a blind bet that the trustee will do something: they have to possess some degree of trustworthiness. Still, it remains a somewhat risky business, since trustees can break our trust and thereby “betray” us. This is why we usually take care in choosing trustworthy agents, to minimize the risk of being betrayed. This matters at the personal level as well as at various social levels: it is critical for social, economic, and political institutions to be trustworthy. Accordingly, trust also has implications for the regulations we can impose or demand in order to “control” the trustworthiness of those multi-agent structures.
As you (presumably) trust your well-paid financial advisor to provide you with the best financial strategies (that is, the best for your interests), should you also trust an AI that does exactly the same job? In a recent deliberation, the European Commission’s High-level Expert Group on AI answered: yes. In his recent article “In AI We Trust: Ethics, Artificial Intelligence and Reliability”, Mark Ryan gives us the opposite answer: no.
Why AI can never be trusted
Ryan comes to the conclusion that “AI cannot be something that has the capacity to be trusted according to the most prevalent definitions of trust because it does not possess emotive states or can be held responsible for their actions”. Our common tendency to anthropomorphize AI by attributing human-like features to it, such as mental states, is therefore a mischaracterization of AI, and it does not license treating AI as a trustworthy agent.
In order to understand how Ryan comes to this rather drastic conclusion, we need to look at the assumptions he starts from. First, he focuses on what is commonly referred to as “Narrow AI”, that is, “a system that is designed and used for specific or limited tasks”, as opposed to “General AI”, a “system with generalized human cognitive abilities”. Second, he relies on a synthesis of the “most prevalent definitions” of trust. AI simply cannot, says Ryan, fit these definitions; as a result, it cannot be taken as a trustworthy agent at all. Let’s briefly look at the definition used in the paper to better understand Ryan’s conclusion and its implications.
What is trust?
Ryan proposes a definition of trust that encompasses the three main accounts usually associated with it: the rational account, the affective account, and the normative account.
Earlier I described trust as a sort of bet on an agent’s expected future behavior. If we read trust in these terms, we can think of it as the result of a (rational) choice made by the trustor after weighing pros and cons. According to this rational account, trust is the result of a calculation by the trustor, and the prediction of whether the trustee will uphold the trust placed in her has nothing to do with any motivation she may possess. Calculations are what machines are usually very good at, so on the rational account we could say that AI is trustworthy and that we can therefore trust it. However, Ryan is very skeptical that this account captures what we usually mean by trust; he thinks it describes merely a form of reliance we can have on AI. This is because rational trust exhibits a total “lack of concern about the trustee’s motivation for action”, and the presence of such motivations is essential for having trust rather than mere reliance. To explain the difference between trust and reliance, Ryan provides a few examples, such as the ‘sexist employer’ originally presented in a paper by Potter:
“There is a sexist employer who treats his female staff well because he fears legal sanctions if he does not. Because he has not done anything inappropriate to his current female employees, they may consider him reliable, but not trustworthy. ‘The female employees might know that their employer treats them well only because he fears social sanctioning. In that case, he could not betray them [because they did not place any trust in him to begin with], although he could disappoint them. However, the rational account of trust would state that the female employees can trust the sexist boss because this type of trust only focuses on the trustee’s past behaviour to predict whether they should be trusted.”
Ryan argues that AI deserves similar treatment: we can rely on it to do the job, but it lacks all the features moral agents need in order to be considered trustworthy. So, what are those features?
According to Ryan, the full definition of trust (A trusts B) is:
- A has confidence in B to do X.
- A believes B is competent to do X.
- A is vulnerable to the actions of B.
- If B does not do X then A may feel betrayed.
- A thinks that B will do X, motivated by one of the following reasons:
  - their motivation does not matter (rational trust);
  - B’s actions are based on goodwill towards A (affective trust);
  - B has a normative commitment to the relationship with A (normative trust).
The affective and normative accounts differ from the rational account because “they state that betrayal can be distinguished from mere disappointment by the allocation of the intent of the trustee”. So, in order to have “real” trust, the trustee has to possess motivation(s) for action; the rational account can do without any motivation. Why can’t we talk about motivations when it comes to AI? The idea behind the rational account is that trust reduces to reliability, which rests only on predictions from past performance. However, there are many situations in which our decisions about trust cannot be made by looking at reliability alone.
For example, suppose we want to establish a peace treaty with an enemy that has fought against us up to this moment. On the rational account, they should not be trusted, because they are clearly unreliable. However, that would rule out any possible assignment of trust, and therefore any chance of peace between us. Of course, it is important to collect and analyze past data when informing our choices about trust. But, as Ryan points out, “Trust is separate from risk analysis that is solely based on predictions based on past behaviour […] While reliability and past experience may be used to develop, confer, or reject trust placed in the trustee, it is not the sole or defining characteristic of trust. Though we may trust people that we rely on, it is not presupposed that we do”.
This is because in trust we form expectations that entail the presence of emotive and motivational states, together with psychological attitudes. This is what the affective and normative accounts of trust describe. The core principles of these two accounts are motivational states that, according to the author, are uniquely human. Or, better: “AI may be programmed to have motivational states, but it does not have the capacity to consciously feel emotional dispositions, such as satisfaction or suffering, resulting from caring, which is an essential component of affective trust.” This makes AI incapable of satisfying the three components of affective trust, namely:
- the trustee is favourably moved by the trust placed in them;
- the trustee has the trustor’s interests at heart;
- and the trustee is motivated out of a sense of goodwill to the trustor.
Moral responsibility is at the center of the normative and affective accounts, leaving no hope for AI to be regarded as trustworthy. In Ryan’s view, AI is like any ordinary artifact in that it is not a recipient of moral responsibility, which instead falls on its developers and users. In fact, according to Ryan, even if “AI has a greater level of autonomy than other artefacts, [this] does not constitute an obfuscation of responsibility on those designing, deploying, and using them”. Nor does this change if we think of AI as part of complex multi-agent systems. Just as, at the level of a corporation’s complexity, we attach the concept of trust to the corporation itself and not, for example, to a single employee, AI remains unable to be a trustworthy agent even when it is understood as part of complex, multi-agent systems.
Between the lines
As a human artifact, AI is still in its “infancy” and continues to develop. One of the greatest challenges of AI development is how to embed some sort of consciousness in it. Even assuming it will eventually be possible to build a “conscious AI”, that would not necessarily make it a moral agent. However, it could reposition AI with respect to the three accounts of trust that Ryan uses. In this respect, Ryan’s conclusion could be read not as a definitive claim about AI and trust but as a stimulus to reach the level of affective and normative trust that AI currently lacks. Accordingly, we can give a more positive reading of the relation between AI and trust by claiming that the answer to the question of whether we can trust an AI is a more flexible and open “not yet”.