Top-level summary: Intelligibility is a notion that many in the technical community work on as they seek to shed light on the inner workings of systems that are becoming increasingly complex. Especially in domains such as medicine, warfare, credit allocation, and judicial systems, where these systems can affect human lives in significant ways, we seek explanations that illuminate how a system works and help address potential issues of bias and fairness.
However, there is a large gap in the current approach: not enough is being done to meet the needs of a diverse set of stakeholders, each of whom requires a kind of intelligibility that is understandable to them and helps them meet their needs and goals. One might argue that a deeply technical explanation ought to suffice, with other kinds of explanations derived from it, but this leaves explanations inaccessible to those who cannot parse the technical details, often the people most affected by such systems. This paper by Yishan Zhou and David Danks offers a framework for situating different kinds of explanations so that they meet stakeholders where they are, helping them meet their needs and ultimately engendering a higher level of trust by better highlighting both the capabilities and the limitations of the systems.
With the advent of increasingly complex AI-enabled systems used in production, it is no surprise that in critical scenarios, where there is potential for significant impact on human lives, people demand that the systems be intelligible. In this paper, the authors use intelligibility to cover the cluster of understandability, interpretability, and explainability requirements, without taking a position on one definition over another; the focus is instead on a taxonomy of intelligibility as it applies to the different stakeholders who might be involved with a system. Most literature in this space frames a tradeoff between a model's predictive accuracy and its intelligibility, and this largely holds true: decision trees, for example, lend themselves to human understanding but can lose predictive power compared to more complex deep learning models, which are less scrutable. Yet neither accuracy nor intelligibility is necessarily valuable in and of itself; each holds value because it serves as a means to an end for human purposes. The tradeoff is easy to see in practice, as in the sketch below.
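To make that tradeoff concrete, here is a minimal sketch, not taken from the paper, assuming scikit-learn and one of its bundled datasets purely for illustration: a shallow decision tree whose rules can be printed and read directly, against a boosted ensemble that is often more accurate but offers no comparably compact human-readable summary.

```python
# Illustrative sketch of the accuracy/intelligibility tradeoff.
# scikit-learn and the breast-cancer dataset are assumptions made for the
# example; they do not come from the original paper.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A shallow decision tree: its decision rules can be printed and read by a human.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print(export_text(tree, feature_names=list(X.columns)))
print("tree accuracy:", tree.score(X_test, y_test))

# A boosted ensemble: typically more accurate, but its many trees no longer
# admit a compact, readable summary of how predictions are made.
ensemble = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print("ensemble accuracy:", ensemble.score(X_test, y_test))
```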
The notion of intelligibility is therefore rooted in a socio-technical context, and people can be roughly categorized as engineers, users, and affectees based on their interactions with the system, keeping in mind that these groups are not static and can overlap.
Engineers are typically part of the design and development teams that build AI systems; they demand an understanding of the inner workings of the system insofar as it serves their development needs, but not to satisfy other needs unless they also fall into the groups of users and/or affectees. Users, on the other hand, do not care much about how the system operates; they are more concerned with its capabilities and limitations so that they can meet their own needs, usually relating to their job functions. Affectees are those who are impacted by the performance of the system and are quite diverse in their composition. One might argue that they have little need for intelligibility, yet there are many legal and moral requirements that make it necessary.
This categorization is cast on the basis of what people require from the technology rather than their social role, and is hence more flexible in accommodating how people move across different functions and contexts. Additionally, intelligibility is viewed as a goal that has value insofar as it serves a person's needs, rather than holding intrinsic value; in some cases it might even frustrate those needs, for example when all the person seeks is high accuracy.
Starting with affectees: since they have no meaningful control over the design and development of the system, nor the ability to alter the socio-technical ecosystem surrounding it, at least not without significant effort, they are most concerned with "difference-making" intelligibility that gives them insight into how changing the inputs they supply can alter the outputs. They may not be able to alter every input going into the system; in a facial recognition system, for example, they cannot change the sensor state, but they can change parts of their appearance (such as preventing their hair from occluding their eyes), which might affect recognition rates. Given these requirements, intelligibility outputs should be framed in a vernacular that avoids technical jargon and other specifics, and can use examples and explanations that approximate the actual workings, especially since this group is diverse and might not have the background knowledge needed to parse why the system operates the way it does. Such framing might also help with larger conversations around bias and fairness concerns in these systems. A counterfactual-style sketch of this kind of explanation follows.
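As a rough illustration of what a difference-making explanation can look like, here is a minimal sketch, not drawn from the paper, that brute-forces single-feature counterfactuals for a hypothetical credit-style model; the feature names, the synthetic data, and the search procedure are all illustrative assumptions.

```python
# Sketch of "difference-making" intelligibility: for one applicant, report
# which single-feature change would flip the model's decision.
# Everything here (features, data, model) is hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
feature_names = ["income", "debt", "years_employed"]  # hypothetical features
X = rng.normal(size=(500, 3))
y = (X[:, 0] - X[:, 1] + 0.5 * X[:, 2] > 0).astype(int)  # synthetic "approve" rule
model = LogisticRegression().fit(X, y)

applicant = np.array([-0.2, 0.8, 1.0])
original = model.predict(applicant.reshape(1, -1))[0]

# Perturb one feature at a time, trying the smallest changes first, and report
# the first change that flips the outcome -- an explanation phrased as
# "here is what you could change".
for i, name in enumerate(feature_names):
    for delta in sorted(np.linspace(-2, 2, 81), key=abs):
        candidate = applicant.copy()
        candidate[i] += delta
        if model.predict(candidate.reshape(1, -1))[0] != original:
            print(f"Changing {name} by {delta:+.2f} would flip the decision.")
            break
```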
Users, who might be looking for richer intelligibility, are better served by a function-based approach that documents and explains the context within which the system should be used and what that implies for its capabilities and limitations. This can typically be done by drawing on developer knowledge and technical design documents, combining and recasting both into a vernacular that makes the information more accessible and useful. Since users are usually known in advance, this requires minimal overhead once the other pieces are in place. One possible way to structure such documentation is sketched below.
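Here is a hedged sketch of what function-based documentation could look like as a structured record assembled from developer knowledge and design documents; the field names are illustrative assumptions, not an established schema from the paper.

```python
# A "model card"-style record describing what the system is for, what it can
# do, and where it breaks down -- aimed at users rather than engineers.
# All field names and example values are hypothetical.
from dataclasses import dataclass, field

@dataclass
class FunctionCard:
    system_name: str
    intended_context: str                       # where the system is meant to be used
    capabilities: list = field(default_factory=list)
    limitations: list = field(default_factory=list)
    out_of_scope_uses: list = field(default_factory=list)

card = FunctionCard(
    system_name="Loan triage assistant",
    intended_context="Pre-screening consumer loan applications for human review",
    capabilities=["Ranks applications by estimated repayment likelihood"],
    limitations=["Not validated for small-business loans",
                 "Accuracy degrades for applicants with thin credit histories"],
    out_of_scope_uses=["Final approval decisions without human review"],
)
print(card)
```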
When the goal is to improve a system or adapt it to novel use cases, which typically falls into the realm of engineers as defined here, we seek causal-process explanations that shine a light on the internal functioning of the system. These are highly technical in nature and require a deeper understanding on the part of engineers so that they can use the information to adjust the system to meet their needs. When the community talks about intelligibility, it largely means this kind. It faces significant challenges today: available techniques have not made much headway toward genuinely causal explanations, for two reasons. First, increasingly complex deep learning systems are inscrutable; second, even when individual systems are intelligible, combining them into a product or service adds a degree of complexity that makes the whole hard to reason about. In practice, engineers often fall back on proxies such as feature-importance analyses, as in the sketch below.
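Since genuinely causal-process explanations remain out of reach, one common stand-in is feature-importance analysis. The sketch below uses scikit-learn's permutation importance on an illustrative dataset; this is an assumption about typical practice, not something the paper prescribes, and it probes rather than exposes the internal mechanism, which is consistent with the limitations noted above.

```python
# Permutation importance as a proxy for causal-process insight: it measures
# how much shuffling each input degrades a trained model's performance,
# approximating (not revealing) the internal mechanism.
# Dataset and model choices are illustrative.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name:>8s}: {score:.3f}")
```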
Among the future research directions and limitations of the framework, the authors note that there is a diversity of goals beyond the proposed taxonomy, though they suggest such cases may not require kinds of intelligibility beyond those already identified. Additionally, people can take on multiple roles over the lifetime of a product or service, but these roles would rarely overlap during a specific time period, so the conflicts that might arise from the different kinds of explanations required would be separated in time.
One of the big open questions is how to evaluate the quality of intelligibility and whether it meets the needs of the people it is meant to serve; there are not yet any good metrics or methods for this. Evaluations should also accommodate the fact that some things become intelligible only through repeated experience, and we need to capture that in how we measure the effectiveness of intelligibility. They must also be sensitive to the diversity of goals and reflect that diversity in the evaluation methods, so that both the conditions for intelligibility and its effectiveness are captured well. A final point on potential drawbacks: falsehoods can sometimes make explanations more effective, as when details are abstracted away to the point that the actual workings are distorted, which can lead to skewed interpretations and expectations about the capabilities and limitations of the system.
Still, such efforts provide a great starting point for those looking to do interdisciplinary work in this space and foster a more holistic understanding of the issues around intelligibility. Taxonomies like this one accurately position the tensions involved and allow other scholars and practitioners to build on the work, enhancing the quality of the explanations that accompany complex machine learning systems and evoking higher degrees of trust from all the stakeholders involved.
Original piece by Yishan Zhou and David Danks: https://dl.acm.org/doi/abs/10.1145/3375627.3375810