Summary contributed by Abhishek Gupta (@atg_abhishek), founder of the Montreal AI Ethics Institute.
*Authors of full paper & link at the bottom
Mini-summary: The paper examines the aviation industry as a case study for the adoption of AI safety standards, arguing that a measured approach of first applying automation in non-critical contexts like predictive maintenance, route planning, and decision support is a better way to develop standards and practices that can later be extended to safety-critical scenarios like flight control. It also stresses the importance of cross-jurisdictional coherence in this domain, since mutual acceptance of flight safety standards is a cornerstone for trade and the movement of capital and labor across borders.
The paper also applies the paradigmatic framing of different kinds of races and notes that, based on preliminary findings, there appears to be a gentle race to the top in safety standards, given this domain’s strong emphasis on operational safety over everything else. Finally, it makes several recommendations on the roles regulators can play, how industry players might interact with each other, and the incentives for the research community to contribute, providing a roadmap for potentially applying these lessons to other domains where safety is of the essence.
The paper takes a critical look at how automation is being deployed in practice, focusing on aviation, which is known for its excellent safety record and the highly stringent regulations that uphold it. While many arguments might be made that the current framing of AI development across different regions constitutes a race in which ethics and safety might fall by the wayside, the paper arrives at a different conclusion based on interviews with practitioners in the field and an in-depth analysis of the current state of automation deployment in aviation. Given the nascent stage of development of technical safety measures in AI, automation is being used in aviation in less safety-critical areas like predictive maintenance, decision support, and route planning rather than being handed full control of the flight.
Something unique about aviation is that firms invest unprompted in mechanisms like Testing, Evaluation, Validation, and Verification (TEVV), in contrast to research labs and firms in other industries, especially given how open the aviation industry has been to firms disclosing their mistakes and helping the entire ecosystem move toward higher levels of safety.
Preliminary findings in the paper show that military avionics is even more conservative in adopting automation than commercial avionics, allaying some of the concerns around warfare automation. The paper applies the “race” paradigm, given that it is a popular trope, to analyze current trends in the space and where things might be heading. For example, self-regulation is often touted as a way to prevent a race to the bottom in areas like facial recognition technology, with firms like Microsoft advocating for both industry and government action to help the ecosystem reach greater levels of ethics and responsibility. Lower levels of regulation are potentially attractive to firms since they offer an easy way to cut compliance costs and gain competitive advantages, especially in domains where profit margins are thin.
The “Delaware effect” epitomizes this dynamic: firms choose to incorporate in Delaware because of the favorable conditions the state offers. In contrast, under the “California effect” or the related “Brussels effect,” firms encourage the state to impose higher levels of regulation. When we talk about safety-critical systems, we mean systems where errors can be very costly, especially in terms of human lives. This distinction is important because it allows a more nuanced analysis of where AI might be deployed first to “iron out the kinks.” Compared to other industries with high safety requirements, like nuclear power, aviation is far more subject to market forces and lacks natural monopolies as strong as those in nuclear power. Secondly, direct exposure to end users also distinguishes it from nuclear power, making it a good test bed for observing the impacts of automation and their implications for the future. One of the driving forces for automation adoption in aviation is the industry’s volatility in the face of supply and demand, including fuel prices, along with razor-thin margins due to a high degree of competition. Together with regulators’ requirement for very high standards of safety, this creates a push toward the consistency that automation can offer over human error.
An additional factor supporting the potential adoption of widespread automation is the large amount of structured data collected by air traffic controllers (ATCs). Yet, because of a lack of consistent standards for what constitutes an adequate safety demonstration, there is some fragmentation in the understanding of what needs to be done. A point highlighted in the paper is the need for regulators across different countries and jurisdictions to develop a shared understanding and shared standards so that safety can be evaluated in a comparable manner without disrupting the important flows of capital and labor across borders. Beginning this exercise in non-critical areas provides a good test bed for ironing out the process, so that when automation is applied to more safety-critical functions, there is a well-established process for enabling a smooth transition. Reciprocity in agreements around safety standards is essential.
The 737 MAX crashes demonstrated how large the backlash from errors in this domain can be: firms pay massive fines, lose future contracts, and suffer long-term reputational damage. Given the high degree of public scrutiny, one can only imagine how chilling the effects of premature and untested automation in safety-critical areas of aviation would be. Firms are also hesitant to deploy this technology commercially until standards and regulations are put in place. Some systems have been tested for making landings in low-visibility conditions, but these remain quite limited, especially since there is no established process for evaluating a system’s performance across varied conditions. Aviation is also noteworthy for its emphasis on discouraging competition on safety standards, since any misstep decreases traveler confidence and negatively impacts the entire industry. An Airbus ad that implied its planes were safer than Boeing’s faced such backlash from airlines that it was retracted. Since failures in aviation are catastrophically expensive, both in human lives and financially, there is very little room to experiment toward “failsafe” configurations, which makes safety-critical testing of automation all the more crucial but also difficult to achieve. Another reason put forward for the slow deployment of automation is the long and arduous certification process, in which returns on investment take a while to materialize, potentially imposing financial hardship on firms that are already pressed for cash flow.
Some of the recommendations made in the paper are as follows: policymakers should invest more in TEVV-style mechanisms for AI systems, and regulators should collaborate on creating AI safety standards and encourage information sharing on system failures. Firms willing to pay this upfront cost will reap the benefits of safety compliance, allowing them to transition to using AI in safety-critical scenarios and gaining a competitive advantage. From the analysis in the report, there is little evidence that competitive pressures will overwhelm the safety imperatives for automation adoption in the aviation industry.
Original paper by Will Hunt: https://cltc.berkeley.edu/wp-content/uploads/2020/08/Flight-to-Safety-Critical-AI.pdf