🔬 Research Summary by Azadeh Dinparastdjadid, a senior research scientist at the Safety Research and Best Practices team at Waymo, where she explores road user behavior modeling to establish behavioral benchmarks.
[Original paper by Azadeh Dinparastdjadid, Isaac Supeene, and Johan Engstrom]
Overview: Surprise is a pervasive phenomenon that plays a key role across a wide range of human behavior, but the quantitative measurement of how and when we experience surprise has mostly remained limited to laboratory studies. In this paper, we demonstrate, for the first time, how computational models of surprise rooted in cognitive science and neuroscience combined with state-of-the-art machine-learned generative models can be used to detect surprising human behavior in complex, dynamic environments like road traffic.
Introduction
We have all experienced the sensation of surprise on the road, for example when a neighboring car suddenly cuts into our lane or when a lead vehicle brakes hard at highway speed.
The key role of expectations and expectation violations (i.e., surprise) has long been acknowledged in the traffic safety literature. Still, there is no precise quantitative definition or computational model of surprise in the domain of road traffic. The formal ISO definitions of a crash and a near crash (ISO/TR 21974-1:2018) require that a true crash or near-crash be “not premeditated” or, in simple terms, unexpected and hence surprising. Predictability has also been proposed as a key principle of good autonomous vehicle (AV) driving behavior (De Freitas et al., 2021). In this paper, we set out to (i) demonstrate novel approaches for quantifying surprising road user behavior based on behavior predictions obtained from state-of-the-art machine-learned generative models and (ii) illustrate for the first time, to the best of our knowledge, how surprising human behavior can be objectively detected in complex, dynamic environments like road traffic using real-world driving events collected by Waymo vehicles.
Key Insights
Generative Models
Surprise can be generally conceptualized as a violation of an agent’s subjective belief about the state of the world. An agent’s belief distributions come from their generative model, which, in simple terms, is the agent’s internal representation of the world that generates its expectations of incoming sensory signals (Friston and Price, 2001; Bruineberg et al., 2018). In cognitive science, generative models are typically defined analytically, for example as a Partially Observable Markov Decision Process (POMDP) for discrete-time problems or as stochastic differential equations for continuous-time problems (Parr et al., 2022, Chapter 4). To scale to complex real-world problems like road traffic, machine-learned function approximators like neural networks can be used as generative models (Tschantz et al., 2020). Our generative model is an evolution of the MultiPath model (Chai et al., 2019) using the Wayformer encoder (Nayakanti et al., 2022). Put simply, our model learns to predict probability distributions over future road user positions by observing real-world traffic situations.
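To make this concrete, the sketch below hand-specifies a toy stand-in for such a generative model: a one-dimensional Gaussian mixture over a lead vehicle’s future position. The real MultiPath/Wayformer model is learned and far richer; the means, spreads, and weights here are invented purely for illustration.

```python
import numpy as np

# Toy stand-in for a learned generative model: a Gaussian mixture over an
# agent's future longitudinal position (metres ahead, 1 s from now).
# Models like MultiPath predict full trajectory distributions; this
# hand-specified version only illustrates the "belief distribution" idea.
def predicted_density(x, means, sigmas, weights):
    """Mixture-of-Gaussians density p(x) over future position x."""
    comps = [w * np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
             for m, s, w in zip(means, sigmas, weights)]
    return sum(comps)

# Two hypotheses about a lead car 1 s from now:
# it keeps speed (~20 m ahead, likely) or it brakes (~12 m ahead, unlikely).
means, sigmas, weights = [20.0, 12.0], [1.5, 2.0], [0.9, 0.1]

# Density the model assigns to an observed future position of 11 m:
# low, because that position was mostly covered by the unlikely hypothesis.
p_obs = predicted_density(11.0, means, sigmas, weights)
```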
Residual Information & Antithesis Surprise
In this paper, we propose two new surprise metrics, Residual Information and Antithesis surprise, which respectively belong to the probabilistic mismatch and belief mismatch surprise categories proposed by Modirshanechi et al. (2022). Probabilistic mismatch surprise metrics compare an observed state to a prior belief distribution obtained from the generative model, while belief mismatch surprise compares two belief distributions.
Residual Information solves several practical problems we’ve encountered when applying common existing surprise measures to the road traffic domain. For example, many constructs in information theory, including surprisal (i.e., Shannon surprise), assume discrete/categorical probability distributions (Marsh, 2013), whereas we consider continuous distributions over agents’ future positions. Residual Information, in contrast, applies to any distribution, discrete or continuous, and explicitly considers the information acquisition process over time.
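To see one reason the naive continuous analogue of Shannon surprise is problematic: a probability density, unlike a probability, can exceed 1, so the differential surprisal −log p(x) can go negative. The sketch below only illustrates this pitfall; it is not the paper’s Residual Information metric.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Differential ("continuous") surprisal: -log p(x). Because a density can
# exceed 1, this quantity can be negative, which has no sensible
# interpretation as "bits of surprise" -- one of the practical issues a
# measure defined for continuous distributions must handle.
def surprisal(x, mu, sigma):
    return -np.log(gaussian_pdf(x, mu, sigma))

broad = surprisal(0.0, mu=0.0, sigma=1.0)    # ~0.92 nats: broad prediction
tight = surprisal(0.0, mu=0.0, sigma=0.05)   # negative: density exceeds 1
```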
Probabilistic mismatch surprise measures detect any observation which was unlikely under our prior beliefs, even if this observation has no bearing on our subsequent beliefs. In contrast, belief mismatch surprise measures specifically detect consequential information with the power to change our beliefs. This allows us to measure changes in our predictions about future outcomes, which has the advantage of implicitly considering higher time derivatives of the predicted quantity. For example, a sudden but significant deceleration will cause a large change in the predicted future position, even if it has not yet significantly affected the vehicle’s current position. The same applies to heading or tire angle changes, allowing us to identify certain surprising actions earlier.
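A minimal belief-mismatch sketch: if the predictive beliefs before and after an observation are approximated as one-dimensional Gaussians (an assumption made here purely for illustration), their KL divergence has a closed form and behaves as described, large for a hard brake that shifts the belief, small for an observation that barely moves it.

```python
import numpy as np

# Belief-mismatch surprise compares the belief held before an observation
# with the belief held after it. For 1-D Gaussian beliefs the KL
# divergence has a closed form (in nats).
def kl_gaussian(mu_p, s_p, mu_q, s_q):
    """KL(p || q) for 1-D Gaussians N(mu_p, s_p^2) and N(mu_q, s_q^2)."""
    return np.log(s_q / s_p) + (s_p**2 + (mu_p - mu_q)**2) / (2 * s_q**2) - 0.5

# Prior belief about the lead car's position 2 s ahead, vs. the updated
# belief after observing a hard brake: a large belief shift -> large surprise.
prior = (40.0, 3.0)                 # mean (m), std (m) -- illustrative values
post_brake = (25.0, 2.0)
hard_brake_surprise = kl_gaussian(*post_brake, *prior)

# The same comparison when the observation barely moves the belief.
post_mild = (39.0, 3.0)
mild_surprise = kl_gaussian(*post_mild, *prior)
```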
On the other hand, consider a vehicle driving down the highway with its turn signal on. Are they about to change lanes? Did the driver forget to turn off their signal? Both outcomes are within expectations, so evidence for either of these hypotheses is unsurprising. But if a vehicle suddenly slams its brakes to avoid a previously unseen pedestrian, it may surprise the following vehicle’s driver quite profoundly. To capture both of these intuitions, we designed Antithesis surprise to detect the increased likelihood of a previously unexpected outcome and to silence “unsurprising” information gain.
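That intuition can be illustrated with a toy discrete version of our own devising (the paper’s actual Antithesis definition differs): score only the probability mass that moves onto outcomes that were previously dismissed, and ignore information gain among already-plausible outcomes. The `unexpected_below` cutoff is an invented placeholder.

```python
import numpy as np

# Toy illustration of the *idea* behind Antithesis surprise, not the
# paper's definition: reward probability mass moving onto outcomes that
# were previously unexpected, and silence information gain that merely
# shuffles mass among outcomes that were already plausible.
def antithesis_like(prior, post, unexpected_below=0.05):
    prior, post = np.asarray(prior), np.asarray(post)
    gain = post * np.log(post / prior)       # per-outcome KL contributions
    unexpected = prior < unexpected_below    # outcomes we had dismissed
    return float(np.sum(gain[unexpected & (post > prior)]))

# Turn-signal case: "change lane" and "stay" were both plausible a priori,
# so evidence for either yields zero antithesis-like surprise.
signal_case = antithesis_like([0.5, 0.5], [0.9, 0.1])

# Hard-brake case: an outcome with a 1% prior jumps to an 80% posterior.
brake_case = antithesis_like([0.01, 0.99], [0.8, 0.2])
```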
Applications
Our surprise measures have various applications in traffic safety. Here we discuss three main areas: traffic conflict definition, response timing modeling, and driving behavior evaluation.
Traffic Conflict Definition: By combining our surprise metrics with spatiotemporal proximity metrics, we can more accurately identify conflicts.

Road User Response Timing Modeling: Measuring and modeling road user response timing in naturalistic traffic conflicts is challenging, particularly because there is often no clear-cut stimulus onset to “start the clock” for a response time measurement. With surprise, response timing in naturalistic scenarios can instead be modeled as belief updating in the face of surprising evidence (see Engström et al., 2022).

Driving Behavior Evaluation: Our surprise models can also be used more broadly to evaluate the quality of driving behavior for both human and autonomous drivers. They offer a way to precisely operationalize road user predictability into driving behavior metrics that can be applied both offline during AV development and as part of the onboard automated driving system itself [1].
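As a sketch of the response-timing idea, loosely in the spirit of Engström et al. (2022): a response can be modeled as a leaky accumulator of the surprise signal crossing a threshold, so the “clock” starts implicitly when surprise begins to build. The accumulator form, gain, leak, and threshold below are invented placeholders, not the paper’s model.

```python
# Minimal evidence-accumulation sketch of surprise-based response timing:
# surprise is integrated over time (with leak), and a response is emitted
# when the accumulator crosses a threshold. All parameters are illustrative.
def response_time(surprise_signal, dt=0.1, gain=1.0, leak=0.3, threshold=2.0):
    acc = 0.0
    for i, s in enumerate(surprise_signal):
        acc += dt * (gain * s - leak * acc)   # leaky integration of surprise
        if acc >= threshold:
            return i * dt                     # threshold-crossing time (s)
    return None                               # no response within the window

# Surprise stays near zero until a lead-vehicle hard brake at t = 1.0 s,
# after which the accumulator climbs and triggers a response shortly after.
signal = [0.05] * 10 + [4.0] * 30
rt = response_time(signal)
```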
Between the lines
Avoiding surprise can be seen as generally important for the safety and alignment of AI with human users. While this paper focused on the context of road traffic and AVs, our framework is generalizable to any domain where a generative model can be trained. The generative model used in this paper made predictions at lower levels of abstraction (e.g., position), but our methods can be generalized to more abstract states (e.g., pass/yield).
We have here conceptualized surprise specifically as a violation of expectations about an external state (e.g., another road user’s behavior). The active inference framework, however, suggests a more general notion of surprise, based on two ideas: (i) the generative model predicts not only external events but also one’s own control actions and their consequences, and (ii) these predictions represent the agent’s preferred observations or states, so that the agent’s behavior results from it seeking to maximize the evidence for its generative model, which is equivalent to minimizing surprise (Parr et al., 2022). While our models focus on surprise related to external events, recent work such as Wei et al. (2022, 2023a, 2023b) has started exploring this more general notion of surprise, opening up interesting new paths for future road user behavior modeling.
References
[1] These and other techniques may be described in, e.g., U.S. Patent No. 11,447,142; U.S. Patent App. No. 17/946,973; U.S. Patent App. No. 17/399,418; U.S. Patent App. No. 63/397,771; U.S. Patent App. No. 63/433,717; and U.S. Patent App. No. 63/460,815.
Bruineberg, J., Rietveld, E., Parr, T., van Maanen, L., Friston, K.J., 2018. Free-energy minimization in joint agent-environment systems: A niche construction perspective. Journal of theoretical biology 455, 161-178.
Chai, Y., Sapp, B., Bansal, M., Anguelov, D., 2019. Multipath: Multiple probabilistic anchor trajectory hypotheses for behavior prediction. arXiv preprint arXiv:1910.05449 .
De Freitas, J., Censi, A., Walker Smith, B., Di Lillo, L., Anthony, S.E., Frazzoli, E., 2021. From driverless dilemmas to more practical commonsense tests for automated vehicles. Proceedings of the National Academy of Sciences 118, e2010202118.
Engström, J., Liu, S.Y., Dinparastdjadid, A., Simoiu, C., 2022. Modeling road user response timing in naturalistic settings: a surprise-based framework. arXiv preprint arXiv:2208.08651.
Friston, K.J., Price, C.J., 2001. Dynamic representations and generative models of brain function. Brain Research Bulletin 54, 275–285.
ISO/TR 21974-1:2018, 2018. Naturalistic driving studies – Vocabulary – Part 1. International Organization for Standardization.
Marsh, C., 2013. Introduction to continuous entropy. Department of Computer Science, Princeton University 1034.
Modirshanechi, A., Brea, J., Gerstner, W., 2022. A taxonomy of surprise definitions. Journal of Mathematical Psychology 110, 102712.
Parr, T., Pezzulo, G., Friston, K.J., 2022. Active inference: the free energy principle in mind, brain, and behavior. MIT Press.
Tschantz, A., Baltieri, M., Seth, A.K., Buckley, C.L., 2020. Scaling active inference, in: 2020 International Joint Conference on Neural Networks (IJCNN), IEEE. pp. 1–8.
Wei, R., McDonald, A.D., Garcia, A., Alambeigi, H., 2022. Modeling driver responses to automation failures with active inference. IEEE Transactions on Intelligent Transportation Systems 23, 18064–18075.
Wei, R., McDonald, A.D., Garcia, A., Markkula, G., Engstrom, J., O’Kelly, M., 2023b. An active inference model of car following: Advantages and applications. arXiv preprint arXiv:2303.15201.