🔬 Research Summary by Manish Raghavan, a postdoctoral fellow at the Harvard Center for Research on Computation and Society and an incoming assistant professor at MIT’s Sloan School of Management and Department of Electrical Engineering and Computer Science.
[Original paper by Jon Kleinberg, Sendhil Mullainathan, Manish Raghavan]
Overview: Engagement optimization is a foundational driving force behind online platforms: to understand what users want, platforms look at what they do and choose content for them based on their past behavior. But both research and personal experience tell us that what we do is not always a reflection of what we want; we can behave myopically and browse mindlessly, behaviors that are all too familiar online. In this paper, we develop a model to investigate the consequences of engagement optimization when users’ behaviors do not align with their underlying values.
Introduction
There is a pervasive sense that online platforms are failing to provide a genuinely satisfying experience to users. A natural explanation is that platforms are optimizing something other than user happiness, such as clicks or ad revenue. We argue here that there is a deeper problem: a mismatch between our intuitive understanding of people and our formal models of users. Platforms typically make a kind of revealed preference assumption: what users do reveals what they want. But in practice, and especially on online platforms, this assumption is false. We behave in ways we regret in hindsight.
In this paper, we develop a user model that takes into account the fact that we often have inconsistent preferences: our myopic, impulsive self may make decisions now that our thoughtful, reflective self later regrets. Our model shows how engagement optimization that fails to account for these inconsistencies can lead to worse experiences for users. Importantly, this depends on the type of content involved: engagement optimization can have different impacts for celebrity news videos compared to science videos. Our intuition might lead us to believe that engagement optimization is akin to producing more and more addictive “junk food,” leading to unhealthy overconsumption. But our model suggests more nuance: while some types of content behave like junk food, others may behave like healthy salads, and teasing apart the difference is key to understanding what users want.
Key Insights
Goal
Our aim in this work is to model how a user with inconsistent preferences interacts with a platform’s engagement optimization. Drawing on “multiple selves” theories from the behavioral literature, we consider a user with two potentially conflicting selves: one is myopic and impulsive (which we denote system 1), and the other is reflective (system 2). While system 1 often drives consumption on online platforms, system 2 reflects the user’s long-term values.
Model
We present a stylized model in which a user consumes a sequence of content until they decide to leave. In order to model diminishing returns from content consumption, we assume system 2 only derives value from remaining on the platform until a random time T, after which it no longer does. We call T the content’s span: how long it remains appealing to system 2. A related quantity is the content’s value: the enjoyment system 2 gets out of each piece of content. We also assume the user has some outside option from which they could derive value if they left the platform.
A user with consistent preferences would leave the platform as soon as system 2 stops deriving value from it. In our model, however, the agent remains on the platform as long as system 1 finds the current content appealing; we call content’s appeal to system 1 its moreishness. The agent thus over-consumes relative to their true preferences, and a platform observing this behavior may mistake overconsumption for genuine enjoyment of the content.
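The dynamics above can be sketched as a small simulation. The parametrization below (a geometric span and a geometric “moreish tail,” with hypothetical parameter names) is an illustrative assumption for this summary, not the paper’s exact formalism:

```python
import random

def simulate_session(value, span_p, moreishness, steps_cap=10_000, rng=None):
    """One session of the two-selves user model (illustrative sketch,
    assuming a geometric span; not the paper's exact formalism).

    value       -- utility system 2 derives per item while still interested
    span_p      -- per-step chance system 2 loses interest (span ~ 1/span_p)
    moreishness -- per-step chance system 1 keeps consuming after that point
    """
    rng = rng or random.Random(0)
    engagement, utility = 0, 0.0
    interested = True  # is system 2 still deriving value?
    while engagement < steps_cap:
        engagement += 1
        if interested:
            utility += value
            if rng.random() < span_p:
                interested = False  # the span T has elapsed
        # once the span elapses, only system 1's impulse keeps the user here
        if not interested and rng.random() >= moreishness:
            break
    return engagement, utility
```

With moreishness near 1, engagement keeps climbing after the span elapses while utility stays flat, which is exactly the overconsumption a platform can mistake for enjoyment.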
Content manifolds and optimization
The user’s behavior thus depends on the content’s value (relative to their outside option), span, and moreishness. These parameters also determine the utility that the user derives from the platform. Crucially, behavior and utility are not necessarily aligned in our model: some content may lead to high consumption but low utility, and some content may produce low consumption and high utility.
We can think of the content available to a platform as lying on some manifold in the (value, span, moreishness) space. As a platform optimizes for engagement, it implicitly optimizes over this manifold, choosing the point with the highest engagement. But because engagement and utility can be misaligned, this may not be the content that the user would be happiest with. Importantly, this misalignment isn’t the same for all types of content. We might intuitively expect engagement optimization to have different consequences for educational videos relative to viral videos.
We can make this intuition more formal. In particular, we show two distinct reasons for the misalignment between engagement and utility:
- Content has high moreishness
- High-value content has low span, and low-value content has high span
In the first case, high moreishness clearly contributes to this misalignment because users mindlessly consume without deriving any value from the platform. The second case is slightly more subtle, and can be illustrated by the example of a user searching for a helpful tutorial video. If the user can quickly find a tutorial that clearly explains the ideas in question, then they have achieved high utility with very low engagement. If, on the other hand, that concise video were replaced with a series of more confusing videos that required the user to spend more time to understand them, the user’s engagement would seem to increase even though their overall utility decreased.
Understanding the reasons for this misalignment allows us to formally describe when engagement-optimization leads to reasonably good outcomes. We show that under certain conditions on the content manifold, optimizing engagement leads to near-optimal user utility. Intuitively, this suggests that there are types of content for which engagement-optimization is relatively “safe,” and there are types of content for which it leads to unhealthy outcomes.
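A toy numeric sketch shows how optimizing engagement over a content manifold can pick the “wrong” point. The three content types, their numbers, and the closed-form expectations (geometric span plus a geometric moreish tail) are all illustrative assumptions for this summary, not figures from the paper:

```python
# A toy "manifold": three candidate content types, each a
# (value, expected span, moreishness) point. All numbers hypothetical.
CONTENT = {
    "science video":  {"value": 3.0, "span": 2.0, "moreishness": 0.1},
    "tutorial":       {"value": 5.0, "span": 1.0, "moreishness": 0.0},
    "celebrity feed": {"value": 0.5, "span": 1.0, "moreishness": 0.9},
}

def expected_engagement(c):
    # items consumed: the span, plus moreish steps after interest fades
    return c["span"] + c["moreishness"] / (1.0 - c["moreishness"])

def expected_utility(c):
    # system 2 only derives value during the span
    return c["value"] * c["span"]

engagement_pick = max(CONTENT, key=lambda k: expected_engagement(CONTENT[k]))
utility_pick = max(CONTENT, key=lambda k: expected_utility(CONTENT[k]))
```

Here the engagement-maximizing choice is the highly moreish, low-value content, while the utility-maximizing choice is the high-value content with modest engagement: the two objectives diverge on the same manifold.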
Understanding the difference between junk food and salad
This distinction between content where utility grows with engagement and content where it doesn’t (which we intuitively term “salad” and “junk food”) is important for a platform to understand. In the simplest form of our model, platforms do not have the information to make this determination. But in practice, platforms have auxiliary sources of data that, in conjunction with our model, can provide insight. We discuss a few such sources:
- Platforms often use surveys to better understand users’ happiness, and we show that survey data could help to learn more about the type of content being shown.
- Some of the data that platforms collect is more value-driven: for example, how often a user returns to a platform may indicate some combination of system 1 and system 2 preferences, and this data may reveal more about the user’s true utility.
- Certain UI design choices like break suggestions and autoplay can provide insight: if consumption of a certain type of content decreases when autoplay is off, this might indicate that this content is particularly moreish.
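The autoplay diagnostic can be made concrete with a small sketch. The consumption numbers below are hypothetical; the idea is simply that the share of consumption that disappears when autoplay is turned off serves as a rough signal of moreishness:

```python
# Hypothetical average items consumed per session, with and without autoplay.
avg_items_per_session = {
    # content type: (autoplay on, autoplay off)
    "science videos": (4.0, 3.8),
    "viral clips":    (12.0, 3.0),
}

def moreishness_signal(on, off):
    # fraction of consumption attributable to autoplay's nudge
    return (on - off) / on

signals = {kind: moreishness_signal(on, off)
           for kind, (on, off) in avg_items_per_session.items()}
```

Content whose consumption collapses without autoplay (here, the hypothetical viral clips) scores high on this signal, suggesting system 1 impulses rather than system 2 value are driving the engagement.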
Between the lines
There’s a growing understanding that there’s something not quite right with platform optimization based on user behavioral data, even ignoring the privacy concerns and financial incentives involved. Platforms appear to increasingly realize that no matter how carefully they measure engagement, using it as a metric doesn’t quite lead to a product users are actually happy with. This paper tries to address these issues by arguing that no matter how sophisticated the model of engagement, it cannot be effective unless it acknowledges that users have internal conflicts in their preferences. This realization has led to two kinds of changes. On the design side, it has led to experiments with UI changes to see if they improve happiness (such as time limits). On the content optimization side, it has resulted in attempts to augment passive user behavior data with explicit survey measures of happiness or satisfaction. In both cases, the approaches are largely based on intuition. Our model provides a framework for thinking about why engagement optimization may be failing.