🔬 Research Summary by William Seymour, a Research Associate in Informatics at King’s College London.
[Original paper by William Seymour, Mark Coté, Jose Such]
Overview: Instead of requiring users to switch to companion apps, domestic voice assistants (VAs) like Alexa are beginning to let people give verbal consent to data sharing. This work describes ways that current implementations break key informed consent principles and outlines open questions for the research and developer communities around the future development of verbal consent.
Introduction
Dealing with apps that ask for permissions can be a pain. The situation is even worse for voice assistants like Alexa, where you need to find your phone to let skills use your data. To address this, platforms are moving consent into the conversation itself, so that skills can verbally ask for permission to access your information. While this is great from a usability perspective, current ways of doing so violate established principles of informed consent. By looking at key areas of data protection regulation, academic research, and perspectives from research ethics, this ongoing work highlights issues including time pressures, interface asymmetry, and insufficient delineation between skills and system software. We also offer solutions for developers and platform vendors, and pose five open questions to the research community, to help shape the development of usable and effective verbal consent for voice assistants and similar conversational user interfaces.
Key Insights
The last decade has shown us that behavioral data is an extremely valuable commodity that is capable of powering vast online advertising empires and influencing elections. In response to the effects of what has aptly been termed “surveillance capitalism,” regulators have sought to protect people’s ability to choose what information they share. The GDPR is the foremost example of this, but a consequence of these laws is that every day we get pestered by websites, apps, skills, etc., asking for consent to harvest personal data. This might take the form of leaving cookies in your web browser, accessing your smartphone location, or a VA skill asking for your name, but all are shaped by the same underlying legal requirements.
(1) User: Alexa, open Ride Hailer.
(2) Alexa: Welcome to Ride Hailer. Where would you like to go?
(3) User: The Space Needle.
(4) Alexa: Sure. I need access to your name, current location, and mobile number so that I can find a ride for you.
(5) Alexa: Do you give Ride Hailer permission to access your name, current location, and mobile number? You can say ‘I approve’ or ‘No.’
(6) User: I approve.
(7) Alexa: Thank you. A ride to the Space Needle from your current location will cost fifteen dollars, and the driver can pick you up in ten minutes. Do you want me to book it?
(8) User: Yes.
(9) Alexa: Great. Your driver will arrive in ten minutes.
Figure 1: Sample verbal consent flow from the Alexa Skills Kit documentation
You encounter the effects of this when you install a new app on your phone and get asked to permit it to use your data. Voice assistants use a similar mechanism to let skills access personal data, but until recently this, ironically for a service that sells itself as hands-free, had to be done on a smartphone through a companion app. That is changing: skills can now ask for permission to access your data within the conversation, using your voice; the assistant asks if you consent, and you say “yes” (see Figure 1). This is a great win for usability, but voice is a much more constrained way of interacting with a device than a graphical interface. As technology researchers, we spend a lot of time thinking about research ethics and the informed consent process, and we’re concerned that current implementations of verbal consent for voice assistants violate some key principles of consent.
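To make the mechanism concrete, the sketch below shows roughly how a skill triggers this in-conversation consent flow. It is a minimal illustration based on our reading of the Alexa Skills Kit voice permissions documentation; the directive shape, scope strings, and status values are assumptions that should be checked against Amazon’s current documentation rather than taken as authoritative.

```python
# Minimal sketch of the skill side of Alexa's voice-forward consent flow,
# based on our reading of the Alexa Skills Kit voice permissions docs.
# Field names, scope strings, and status values are assumptions to verify
# against the current documentation.

def build_consent_request(permission_scope: str) -> dict:
    """Build a skill response asking Alexa to run its consent dialogue
    for one permission scope, e.g. "alexa::profile:mobile_number:read"."""
    return {
        "version": "1.0",
        "response": {
            "directives": [
                {
                    "type": "Connections.SendRequest",
                    "name": "AskFor",
                    "payload": {
                        "@type": "AskForPermissionsConsentRequest",
                        "@version": "1",
                        "permissionScope": permission_scope,
                    },
                    # Echoed back so the skill can match the user's answer
                    # to this request when it is re-invoked.
                    "token": "ride-hailer-consent-1",
                }
            ],
        },
    }


def consent_granted(connections_response: dict) -> bool:
    """Check the Connections.Response Alexa sends back once the user answers."""
    status = connections_response.get("payload", {}).get("status", "")
    return status == "ACCEPTED"
```

Notably, once the directive is sent the assistant takes over the dialogue entirely: the skill only learns the outcome afterwards, which is exactly the handover that users cannot hear (see below).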
A good example is the inherent time pressure in interactions with voice assistants. After about eight seconds, most assistants will time out and re-prompt the user to respond. A key aspect of making decisions about what to share is that you can take as long as you need to make up your mind, but the cadence of these interactions gives rise to a pervasive sense of urgency that would persist even if the timeout period could be changed.
The speech interfaces that facilitate all interaction with voice assistants also lack the functionality to implement verbal consent properly. When you tick the boxes in the Alexa smartphone app to consent to a skill, it’s intuitive that unticking those boxes later revokes that consent. But if you grant the same permissions verbally, the voice commands that would rescind them are missing, making it much easier to give consent than to withdraw it.
Similarly, current interfaces lack any audible distinction between third-party skills, which can largely do what they want, and the assistant’s trusted operating system: in Figure 1, the handover from the skill to the system happens between lines 4 and 5, but if you’re listening to the assistant speak, there’s no way of knowing this.
So how do we move past these problems? For platforms, a clear first step is an overhaul of voice interfaces for consent: make timeouts configurable, introduce voice commands to revoke consent, and ensure that system dialogue is audibly distinguishable from third-party speech. For developers, clearly tying requested permissions to skill functionality, and explaining why personal data is needed, is important for fostering trust. Some platforms, such as Google Assistant, are beginning to build this into how consent is programmed for skills by explicitly asking developers why they are collecting data.
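To illustrate what these platform-side changes could look like, here is a hypothetical sketch. The class, field, and file names are invented for this article and do not correspond to any real assistant API; it simply encodes the three fixes above as configuration and state.

```python
# Hypothetical sketch of the platform-side fixes proposed above. All names
# here are invented for illustration and are not part of any real assistant API.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConsentDialogueConfig:
    # Configurable (or disabled) re-prompt timeout, so users can take as long
    # as they need instead of facing a fixed ~8-second cut-off.
    reprompt_timeout_seconds: Optional[float] = None  # None = wait indefinitely
    # A distinct earcon and voice mark when the trusted system, rather than a
    # third-party skill, is speaking, making the handover audible.
    system_earcon: str = "system_chime.mp3"
    system_voice_id: str = "assistant-system-voice"


@dataclass
class ConsentRegistry:
    """Tracks which permission scopes each skill currently holds."""
    granted: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, skill_id: str, scope: str) -> None:
        self.granted.setdefault(skill_id, set()).add(scope)

    def revoke(self, skill_id: str, scope: str) -> None:
        # A symmetrical voice command ("stop sharing my location with
        # Ride Hailer") would route here, making withdrawal of consent as
        # easy as giving it.
        self.granted.get(skill_id, set()).discard(scope)
```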
For the research community, this shift to verbal consent is a great opportunity to cement ethical practices before a ‘standard’ way of doing things becomes entrenched. As part of our ongoing research, we’re looking at which information and choices it makes sense to deliver verbally, and how other ways of interacting with VAs, like attached screens and companion smartphone apps, could enhance and support the verbal consent process.
Between the lines
At first, it might be difficult to see why verbal consent is such a key area of concern for voice assistants. Outside of a small number of malicious skills, most people want to let skills use their data because it’s required for the skill to do its job (like in the ride-hailing example above), and an unfortunate consequence of the GDPR has been an increasing sense of ‘consent fatigue’ amongst users. But the principles that underpin consent are as important as ever: we now live in a world, once unimaginable, where data from web browsing and smartphone use is used to identify government spies and prosecute people seeking abortions.
At the same time, we are moving towards a future where voice interfaces will be increasingly difficult to escape. Most phones, TVs, and headphones already have built-in voice interfaces that are slowly taking on additional functionality. The precedents we set now with the devices in our homes will establish the norms that dictate how consent operates in a future where VAs are pervasive, both at work and at home.