Montreal AI Ethics Institute

Democratizing AI ethics literacy

AI Consent Futures: A Case Study on Voice Data Collection with Clinicians

August 22, 2023

🔬 Research Summary by Lauren Wilcox, Ph.D. (she/her), Senior Staff Research Scientist and Group Manager of the Technology, AI, Society, and Culture (TASC) team, part of Responsible AI and Human-Centered Technology in Google Research.

[Original paper by Lauren Wilcox, Robin Brewer, and Fernando Diaz]


Overview:

Artificial intelligence (AI) applications, including those based on machine learning (ML) models, are increasingly developed for high-stakes domains such as digital health and health care. This paper foregrounds clinicians’ perspectives on new forms of data collection that are actively being proposed to enable AI-based assistance with clinical documentation tasks. We examined the prospective benefits and harms of voice data collection during health consultations, highlighting eight classes of potential risks that clinicians are concerned about, including clinical workflow disruptions, self-censorship, and errors that could impact patient eligibility for services.


Introduction

There is much discussion about the benefits of AI in healthcare and much less discussion of its risks and potential harms. In this paper, we delved into the ethical dimensions of collecting voice data in clinical settings for training and interacting with machine learning models, such as LLMs and speech-based AI assistants. Our study uses informed consent as a frame and includes participatory activities with physicians in primary care, urgent care, and emergency medicine. Study findings emphasize how data collection practices can jeopardize the safety and trust of both patients and doctors. The study also demonstrates how speculative design methods can be used to surface situated ethics issues related to dataset construction, language model training, and real-time model use.

Key Insights

In this paper, we report physicians’ perspectives on the collection of conversational voice data for an AI documentation assistant in the context of their consultations. We recruited a purposeful sample of 16 physicians who currently practice primary care, urgent care, or emergency medicine. Participants had a diverse range of residency training, worked in various institution types in the U.S., and covered diverse patient populations (e.g., pediatric, geriatric, patients with chronic conditions, patients with disabilities, and unhoused patients).

We used design fiction as an approach to enable participants to envision and reason about aspects of alternate futures associated with near-term technology. In this case, we positioned the informed consent process as a design fiction to elicit perspectives on the benefits and risks of collecting conversational voice data in health consultations to enable interaction with, and training of, language models (LMs). Physicians conceptualized such benefits and risks based on their experiences treating patients in primary care, urgent care, and emergency settings and overseeing patient consent processes in these settings.

Benefits of collecting voice data to interact with and train models center on the patient but depend on contingencies

Doctors thought that the data collected, if also made available to patients, could supplement patients' memory of important information from the visit, enable better translation of medical instructions, and aid patients' understanding of their health. Voice data could also be used to preserve patients' direct words, reducing interpretation biases and providing more contextualized information in the medical record.

On the other hand, physicians discussed the need to mitigate the risks of sharing such data. Almost all physicians who discussed benefits also discussed their contingent nature (e.g., dependency on “perfect” voice recognition for the net benefit to be realized, availability of control capabilities to select specific audio segments for an AI assistant to use). Below, we summarize the classes of risks our participants surfaced.

Prospective risks highlight physicians’ concern for both patients and clinicians

Eroding Trust in Clinicians and Health Care Institutions

Physicians were concerned that collecting voice data during consultations could degrade trust between doctors and patients, which is especially concerning in communities with a history of marginalization. The knowledge that one's voice data is being collected could affect what doctors and patients say: patients might become excessively guarded in what they disclose, and doctors might retreat into formalities or disclaimers, either of which could erode relationship building and trust.

Self-Censorship

Physicians worried that patients, aware of data collection, might withhold crucial information during their consultations, impacting the quality of care they receive and the patient–doctor relationship.

Care Obstructions

Unlike current practices where clinicians might avoid documenting unwarranted patient statements, voice data collection could capture everything, including jests or statements not meant seriously. Lacking human context, a model might not distinguish between serious remarks and offhand comments or jokes. Verbatim remarks by patients could be misinterpreted, leading to harm such as ineligibility for services. 

Legal risks for patients and clinicians

Physicians questioned the discoverability of voice data in legal situations and the implications of unintentional capture of other people that could violate their privacy. They were also concerned about who can access the data and whether patients have the right to it later.

Patients’ inability to consent to data collection

Assessing a patient’s capability to consent can be challenging. Proxies might be required, but their involvement can complicate the process further. Relying solely on legal protocols for consent might not suffice for a truly patient-centric experience.

Workflow disruptions and additional clinician labor

The potential introduction of voice data collection and the use of AI assistants could disrupt the established workflow of the healthcare team. Physicians anticipate that patients might have questions about AI technology and its implications, which could take away valuable consultation time. Patients might have concerns about where the data is stored and managed and how to request that specific statements be excluded. These concerns require careful, coordinated communication, transparency capabilities, and explanations.

Privacy

Physicians were concerned about the potential for insurance companies or other third parties to access voice data. Ensuring patient confidentiality and preventing misuse of information was a paramount concern. Additionally, certain sensitive issues, like intimate partner violence, might be unsafe to document without putting the patient at risk.

Inaccuracies in speech recognition and documentation leading to downstream harms

Both clinicians and patients might spend more time clarifying or editing records if the information is captured inaccurately. Language barriers, use of interpreters, and non-native speakers have traditionally been inadequately represented in speech technologies. Misinterpreting initial discussions about potential problems and evolving diagnoses could lead to erroneous patient records.

Between the lines

The paper challenges the current narratives of AI being almost entirely assistive and highlights the gravity of decisions relating to data collection and the AI implementation process. While voice data collected for the purposes of interacting with and training AI models is seen by many companies and institutions as a solution to streamline documentation, the act of collecting voice data to enable these technologies brings with it significant concerns related to trust, legal implications, consent, workflow disruption, privacy, and accuracy. 

Framing the informed consent process as a design fiction supported critical consideration of near-term voice data collection scenarios. Grounding speculative activities in the context of consent allowed for generative thinking about risks and benefits in a real-world context to base expected possibilities on situated experiences.

