AI Consent Futures: A Case Study on Voice Data Collection with Clinicians

August 22, 2023

🔬 Research Summary by Lauren Wilcox, Ph.D. (she/her), a Senior Staff Research Scientist and Group Manager of the Technology, AI, Society, and Culture (TASC) team, part of Responsible AI and Human-Centered Technology at Google Research.

[Original paper by Lauren Wilcox, Robin Brewer, and Fernando Diaz]


Overview

Artificial intelligence (AI) applications, including those based on machine learning (ML) models, are increasingly developed for high-stakes domains such as digital health and health care. This paper foregrounds clinicians’ perspectives on new forms of data collection that are actively being proposed to enable AI-based assistance with clinical documentation tasks. We examined the prospective benefits and harms of voice data collection during health consultations, highlighting eight classes of potential risks that clinicians are concerned about, including clinical workflow disruptions, self-censorship, and errors that could impact patient eligibility for services.


Introduction

There is much discussion about the benefits of AI in healthcare and much less discussion of its risks and potential harms. In this paper, we delved into the ethical dimensions of collecting voice data in clinical settings for training and interacting with machine learning models such as LLMs and speech-based AI assistants. Our study uses informed consent as a frame and includes participatory activities with physicians in primary care, urgent care, and emergency medicine. The findings emphasize how data collection practices can jeopardize the safety and trust of both patients and doctors, and demonstrate how speculative design methods can surface situated ethics issues related to dataset construction, language model training, and real-time model use.

Key Insights

In this paper, we report physicians’ perspectives on the collection of conversational voice data for an AI documentation assistant in the context of their consultations. We recruited a purposeful sample of 16 physicians who currently practice primary care, urgent care, or emergency medicine. Participants had a diverse range of residency training, worked in various institution types in the U.S., and served diverse patient populations (e.g., pediatric and geriatric patients, patients with chronic conditions, patients with disabilities, and unhoused patients).

We used design fiction as an approach to enable participants to envision and reason about alternate futures associated with near-term technology. In this case, we positioned the informed consent process as a design fiction to elicit perspectives on the benefits and risks of collecting conversational voice data in health consultations to enable interaction with, and training of, language models (LMs). Physicians conceptualized these benefits and risks based on their experiences treating patients in primary care, urgent care, and emergency settings, and overseeing patient consent processes in those settings.

Benefits of collecting voice data to interact with and train models center on the patient but depend on contingencies

Doctors thought that the collected data, if also made available to patients, could supplement patients’ memory of important information from the visit, enable better translation of medical instructions, and aid patients’ understanding of their own health. Voice data could also preserve patients’ direct words, reducing interpretation biases and providing more contextualized information in the medical record.

On the other hand, physicians discussed the need to mitigate the risks of sharing such data. Almost all physicians who discussed benefits also noted their contingent nature (e.g., dependence on “perfect” voice recognition for the net benefit to be realized, or the availability of controls to select which audio segments an AI assistant may use). Below, we summarize the classes of risks our participants surfaced.

Prospective risks highlight physicians’ concern for both patients and clinicians

Eroding Trust in Clinicians and Health Care Institutions

Physicians were concerned that collecting voice data during consultations could degrade trust between doctors and patients, a risk that is especially acute in communities with a history of marginalization. The knowledge that one’s voice data is being collected could affect what doctors and patients say: responses could range from patients becoming excessively guarded in what they share to doctors retreating into formalities or disclaimers, either of which could erode relationship-building and trust.

Self-Censorship

Physicians worried that patients, aware of the data collection, might withhold crucial information during their consultations, impacting the quality of care they receive and the patient-doctor relationship.

Care Obstructions

Unlike current practice, in which clinicians can exercise judgment and avoid documenting patient statements that were not meant seriously, voice data collection could capture everything, including jests. Lacking human context, a model might not distinguish serious remarks from offhand comments or jokes, and patients’ verbatim remarks could be misinterpreted, leading to harms such as ineligibility for services.

Legal Risks for Patients and Clinicians

Physicians questioned whether voice data would be discoverable in legal proceedings and noted that recordings could unintentionally capture other people, violating their privacy. They were also concerned about who can access the data and whether patients would retain a right to it later.

Patients’ Inability to Consent to Data Collection

Assessing a patient’s capacity to consent can be challenging. Proxies might be required, but their involvement can further complicate the process. Relying solely on legal protocols for consent might not suffice for a truly patient-centered experience.

Workflow Disruptions and Additional Clinician Labor

The introduction of voice data collection and AI assistants could disrupt the healthcare team’s established workflow. Physicians anticipated that patients would have questions about the AI technology and its implications, which could take up valuable consultation time. Patients might also want to know where the data is stored, how it is managed, and how to request that specific statements be excluded; addressing these concerns requires careful, coordinated communication, transparency capabilities, and explanations.

Privacy

Physicians were concerned about the potential for insurance companies or other third parties to access voice data; ensuring patient confidentiality and preventing misuse of information were paramount concerns. Additionally, certain sensitive issues, such as intimate partner violence, might be impossible to document without putting the patient at risk.

Inaccuracies in Speech Recognition and Documentation Leading to Downstream Harms

Both clinicians and patients might spend more time clarifying or editing records if information is captured inaccurately. Speech technologies have traditionally served non-native speakers poorly and handled language barriers and interpreter-mediated conversations inadequately. Misinterpreting initial discussions of potential problems and evolving diagnoses could lead to erroneous patient records.
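To make this risk concrete, consider how speech recognition accuracy is commonly measured. The sketch below is our illustration, not part of the paper: the example sentences are hypothetical, and it simply computes word error rate (WER) to show how a transcript with a seemingly low error rate can still contain a single clinically dangerous substitution.

# Minimal sketch (illustrative only): word error rate via word-level
# Levenshtein distance. A single substitution such as "hypotension" ->
# "hypertension" yields a low WER yet flips the clinical meaning.

def wer(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit-distance table over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

reference = "patient reports episodes of hypotension after the new medication"
hypothesis = "patient reports episodes of hypertension after the new medication"
print(f"WER = {wer(reference, hypothesis):.2f}")  # 0.11: one word in nine, but clinically critical

The takeaway, in line with the physicians’ concern, is that aggregate accuracy metrics can mask exactly the errors that matter most in a medical record.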

Between the lines

The paper challenges current narratives that frame AI as almost entirely assistive and highlights the gravity of decisions about data collection and the AI implementation process. While many companies and institutions see voice data collection for interacting with and training AI models as a way to streamline documentation, collecting that data raises significant concerns related to trust, legal implications, consent, workflow disruption, privacy, and accuracy.

Framing the informed consent process as a design fiction supported critical consideration of near-term voice data collection scenarios. Grounding speculative activities in the context of consent enabled generative thinking about risks and benefits in a real-world setting, basing envisioned possibilities on situated experiences.

