Automated Interviewer or Augmented Survey? Collecting Social Data with Large Language Models

February 1, 2024

🔬 Research Summary by Alejandro Cuevas Villalba, Ph.D. student in Computer Science at Carnegie Mellon University, focusing on measuring social influence and improving reputation systems.

[Original paper by Alejandro Cuevas Villalba, Eva M. Brown, Jennifer V. Scurrell, Jason Entenmann, and Madeleine I. G. Daepp]


Overview: Quantitative data collection methods (e.g., surveys) often stand at odds with qualitative methods (e.g., interviews). Tools such as surveys enable researchers to collect and analyze data at scale but can constrain the depth and breadth of participants’ answers. On the other hand, tools such as interviews facilitate rich and nuanced data collection, though at the expense of scale. Undoubtedly, the advent of large language models (LLMs) offers unique opportunities to develop new data collection methods. However, should we think of LLMs as survey enhancers? Or should we think of them as automated interviewers?


Introduction

Advances in technology have often ushered in new eras of data collection. Just as the spread of telephones across households displaced mail-based surveys, the Internet enabled the rise of web surveys. Today, the era of personal AI assistants is set to usher in a new wave of data collection tools.

While conversational agents (e.g., chatbots) have been around for several years, we may finally be reaching, and quickly surpassing, the point where these tools are actually a pleasure to interact with. Companies like OpenAI have created remarkable user experiences and greatly reduced the burden of developing and deploying conversational agents.

Thus, a natural question emerged in the social sciences: how can we use AI assistants to facilitate data collection? We explored this question by designing and deploying a conversational agent to conduct a study with 399 participants. Our findings suggest that there are numerous benefits to employing conversational agents over traditional surveys. Interestingly, these tools still fall short when compared to in-person interviews. What was most surprising, however, is that managing participants’ expectations is a key element of the methodology.

Key Insights

The limitations of surveys, interviews, and chatbots

Surveys have become a popular tool for data collection, but their prevalence is not due to their ability to extract superior insights. Rather, the ease of analyzing the data after collection is what turned them into the de facto tool of human data collection. However, surveys are not great exploratory tools: composing questions, especially closed-ended ones, is tough when we don’t yet know what to ask. This is where methods like interviews or focus groups excel, because they allow interviewers to probe participants (i.e., ask follow-up questions) on areas they find interesting. These methods are flexible yet hard to scale. With great effort, an interviewer can talk to eight participants a day, but the brunt of the work lies in the analysis, where each hour of interviewing can require more than three hours of analysis.

The gap between the two methods is, and has always been, quite broad. For a long time, researchers have sought methods that offer more flexibility, either by allowing us to scale interviews or by enhancing the richness of surveys. For quite some time, chatbots were seen as a way to bridge the gap. Nonetheless, the promise of chatbots was met with great disappointment: whether it was the Q&A system of an airline or a robot receptionist answering the support line of a bank, most of us have experienced the frustration of interacting with these alleged conversational agents. Despite significant advances in machine learning, and in neural networks specifically, a pleasant interaction with a chatbot seemed elusive.

Then came LLMs, with skyrocketing popularity thanks to OpenAI’s ChatGPT. This easy-to-use, know-it-all conversational agent captivated the world and restored confidence in chatbots. Not only are they easy to use, but recently, they have become even easier to deploy. With the new functionalities announced by OpenAI, users can now set up custom chatbots with a few lines of code. Yes, this means you can deploy a chatbot that talks like you or about the niche things you care about; the sky (or compute power) is the limit.
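As a rough illustration of how little code this now takes, here is a minimal sketch of a custom conversational agent built on the OpenAI Python SDK; the model name and the interviewer persona are placeholder assumptions, not details from the paper.

    # Minimal custom chatbot sketch using the OpenAI Python SDK.
    # The persona prompt and model name are illustrative placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    history = [{"role": "system",
                "content": "You are a friendly interviewer asking about AI alignment."}]

    while True:
        user_turn = input("You: ")
        if not user_turn:  # empty input ends the conversation
            break
        history.append({"role": "user", "content": user_turn})
        response = client.chat.completions.create(model="gpt-4o", messages=history)
        reply = response.choices[0].message.content
        history.append({"role": "assistant", "content": reply})
        print("Bot:", reply)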

And yes, this also means we can use chatbots to create survey and interview questions, or have them conduct the surveys and interviews themselves. But can they really do these tasks well? This is what we set out to study. We designed three chatbots and recruited 399 participants for a study about AI alignment. We split participants into three groups, each interacting with a different chatbot, and asked them to complete a survey about their experience.

Our study approach

Our study had three stages. First, participants took a multiple-choice survey on AI alignment. The purpose of this survey was twofold: it primed participants on the topic at hand, and it provided us with a benchmark against which we could assess our interpretations of the conversations (more on this later). After the first survey, participants were randomly split into three groups, each assigned a different chatbot design. The chatbots were programmed to ask questions about AI alignment. Our baseline was a chatbot that only asked hardcoded questions, whereas the other two chatbots relied on LLMs to behave more intelligently. Lastly, participants were asked to complete an exit survey on their experience.
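To make the contrast between the baseline and LLM-backed conditions concrete, here is a hedged sketch of the two designs; the question wording and prompts below are hypothetical placeholders, not the materials used in the study.

    # Sketch of the two designs compared in the study: a baseline bot that only
    # asks hardcoded questions vs. an LLM-backed bot that also generates probes.
    # All question text and prompt wording here are hypothetical placeholders.
    from openai import OpenAI

    client = OpenAI()

    QUESTIONS = [  # hypothetical stand-ins for the study's hardcoded questions
        "What does the phrase 'AI alignment' mean to you?",
        "Whose values should an AI system be aligned with?",
    ]

    def baseline_next_question(turn_index):
        """Baseline condition: return the next scripted question, never probe."""
        return QUESTIONS[turn_index] if turn_index < len(QUESTIONS) else None

    def llm_follow_up(question, answer):
        """LLM condition: generate one follow-up probe grounded in the answer."""
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": "You are an interviewer. Ask exactly one short, "
                            "neutral follow-up question about the answer."},
                {"role": "user",
                 "content": f"Question: {question}\nAnswer: {answer}"},
            ],
        )
        return response.choices[0].message.content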

We found significant evidence that researchers have much to gain from employing chatbots in place of surveys; interestingly, this was not the case when comparing chatbots to interviews. Compared to surveys, participants engaged more and rated their experience significantly higher. When the chatbot was compared to in-person interviews, however, participants still preferred a human-to-human interview.

Among the most interesting insights was an accidental discovery in our methodology. In the survey that followed the chatbot interaction, we referred to the chatbot as an “AI interviewer.” This framing was particularly salient to participants in the baseline group, several of whom expressed frustration and disappointment that the chatbot did not seem intelligent. This effect was absent from the other groups, whose participants instead said they much preferred their interaction with the chatbot over traditional surveys.

Between the lines

The ease of deployment and participants’ enjoyment bode well for using chatbots as data collection instruments in user studies. For now, however, we should consider them survey augmenters rather than replacements for in-person interviews. Furthermore, the missing puzzle piece in our work is scaling the analysis of the collected data. Recent work has shown that LLMs may also assist in analyzing the chat logs. Although outside the scope of this paper, we found encouraging preliminary results when analyzing the collected data with ChatGPT.
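As a sketch of what such LLM-assisted analysis might look like, the snippet below asks a model to propose themes for a single transcript; the prompt wording and output format are our own assumptions, not the procedure from the paper.

    # Hedged sketch of LLM-assisted analysis of collected chat logs.
    # The prompt wording and output format are assumptions for illustration.
    from openai import OpenAI

    client = OpenAI()

    def summarize_themes(transcript: str) -> str:
        """Ask the model to propose candidate themes for one interview transcript."""
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": "You are a qualitative researcher. List the main themes "
                            "in this interview transcript as short bullet points, "
                            "each with one supporting quote."},
                {"role": "user", "content": transcript},
            ],
        )
        return response.choices[0].message.content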

With careful management of user expectations, we could introduce a new tool for user studies: a tool that allows us to explore new phenomena at a greater scale more quickly and deeply.

