Montreal AI Ethics Institute

Democratizing AI ethics literacy


Automated Interviewer or Augmented Survey? Collecting Social Data with Large Language Models

February 1, 2024

🔬 Research Summary by Alejandro Cuevas Villalba, Ph.D. student in Computer Science at Carnegie Mellon University, focusing on measuring social influence and improving reputation systems.

[Original paper by Alejandro Cuevas Villalba, Eva M. Brown, Jennifer V. Scurrell, Jason Entenmann, and Madeleine I. G. Daepp]


Overview: Quantitative data collection methods (e.g., surveys) often stand at odds with qualitative methods (e.g., interviews). Tools such as surveys enable researchers to collect and analyze data at scale but can constrain the depth and breadth of participants’ answers. On the other hand, tools such as interviews facilitate rich and nuanced data collection, though at the expense of scale. Undoubtedly, the advent of large language models (LLMs) offers unique opportunities to develop new data collection methods. However, should we think of LLMs as survey enhancers? Or should we think of them as automated interviewers?


Introduction

Advances in technology have often ushered in new eras of data collection methodology. Just as the household adoption of the telephone replaced mail-based surveys, the Internet enabled the rise of web surveys. Today, the era of personal AI assistants is set to usher in a new wave of data collection tools.

While conversational agents (e.g., chatbots) have been around for several years, we may be finally reaching—and quickly surpassing—the point where these tools are actually a pleasure to interact with. Companies like OpenAI have managed to create remarkable user experiences and greatly reduced the burden of developing and deploying conversational agents.

Thus, a natural question has emerged in the social sciences: how can we use AI assistants to facilitate data collection? We explored this question by designing and deploying a conversational agent to conduct a study with 399 participants. Our findings suggest that conversational agents offer numerous benefits over traditional surveys. Interestingly, these tools still fall short when compared to in-person interviews. Most surprising, however, was that managing participants' expectations is a key methodological consideration.

Key Insights

The limitations of surveys, interviews, and chatbots

Surveys have become a popular tool for data collection, but their prevalence is not due to their ability to extract superior insights. Rather, the ease of analyzing the data after collection is what made them the de facto tools of human data collection. However, surveys are not great exploratory tools: composing questions, especially closed-ended ones, is difficult when we don't yet know what to ask. This is where methods like interviews or focus groups excel, because they allow interviewers to probe participants (i.e., ask follow-up questions) on areas they find interesting. These methods are flexible yet substantially harder to scale. With great effort, an interviewer can talk to eight participants a day, but the brunt of the work lies in the analysis, where each hour of interview may require more than three hours of analysis.

The gap between the two methods is, and has always been, quite broad. For a long time, researchers have sought methods that offer more flexibility, either by allowing us to scale interviews or by enhancing the richness of surveys. For quite some time, chatbots were seen as an option to bridge the gap. Nonetheless, the promise of chatbots was met with great disappointment. Whether it was the Q&A system for an airline or a robot receptionist answering the support line of a bank, most of us have experienced the frustration of interacting with these alleged conversational agents. Despite significant advances in machine learning, specifically neural networks, a pleasant interaction with a chatbot seemed elusive.

Then came LLMs, skyrocketing in popularity thanks to OpenAI's ChatGPT. This easy-to-use, know-it-all conversational agent captivated the world and restored confidence in chatbots. Not only are they easy to use, but recently, they have become even easier to deploy. With the new functionalities announced by OpenAI, users can now set up custom chatbots with a few lines of code. Yes, this means you can deploy a chatbot that talks like you or about niche things you care about; the sky (or compute power) is the limit.
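To make "a few lines of code" concrete, here is a minimal sketch of an LLM-backed interviewer chatbot in this spirit. The system prompt, model name, and client usage are illustrative assumptions, not the authors' actual configuration; the client is passed in so the message-building logic stays testable offline.

```python
# Sketch of an LLM-backed interviewer chatbot. The persona prompt and model
# name below are illustrative assumptions, not the paper's actual setup.

SYSTEM_PROMPT = (
    "You are a friendly research interviewer studying opinions on AI alignment. "
    "Ask one question at a time and probe interesting answers with follow-ups."
)

def build_messages(history):
    """Prepend the interviewer persona to the conversation so far."""
    return [{"role": "system", "content": SYSTEM_PROMPT}] + list(history)

def interviewer_reply(client, history, model="gpt-4o-mini"):
    """Return the chatbot's next turn. `client` is an OpenAI-style client
    (e.g., openai.OpenAI()); injecting it keeps this function easy to stub."""
    response = client.chat.completions.create(
        model=model, messages=build_messages(history)
    )
    return response.choices[0].message.content
```

In a study deployment, each participant turn would be appended to `history` and `interviewer_reply` called to produce the next probe.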

And yes, this also means we can use chatbots to create survey and interview questions, or have them conduct the surveys and interviews themselves. But can they really do these tasks well? This is what we set out to study. We designed three chatbots and recruited 399 participants for a study about AI alignment. We split participants into three groups, each interacting with a different chatbot, and asked them to complete a survey about their experience.

Our study approach

Our study had three stages. First, participants took a multiple-choice survey on AI alignment. The purpose of this survey was twofold: it primed participants on the topic at hand, and it provided a benchmark against which we could assess our interpretations of the conversations. More on this later. After the first survey, participants were randomly split into three groups, each with a different chatbot design. The chatbots were programmed to ask questions about AI alignment: our baseline chatbot asked only hardcoded questions, whereas the other two relied on LLMs to behave more intelligently. Lastly, participants completed an exit survey on their experience.
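The random three-arm assignment described above can be sketched as follows; the condition labels are hypothetical placeholders, not the paper's actual names, and a fixed seed stands in for whatever randomization procedure the authors used.

```python
# Sketch of randomly assigning participants to three chatbot conditions.
# Condition labels are illustrative placeholders, not the paper's own names.
import random

CONDITIONS = ["baseline_scripted", "llm_chatbot_a", "llm_chatbot_b"]

def assign_conditions(participant_ids, seed=0):
    """Shuffle participants, then deal them round-robin into the three arms,
    yielding groups as balanced as the total count allows."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: CONDITIONS[i % len(CONDITIONS)] for i, pid in enumerate(ids)}
```

With 399 participants, round-robin dealing yields exactly 133 per condition.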

We found significant evidence that researchers have much to gain from using chatbots in place of surveys; interestingly, the same was not true when comparing chatbots to interviews. Compared to surveys, participants engaged more and rated their experience significantly higher. When comparing the chatbot to in-person interviews, however, participants still preferred a human-to-human interview.

Among the most interesting insights was an accidental discovery in our methodology. In the survey that followed the chatbot interaction, we referred to the chatbot as an "AI interviewer." This framing was particularly salient to participants in the baseline group: several expressed frustration and disappointment that the chatbot did not seem intelligent. This effect was absent from the other groups, whose participants instead expressed that they much preferred their interaction with the chatbot over traditional surveys.

Between the lines

The ease of deployment and participants' enjoyment bode well for using chatbots as data collection instruments in user studies. For now, however, we should consider them survey augmenters rather than replacements for in-person interviews. Furthermore, the missing puzzle piece in our work is scaling the analysis of the collected data. Recent work has shown that LLMs may also assist in analyzing the chatlogs. Although outside the scope of this paper, we found encouraging preliminary results when analyzing the collected data with ChatGPT.
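One common shape for such LLM-assisted analysis is to wrap each chatlog excerpt in a coding prompt and send it to a model. The template and category labels below are hypothetical illustrations, not the prompts or codes the authors used.

```python
# Hypothetical prompt template for LLM-assisted qualitative coding of chatlogs.
# The wording and code labels are illustrative assumptions.
CODING_PROMPT = (
    "You are assisting with qualitative analysis of interview transcripts.\n"
    "Read the participant's answer and assign exactly one of these codes: "
    "{codes}.\n"
    "Answer: {answer}\n"
    "Code:"
)

def build_coding_prompt(answer, codes):
    """Fill the template for one chatlog excerpt; the result can then be
    sent to any LLM, whose one-word reply is the assigned code."""
    return CODING_PROMPT.format(codes=", ".join(codes), answer=answer)
```

Running every excerpt through such a prompt, then spot-checking a sample against human coders, is one way the per-hour analysis burden of interviews might be reduced.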

With careful management of user expectations, we could introduce a new tool for user studies: a tool that allows us to explore new phenomena at a greater scale more quickly and deeply.

Want quick summaries of the latest research & reporting in AI ethics delivered to your inbox? Subscribe to the AI Ethics Brief. We publish bi-weekly.


Founded in 2018, the Montreal AI Ethics Institute (MAIEI) is an international non-profit organization equipping citizens concerned about artificial intelligence and its impact on society to take action.

© 2025 Montreal AI Ethics Institute. This work is licensed under a Creative Commons Attribution 4.0 International License.