AI Telephone Surveying: Automating Quantitative Data Collection with an AI Interviewer
- URL: http://arxiv.org/abs/2507.17718v1
- Date: Wed, 23 Jul 2025 17:30:14 GMT
- Title: AI Telephone Surveying: Automating Quantitative Data Collection with an AI Interviewer
- Authors: Danny D. Leybzon, Shreyas Tirumala, Nishant Jain, Summer Gillen, Michael Jackson, Cameron McPhee, Jennifer Schmidt
- Abstract summary: We built and tested an AI system to conduct quantitative surveys based on large language models (LLM), automatic speech recognition (ASR), and speech synthesis technologies. The system was specifically designed for quantitative research, and strictly adhered to research best practices like question order randomization, answer order randomization, and exact wording. Our results suggest that shorter instruments and more responsive AI interviewers may contribute to improvements across all three metrics studied.
- Score: 1.8929175690169533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise of voice-enabled artificial intelligence (AI) systems, quantitative survey researchers have access to a new data-collection mode: AI telephone surveying. By using AI to conduct phone interviews, researchers can scale quantitative studies while balancing the dual goals of human-like interactivity and methodological rigor. Unlike earlier efforts that used interactive voice response (IVR) technology to automate these surveys, voice AI enables a more natural and adaptive respondent experience as it is more robust to interruptions, corrections, and other idiosyncrasies of human speech. We built and tested an AI system to conduct quantitative surveys based on large language models (LLM), automatic speech recognition (ASR), and speech synthesis technologies. The system was specifically designed for quantitative research, and strictly adhered to research best practices like question order randomization, answer order randomization, and exact wording. To validate the system's effectiveness, we deployed it to conduct two pilot surveys with the SSRS Opinion Panel and followed up with a separate human-administered survey to assess respondent experiences. We measured three key metrics: survey completion rates, break-off rates, and respondent satisfaction scores. Our results suggest that shorter instruments and more responsive AI interviewers may contribute to improvements across all three metrics studied.
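The abstract names question order randomization, answer order randomization, and exact wording as the practices the system enforces. A minimal Python sketch of how such an instrument could be prepared per respondent follows. It is illustrative only and not the authors' implementation: the `Question` class, the `randomized_instrument` function, and the use of a per-respondent seed are all assumptions introduced here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """Hypothetical survey item; frozen so the wording cannot be altered."""
    qid: str
    text: str        # exact wording, read verbatim by the interviewer
    options: tuple   # answer options in their canonical order


def randomized_instrument(questions, seed):
    """Return a copy of the instrument with shuffled question and answer order.

    The seed (assumed here to be derived per respondent) makes the presented
    order reproducible, so it can be logged alongside the responses.
    """
    rng = random.Random(seed)
    order = list(questions)
    rng.shuffle(order)  # question order randomization
    return [
        # answer order randomization; the question text is passed through unchanged
        Question(q.qid, q.text, tuple(rng.sample(q.options, k=len(q.options))))
        for q in order
    ]
```

Calling `randomized_instrument(questions, seed=respondent_id)` twice with the same seed yields the same order. This sketch shuffles every option list; a real instrument would likely need a flag to hold ordinal scales fixed.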
Related papers
- TestAgent: An Adaptive and Intelligent Expert for Human Assessment [62.060118490577366]
We propose TestAgent, a large language model (LLM)-powered agent designed to enhance adaptive testing through interactive engagement. TestAgent supports personalized question selection, captures test-takers' responses and anomalies, and provides precise outcomes through dynamic, conversational interactions.
arXiv Detail & Related papers (2025-06-03T16:07:54Z) - AI-Assisted Conversational Interviewing: Effects on Data Quality and User Experience [0.0]
This study bridges the gap between standardized surveys and conversational interviews by introducing a framework for AI-assisted interviews. We conducted a web survey experiment where 1,800 participants were randomly assigned to text-based conversational AI agents, or "textbots". Our findings reveal the feasibility of using AI methods to enhance open-ended data collection in web surveys.
arXiv Detail & Related papers (2025-04-09T13:58:07Z) - Telephone Surveys Meet Conversational AI: Evaluating a LLM-Based Telephone Survey System at Scale [0.0]
This work presents an AI-driven telephone survey system integrating text-to-speech (TTS), a large language model (LLM), and speech-to-text (STT). We tested the system across two populations, a pilot study in the United States (n = 75) and a large-scale deployment in Peru (n = 2,739). Our findings demonstrate that while the AI system's probing for qualitative depth was more limited than human interviewers, overall data quality approached human-led standards for structured items.
arXiv Detail & Related papers (2025-02-27T14:31:42Z) - Where are we in audio deepfake detection? A systematic analysis over generative and detection models [59.09338266364506]
SONAR is a synthetic AI-Audio Detection Framework and Benchmark. It provides a comprehensive evaluation for distinguishing cutting-edge AI-synthesized auditory content. It is the first framework to uniformly benchmark AI-audio detection across both traditional and foundation model-based detection systems.
arXiv Detail & Related papers (2024-10-06T01:03:42Z) - AI Conversational Interviewing: Transforming Surveys with LLMs as Adaptive Interviewers [40.80290002598963]
This study explores the potential of replacing human interviewers with large language models (LLMs) to conduct scalable conversational interviews. We conducted a small-scale, in-depth study with university students who were randomly assigned to a conversational interview by either AI or human interviewers. Various quantitative and qualitative measures assessed interviewer adherence to guidelines, response quality, participant engagement, and overall interview efficacy.
arXiv Detail & Related papers (2024-09-16T16:03:08Z) - Can AI Serve as a Substitute for Human Subjects in Software Engineering
Research? [24.39463126056733]
This vision paper proposes a novel approach to qualitative data collection in software engineering research by harnessing the capabilities of artificial intelligence (AI).
We explore the potential of AI-generated synthetic text as an alternative source of qualitative data.
We discuss the prospective development of new foundation models aimed at emulating human behavior in observational studies and user evaluations.
arXiv Detail & Related papers (2023-11-18T14:05:52Z) - On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approximates human-like quality, the sample size needed for detection bounds increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors, including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z) - Connecting Humanities and Social Sciences: Applying Language and Speech
Technology to Online Panel Surveys [2.0646127669654835]
We explore the application of language and speech technology to open-ended questions in a Dutch panel survey.
In an experimental wave respondents could choose to answer open questions via speech or keyboard.
We report the errors the ASR system produces and investigate the impact of these errors on downstream analyses.
arXiv Detail & Related papers (2023-02-21T10:52:15Z) - End-to-end Spoken Conversational Question Answering: Task, Dataset and
Model [92.18621726802726]
In spoken question answering, the systems are designed to answer questions from contiguous text spans within the related speech transcripts.
We propose a new Spoken Conversational Question Answering task (SCQA), aiming at enabling the systems to model complex dialogue flows.
Our main objective is to build the system to deal with conversational questions based on the audio recordings, and to explore the plausibility of providing more cues from different modalities with systems in information gathering.
arXiv Detail & Related papers (2022-04-29T17:56:59Z) - Partner Matters! An Empirical Study on Fusing Personas for Personalized
Response Selection in Retrieval-Based Chatbots [51.091235903442715]
This paper makes an attempt to explore the impact of utilizing personas that describe either self or partner speakers on the task of response selection.
Four persona fusion strategies are designed, which assume personas interact with contexts or responses in different ways.
Empirical studies on the Persona-Chat dataset show that the partner personas can improve the accuracy of response selection.
arXiv Detail & Related papers (2021-05-19T10:32:30Z) - Towards Data Distillation for End-to-end Spoken Conversational Question
Answering [65.124088336738]
We propose a new Spoken Conversational Question Answering task (SCQA).
SCQA aims at enabling QA systems to model complex dialogue flows given the speech utterances and text corpora.
Our main objective is to build a QA system to deal with conversational questions both in spoken and text forms.
arXiv Detail & Related papers (2020-10-18T05:53:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.