Effect of Static vs. Conversational AI-Generated Messages on Colorectal Cancer Screening Intent: a Randomized Controlled Trial
- URL: http://arxiv.org/abs/2507.08211v1
- Date: Thu, 10 Jul 2025 22:46:43 GMT
- Title: Effect of Static vs. Conversational AI-Generated Messages on Colorectal Cancer Screening Intent: a Randomized Controlled Trial
- Authors: Neil K. R. Sehgal, Manuel Tonneau, Andy Tan, Shivan J. Mehta, Alison Buttenheim, Lyle Ungar, Anish K. Agarwal, Sharath Chandra Guntuku
- Abstract summary: Large language model (LLM) chatbots show increasing promise in persuasive communication. We enrolled 915 U.S. adults (ages 45-75) who had never completed colorectal cancer (CRC) screening. Both AI interventions significantly increased stool test intentions by over 12 points (12.9-13.8/100), compared to a 7.5-point gain for expert materials.
- Score: 5.429833789548265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language model (LLM) chatbots show increasing promise in persuasive communication. Yet their real-world utility remains uncertain, particularly in clinical settings where sustained conversations are difficult to scale. In a pre-registered randomized controlled trial, we enrolled 915 U.S. adults (ages 45-75) who had never completed colorectal cancer (CRC) screening. Participants were randomized to: (1) no message control, (2) expert-written patient materials, (3) single AI-generated message, or (4) a motivational interviewing chatbot. All participants were required to remain in their assigned condition for at least three minutes. Both AI arms tailored content using participants' self-reported demographics, including age and gender. Both AI interventions significantly increased stool test intentions by over 12 points (12.9-13.8/100), compared to a 7.5-point gain for expert materials (p<.001 for all comparisons). While the AI arms outperformed the no message control for colonoscopy intent, neither showed improvement over expert materials. Notably, for both outcomes, the chatbot did not outperform the single AI message in boosting intent, despite participants spending ~3.5 minutes more on average engaging with it. These findings suggest concise, demographically tailored AI messages may offer a more scalable and clinically viable path to health behavior change than more complex conversational agents and generic, time-intensive expert-written materials. Moreover, LLMs appear more persuasive for lesser-known and less-invasive screening approaches like stool testing, but may be less effective for entrenched preferences like colonoscopy. Future work should examine which facets of personalization drive behavior change, whether integrating structural supports can translate these modest intent gains into completed screenings, and which health behaviors are most responsive to AI-supported guidance.
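To make the four-arm design concrete, below is a minimal Python sketch of the randomization and outcome comparison. Only the arm labels and the reported mean intent gains (12.9 and 13.8 points for the AI arms, 7.5 for expert materials, on a 0-100 scale) come from the abstract; the 1:1:1:1 allocation, the Gaussian noise model, and all names are illustrative assumptions, not the authors' analysis code.

```python
import random
import statistics

# Reported mean gains in stool-test intent (0-100 scale) per arm.
# Labels and point estimates are from the abstract; giving the
# control arm a gain of 0 is a simplifying assumption.
ARMS = {
    "no_message_control": 0.0,
    "expert_materials": 7.5,
    "ai_single_message": 12.9,
    "ai_chatbot": 13.8,
}

def simulate_trial(n_participants: int = 915, seed: int = 0) -> dict:
    """Randomize participants 1:1:1:1 across arms and return the mean
    simulated intent gain per arm (illustrative Gaussian noise only)."""
    rng = random.Random(seed)
    gains = {arm: [] for arm in ARMS}
    for _ in range(n_participants):
        arm = rng.choice(list(ARMS))  # hypothetical equal allocation
        gains[arm].append(ARMS[arm] + rng.gauss(0, 5))
    return {arm: statistics.mean(vals) for arm, vals in gains.items()}

if __name__ == "__main__":
    for arm, mean_gain in simulate_trial().items():
        print(f"{arm:>20s}: {mean_gain:+.1f} points")
```

Running the sketch simply recovers the ordering reported in the abstract (both AI arms above expert materials, which sit above control), including the headline null result that the chatbot arm does not separate from the single AI message.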
Related papers
- Conversations with AI Chatbots Increase Short-Term Vaccine Intentions But Do Not Outperform Standard Public Health Messaging [5.816741004594914]
Large language model (LLM)-based chatbots show promise in persuasive communication. This randomized controlled trial involved 930 vaccine-hesitant parents. Participants were randomly assigned to: (1) a weak control (no message), (2) a strong control reflecting the standard of care (reading official public health materials), or (3 and 4) one of two chatbot conditions.
arXiv Detail & Related papers (2025-04-29T07:59:46Z) - AI persuading AI vs AI persuading Humans: LLMs' Differential Effectiveness in Promoting Pro-Environmental Behavior [70.24245082578167]
Pro-environmental behavior (PEB) is vital to combat climate change, yet turning awareness into intention and action remains elusive. We explore large language models (LLMs) as tools to promote PEB, comparing their impact across 3,200 participants. Results reveal a "synthetic persuasion paradox": synthetic and simulated agents significantly affect their post-intervention PEB stance, while human responses barely shift.
arXiv Detail & Related papers (2025-03-03T21:40:55Z) - Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing [55.2480439325792]
This study systematically evaluates twelve state-of-the-art AI-text detectors using our AI-Polished-Text Evaluation dataset. Our findings reveal that detectors frequently flag even minimally polished text as AI-generated, struggle to differentiate between degrees of AI involvement, and exhibit biases against older and smaller models.
arXiv Detail & Related papers (2025-02-21T18:45:37Z) - Human Bias in the Face of AI: Examining Human Judgment Against Text Labeled as AI Generated [48.70176791365903]
This study explores how bias shapes the perception of AI- versus human-generated content. We investigated how human raters respond to labeled and unlabeled content.
arXiv Detail & Related papers (2024-09-29T04:31:45Z) - X-TURING: Towards an Enhanced and Efficient Turing Test for Long-Term Dialogue Agents [56.64615470513102]
The Turing test examines whether AIs exhibit human-like behaviour in natural language conversations. The traditional setting limits each participant to one message at a time and requires constant human participation. This paper proposes X-Turing, which enhances the original test with a burst dialogue pattern.
arXiv Detail & Related papers (2024-08-19T09:57:28Z) - How Reliable AI Chatbots are for Disease Prediction from Patient Complaints? [0.0]
This study examines the reliability of AI chatbots, specifically GPT 4.0, Claude 3 Opus, and Gemini Ultra 1.0, in predicting diseases from patient complaints in the emergency department.
Results suggest that GPT 4.0 achieves high accuracy with increased few-shot data, while Gemini Ultra 1.0 performs well with fewer examples, and Claude 3 Opus maintains consistent performance.
arXiv Detail & Related papers (2024-05-21T22:00:13Z) - A General-purpose AI Avatar in Healthcare [1.5081825869395544]
This paper focuses on the role of chatbots in healthcare and explores the use of avatars to make AI interactions more appealing to patients.
A framework of a general-purpose AI avatar application is demonstrated by using a three-category prompt dictionary and prompt improvement mechanism.
A two-phase approach is suggested to fine-tune a general-purpose AI language model and create different AI avatars to discuss medical issues with users.
arXiv Detail & Related papers (2024-01-10T03:44:15Z) - Comparing Large Language Model AI and Human-Generated Coaching Messages for Behavioral Weight Loss [5.496825493463708]
Large language model (LLM)-based artificial intelligence (AI) chatbots could offer more personalized and novel messages. 87 adults in a weight-loss trial rated ten coaching messages' helpfulness using a 5-point Likert scale.
arXiv Detail & Related papers (2023-12-07T05:45:24Z) - The impact of responding to patient messages with large language model assistance [4.243020918808522]
Documentation burden is a major contributor to clinician burnout.
Many hospitals are actively integrating such LLM-based systems into electronic medical record systems.
We are the first to examine the utility of large language models in helping clinicians draft responses to patient questions.
arXiv Detail & Related papers (2023-10-26T18:03:46Z) - The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z) - Semi-Supervised Variational Reasoning for Medical Dialogue Generation [70.838542865384]
Two key characteristics are relevant for medical dialogue generation: patient states and physician actions.
We propose an end-to-end variational reasoning approach to medical dialogue generation.
A physician policy network composed of an action-classifier and two reasoning detectors is proposed for augmented reasoning ability.
arXiv Detail & Related papers (2021-05-13T04:14:35Z)