Comparing Large Language Model AI and Human-Generated Coaching Messages
for Behavioral Weight Loss
- URL: http://arxiv.org/abs/2312.04059v1
- Date: Thu, 7 Dec 2023 05:45:24 GMT
- Authors: Zhuoran Huang, Michael P. Berry, Christina Chwyl, Gary Hsieh, Jing
Wei, Evan M. Forman
- Abstract summary: Large language model (LLM) based artificial intelligence (AI) chatbots could offer more personalized and novel messages.
87 adults in a weight-loss trial rated ten coaching messages' helpfulness using a 5-point Likert scale.
- Score: 5.824523259910306
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated coaching messages for weight control can save time and costs, but
their repetitive, generic nature may limit their effectiveness compared to
human coaching. Large language model (LLM) based artificial intelligence (AI)
chatbots, like ChatGPT, could offer more personalized and novel messages to
address repetition with their data-processing abilities. While LLM AI
demonstrates promise to encourage healthier lifestyles, studies have yet to
examine the feasibility and acceptability of LLM-based behavioral weight loss
(BWL) coaching. In this study, 87 adults
in a weight-loss trial rated ten coaching messages' helpfulness (five
human-written, five ChatGPT-generated) using a 5-point Likert scale, providing
additional open-ended feedback to justify their ratings. Participants also
identified which messages they believed were AI-generated. The evaluation
occurred in two phases: messages in Phase 1 were perceived as impersonal and
negative, prompting revisions for Phase 2 messages. In Phase 1, AI-generated
messages were rated less helpful than human-written ones, with 66%
receiving a helpfulness rating of 3 or higher. However, in Phase 2, the AI
messages matched the human-written ones in helpfulness, with 82% scoring
3 or higher. Additionally, 50% of AI messages were misidentified as human-written,
suggesting AI's sophistication in mimicking human-generated content. A thematic
analysis of open-ended feedback revealed that participants appreciated AI's
empathy and personalized suggestions but found them more formulaic, less
authentic, and too data-focused. This study reveals the preliminary feasibility
and acceptability of LLM AIs, like ChatGPT, in crafting potentially effective
weight control coaching messages. Our findings also underscore areas for future
enhancement.
Related papers
- Human Bias in the Face of AI: The Role of Human Judgement in AI Generated Text Evaluation [48.70176791365903]
This study explores how bias shapes the perception of AI versus human generated content.
We investigated how human raters respond to labeled and unlabeled content.
arXiv Detail & Related papers (2024-09-29T04:31:45Z)
- How Well Can LLMs Echo Us? Evaluating AI Chatbots' Role-Play Ability with ECHO [55.25989137825992]
We introduce ECHO, an evaluative framework inspired by the Turing test.
This framework engages the acquaintances of the target individuals to distinguish between human and machine-generated responses.
We evaluate three role-playing LLMs using ECHO, with GPT-3.5 and GPT-4 serving as foundational models.
arXiv Detail & Related papers (2024-04-22T08:00:51Z)
- On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial [10.770999939834985]
We analyze the effect of AI-driven persuasion in a controlled, harmless setting.
We found that participants who debated GPT-4 with access to their personal information had 81.7% higher odds of increased agreement with their opponents compared to participants who debated humans.
arXiv Detail & Related papers (2024-03-21T13:14:40Z)
- Can ChatGPT Read Who You Are? [10.577227353680994]
We report the results of a comprehensive user study featuring texts written in Czech by a representative population sample of 155 participants.
We compare the personality trait estimations made by ChatGPT against those by human raters and report ChatGPT's competitive performance in inferring personality traits from text.
arXiv Detail & Related papers (2023-12-26T14:43:04Z)
- The effect of source disclosure on evaluation of AI-generated messages: A two-part study [0.0]
We examined the influence of source disclosure on people's evaluation of AI-generated health prevention messages.
We found that source disclosure significantly impacted the evaluation of the messages but did not significantly alter message rankings.
For those with moderate levels of negative attitudes towards AI, source disclosure decreased the preference for AI-generated messages.
arXiv Detail & Related papers (2023-11-27T05:20:47Z)
- PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection [50.66968526809069]
We propose a novel personality detection method, called PsyCoT, which mimics the way individuals complete psychological questionnaires in a multi-turn dialogue manner.
Our experiments demonstrate that PsyCoT significantly improves the performance and robustness of GPT-3.5 in personality detection.
arXiv Detail & Related papers (2023-10-31T08:23:33Z)
- Learning to Prompt in the Classroom to Understand AI Limits: A pilot study [35.06607166918901]
Large Language Models (LLMs) and the chatbots derived from them, like ChatGPT, have greatly improved the natural language processing capabilities of AI systems.
However, this excitement has also given rise to negative sentiments, even as AI methods demonstrate remarkable contributions.
A pilot educational intervention was performed in a high school with 21 students.
arXiv Detail & Related papers (2023-07-04T07:51:37Z)
- Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision [84.31474052176343]
Recent AI-assistant agents, such as ChatGPT, rely on supervised fine-tuning (SFT) with human annotations and reinforcement learning from human feedback to align the output with human intentions.
This dependence can significantly constrain the true potential of AI-assistant agents due to the high cost of obtaining human supervision.
We propose a novel approach called SELF-ALIGN, which combines principle-driven reasoning and the generative power of LLMs for the self-alignment of AI agents with minimal human supervision.
arXiv Detail & Related papers (2023-05-04T17:59:28Z)
- Can ChatGPT Assess Human Personalities? A General Evaluation Framework [70.90142717649785]
Large Language Models (LLMs) have produced impressive results in various areas, but their potential human-like psychology is still largely unexplored.
This paper presents a generic evaluation framework for LLMs to assess human personalities based on Myers-Briggs Type Indicator (MBTI) tests.
arXiv Detail & Related papers (2023-03-01T06:16:14Z)
- Artificial Intelligence for Health Message Generation: Theory, Method, and an Empirical Study Using Prompt Engineering [0.0]
This study introduces and examines the potential of an AI system to generate health awareness messages.
The topic of folic acid, a vitamin that is critical during pregnancy, served as a test case.
We generated messages that could be used to raise awareness and compared them to retweeted human-generated messages.
arXiv Detail & Related papers (2022-12-14T21:13:08Z)
- The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.