A Crowdsourced Study of ChatBot Influence in Value-Driven Decision Making Scenarios
- URL: http://arxiv.org/abs/2511.15857v1
- Date: Wed, 19 Nov 2025 20:25:34 GMT
- Title: A Crowdsourced Study of ChatBot Influence in Value-Driven Decision Making Scenarios
- Authors: Anthony Wise, Xinyi Zhou, Martin Reimann, Anind Dey, Leilani Battle
- Abstract summary: Similar to social media bots that shape public opinion, ChatBots like ChatGPT can persuade users to alter their behavior. We conducted a crowdsourced study, where 336 participants interacted with a neutral or one of two value-framed ChatBots while deciding to alter US defense spending. When the frame misaligned with their values, some participants reinforced their original preference, originally considered rare in the literature.
- Score: 8.230880820861133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Similar to social media bots that shape public opinion, healthcare and financial decisions, LLM-based ChatBots like ChatGPT can persuade users to alter their behavior. Unlike prior work that persuades via overt-partisan bias or misinformation, we test whether framing alone suffices. We conducted a crowdsourced study, where 336 participants interacted with a neutral or one of two value-framed ChatBots while deciding to alter US defense spending. In this single policy domain with controlled content, participants exposed to value-framed ChatBots significantly changed their budget choices relative to the neutral control. When the frame misaligned with their values, some participants reinforced their original preference, revealing a potentially replicable backfire effect, originally considered rare in the literature. These findings suggest that value-framing alone lowers the barrier for manipulative uses of LLMs, revealing risks distinct from overt bias or misinformation, and clarifying risks to countering misinformation.
Related papers
- Perceived Political Bias in LLMs Reduces Persuasive Abilities [0.0]
We test whether credibility attacks reduce LLM-based persuasion.
A short message indicating that the LLM was biased against the respondent's party attenuated persuasion by 28%.
These findings suggest that the persuasive impact of conversational AI is politically contingent, constrained by perceptions of partisan alignment.
arXiv Detail & Related papers (2026-02-20T09:33:16Z)
- Tu crois que c'est vrai ? Diversite des regimes d'enonciation face aux fake news et mecanismes d'autoregulation conversationnelle [0.0]
Two studies were carried out on Twitter and Facebook, combining quantitative analyses of digital traces with online observation and interviews.
The first study mapped users who shared at least one item labeled fake by fact-checkers in the French Twittersphere.
The second used a corpus of items flagged by Facebook users to study reactions to statements whose epistemic status is uncertain.
arXiv Detail & Related papers (2025-11-23T09:28:16Z)
- Evaluating & Reducing Deceptive Dialogue From Language Models with Multi-turn RL [64.3268313484078]
Large Language Models (LLMs) interact with millions of people worldwide in applications such as customer support, education and healthcare.
Their ability to produce deceptive outputs, whether intentionally or inadvertently, poses significant safety concerns.
We investigate the extent to which LLMs engage in deception within dialogue, and propose the belief misalignment metric to quantify deception.
arXiv Detail & Related papers (2025-10-16T05:29:36Z)
- LLM-Based Bot Broadens the Range of Arguments in Online Discussions, Even When Transparently Disclosed as AI [5.393664305233901]
This study examines whether an LLM-based bot can widen the scope of perspectives expressed by participants in online discussions.
We evaluate the impact of a bot that actively monitors discussions, identifies missing arguments, and introduces them into the conversation.
The results indicate that our bot significantly expands the range of arguments, as measured by both objective and subjective metrics.
arXiv Detail & Related papers (2025-06-20T15:24:31Z)
- ChatbotManip: A Dataset to Facilitate Evaluation and Oversight of Manipulative Chatbot Behaviour [11.86454511458083]
Large Language Models (LLMs) can be manipulative when explicitly instructed.
Small fine-tuned open source models, such as BERT+BiLSTM have a performance comparable to zero-shot classification.
Our work highlights the need of addressing manipulation risks as LLMs are increasingly deployed in consumer-facing applications.
arXiv Detail & Related papers (2025-06-11T14:29:43Z)
- Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards [93.16294577018482]
Arena, the most popular benchmark of this type, ranks models by asking users to select the better response between two randomly selected models.
We show that an attacker can alter the leaderboard (to promote their favorite model or demote competitors) at the cost of roughly a thousand votes.
Our attack consists of two steps: first, we show how an attacker can determine which model was used to generate a given reply with more than 95% accuracy; and then, the attacker can use this information to consistently vote against a target model.
arXiv Detail & Related papers (2025-01-13T17:12:38Z)
- MisinfoEval: Generative AI in the Era of "Alternative Facts" [50.069577397751175]
We introduce a framework for generating and evaluating large language model (LLM) based misinformation interventions.
We present (1) an experiment with a simulated social media environment to measure effectiveness of misinformation interventions, and (2) a second experiment with personalized explanations tailored to the demographics and beliefs of users.
Our findings confirm that LLM-based interventions are highly effective at correcting user behavior.
arXiv Detail & Related papers (2024-10-13T18:16:50Z)
- Biased AI can Influence Political Decision-Making [64.9461133083473]
This paper presents two experiments investigating the effects of partisan bias in large language models (LLMs) on political opinions and decision-making.
We found that participants exposed to partisan biased models were significantly more likely to adopt opinions and make decisions which matched the LLM's bias.
arXiv Detail & Related papers (2024-10-08T22:56:00Z)
- Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference [48.99117537559644]
We introduce Arena, an open platform for evaluating Large Language Models (LLMs) based on human preferences.
Our methodology employs a pairwise comparison approach and leverages input from a diverse user base through crowdsourcing.
This paper describes the platform, analyzes the data we have collected so far, and explains the tried-and-true statistical methods we are using.
arXiv Detail & Related papers (2024-03-07T01:22:38Z)
- In Generative AI we Trust: Can Chatbots Effectively Verify Political Information? [39.58317527488534]
This article presents a comparative analysis of the ability of two large language model (LLM)-based chatbots, ChatGPT and Bing Chat, to detect veracity of political information.
We use AI auditing methodology to investigate how chatbots evaluate true, false, and borderline statements on five topics: COVID-19, Russian aggression against Ukraine, the Holocaust, climate change, and LGBTQ+ related debates.
The results show high performance of ChatGPT for the baseline veracity evaluation task, with 72 percent of the cases evaluated correctly on average across languages without pre-training.
arXiv Detail & Related papers (2023-12-20T15:17:03Z)
- Identification of Twitter Bots based on an Explainable ML Framework: the US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z)
- Discovering Chatbot's Self-Disclosure's Impact on User Trust, Affinity, and Recommendation Effectiveness [39.240553429989674]
We designed a social bot with three self-disclosure levels that conducted small talks and provided relevant recommendations to people.
372 MTurk participants were randomized to one of the four groups with different self-disclosure levels to converse with the bot on two topics, movies and COVID-19.
arXiv Detail & Related papers (2021-06-03T08:16:25Z)
- Personalized Chatbot Trustworthiness Ratings [19.537492400265577]
We envision a personalized rating methodology for chatbots that relies on separate rating modules for each issue.
The method is independent of the specific trust issues and is parametric to the aggregation procedure.
arXiv Detail & Related papers (2020-05-13T22:42:45Z)