Argumentative Experience: Reducing Confirmation Bias on Controversial Issues through LLM-Generated Multi-Persona Debates
- URL: http://arxiv.org/abs/2412.04629v3
- Date: Thu, 17 Apr 2025 21:33:22 GMT
- Title: Argumentative Experience: Reducing Confirmation Bias on Controversial Issues through LLM-Generated Multi-Persona Debates
- Authors: Li Shi, Houjiang Liu, Yian Wong, Utkarsh Mujumdar, Dan Zhang, Jacek Gwizdka, Matthew Lease,
- Abstract summary: Large language models (LLMs) are enabling designers to give life to exciting new user experiences for information access.<n>Our study exposes participants to multiple viewpoints on controversial issues via a mixed-methods, within-subjects study.<n>Compared to a baseline search system, we see more creative interactions and diverse information-seeking with our multi-persona debate system.
- Score: 7.4355162723392585
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) are enabling designers to give life to exciting new user experiences for information access. In this work, we present a system that generates LLM personas to debate a topic of interest from different perspectives. How might information seekers use and benefit from such a system? Can centering information access around diverse viewpoints help to mitigate thorny challenges like confirmation bias in which information seekers over-trust search results matching existing beliefs? How do potential biases and hallucinations in LLMs play out alongside human users who are also fallible and possibly biased? Our study exposes participants to multiple viewpoints on controversial issues via a mixed-methods, within-subjects study. We use eye-tracking metrics to quantitatively assess cognitive engagement alongside qualitative feedback. Compared to a baseline search system, we see more creative interactions and diverse information-seeking with our multi-persona debate system, which more effectively reduces user confirmation bias and conviction toward their initial beliefs. Overall, our study contributes to the emerging design space of LLM-based information access systems, specifically investigating the potential of simulated personas to promote greater exposure to information diversity, emulate collective intelligence, and mitigate bias in information seeking.
Related papers
- Single LLM Debate, MoLaCE: Mixture of Latent Concept Experts Against Confirmation Bias [24.182306712604966]
Large language models (LLMs) are highly vulnerable to input confirmation bias.<n>MoLaCE is a lightweight inference-time framework that addresses confirmation bias by mixing experts instantiated as different activation strengths.<n>We empirically show that it consistently reduces confirmation bias, improves robustness, and surpasses multi-agent debate.
arXiv Detail & Related papers (2025-12-29T14:52:34Z) - MMPersuade: A Dataset and Evaluation Framework for Multimodal Persuasion [73.99171322670772]
Large Vision-Language Models (LVLMs) are increasingly deployed in domains such as shopping, health, and news.<n> MMPersuade provides a unified framework for systematically studying multimodal persuasion dynamics in LVLMs.
arXiv Detail & Related papers (2025-10-26T17:39:21Z) - The Hunger Game Debate: On the Emergence of Over-Competition in Multi-Agent Systems [90.96738882568224]
This paper investigates the over-competition in multi-agent debate, where agents under extreme pressure exhibit unreliable, harmful behaviors.<n>To study this phenomenon, we propose HATE, a novel experimental framework that simulates debates under a zero-sum competition arena.
arXiv Detail & Related papers (2025-09-30T11:44:47Z) - Enhancing Multi-Agent Debate System Performance via Confidence Expression [55.34012400580016]
Multi-Agent Debate (MAD) systems simulate human debate and thereby improve task performance.<n>Some Large Language Models (LLMs) possess superior knowledge or reasoning capabilities for specific tasks, but struggle to clearly communicate this advantage during debates.<n>Inappropriate confidence expression can cause agents in MAD systems to either stubbornly maintain incorrect beliefs or converge prematurely on suboptimal answers.<n>We develop ConfMAD, a MAD framework that integrates confidence expression throughout the debate process.
arXiv Detail & Related papers (2025-09-17T14:34:27Z) - MV-Debate: Multi-view Agent Debate with Dynamic Reflection Gating for Multimodal Harmful Content Detection in Social Media [26.07883439550861]
MV-Debate is a multi-view agent debate framework with dynamic reflection gating for unified multimodal harmful content detection.<n>MV-Debate assembles four complementary debate agents, a surface analyst, a deep reasoner, a modality contrast, and a social contextualist, to analyze content from diverse interpretive perspectives.
arXiv Detail & Related papers (2025-08-07T16:38:25Z) - Debating for Better Reasoning: An Unsupervised Multimodal Approach [56.74157117060815]
We extend the debate paradigm to a multimodal setting, exploring its potential for weaker models to supervise and enhance the performance of stronger models.<n>We focus on visual question answering (VQA), where two "sighted" expert vision-language models debate an answer, while a "blind" (text-only) judge adjudicates based solely on the quality of the arguments.<n>In our framework, the experts defend only answers aligned with their beliefs, thereby obviating the need for explicit role-playing and concentrating the debate on instances of expert disagreement.
arXiv Detail & Related papers (2025-05-20T17:18:17Z) - Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning [151.4060202671114]
multimodal large language models (MLLMs) have shown unprecedented capabilities in advancing vision-language tasks.
This paper introduces a novel bottom-up reasoning framework to address hallucinations in MLLMs.
Our framework systematically addresses potential issues in both visual and textual inputs by verifying and integrating perception-level information with cognition-level commonsense knowledge.
arXiv Detail & Related papers (2024-12-15T09:10:46Z) - Can Users Detect Biases or Factual Errors in Generated Responses in Conversational Information-Seeking? [13.790574266700006]
We investigate the limitations of response generation in conversational information-seeking systems.
The study addresses the problem of query answerability and the challenge of response incompleteness.
Our analysis reveals that it is easier for users to detect response incompleteness than query answerability.
arXiv Detail & Related papers (2024-10-28T20:55:00Z) - Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs [50.40165119718928]
LongPiBench is a benchmark designed to assess positional bias involving multiple pieces of relevant information.
These experiments reveal that while most current models are robust against the "lost in the middle" issue, there exist significant biases related to the spacing of relevant information pieces.
arXiv Detail & Related papers (2024-10-18T17:41:19Z) - Bias in the Mirror: Are LLMs opinions robust to their own adversarial attacks ? [22.0383367888756]
Large language models (LLMs) inherit biases from their training data and alignment processes, influencing their responses in subtle ways.
We introduce a novel approach where two instances of an LLM engage in self-debate, arguing opposing viewpoints to persuade a neutral version of the model.
We evaluate how firmly biases hold and whether models are susceptible to reinforcing misinformation or shifting to harmful viewpoints.
arXiv Detail & Related papers (2024-10-17T13:06:02Z) - Cognitive Biases in Large Language Models for News Recommendation [68.90354828533535]
This paper explores the potential impact of cognitive biases on large language models (LLMs) based news recommender systems.
We discuss strategies to mitigate these biases through data augmentation, prompt engineering and learning algorithms aspects.
arXiv Detail & Related papers (2024-10-03T18:42:07Z) - Towards Detecting and Mitigating Cognitive Bias in Spoken Conversational Search [14.916529791823868]
This paper draws upon insights from information seeking, psychology, cognitive science, and wearable sensors to provoke novel conversations in the community.
We propose a framework including multimodal instruments and methods for experimental designs and settings.
arXiv Detail & Related papers (2024-05-21T03:50:32Z) - Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied.
We focus on how the evaluation of task-oriented dialogue systems ( TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z) - Cognitive Bias in Decision-Making with LLMs [19.87475562475802]
Large language models (LLMs) offer significant potential as tools to support an expanding range of decision-making tasks.
LLMs have been shown to inherit societal biases against protected groups, as well as be subject to bias functionally resembling cognitive bias.
Our work introduces BiasBuster, a framework designed to uncover, evaluate, and mitigate cognitive bias in LLMs.
arXiv Detail & Related papers (2024-02-25T02:35:56Z) - LEMMA: Towards LVLM-Enhanced Multimodal Misinformation Detection with External Knowledge Augmentation [58.524237916836164]
We propose LEMMA: LVLM-Enhanced Multimodal Misinformation Detection with External Knowledge Augmentation.
Our method improves the accuracy over the top baseline LVLM by 7% and 13% on Twitter and Fakeddit datasets respectively.
arXiv Detail & Related papers (2024-02-19T08:32:27Z) - Generative Echo Chamber? Effects of LLM-Powered Search Systems on
Diverse Information Seeking [49.02867094432589]
Large language models (LLMs) powered conversational search systems have already been used by hundreds of millions of people.
We investigate whether and how LLMs with opinion biases that either reinforce or challenge the user's view change the effect.
arXiv Detail & Related papers (2024-02-08T18:14:33Z) - Fostering User Engagement in the Critical Reflection of Arguments [3.26297440422721]
We propose a system that engages in a deliberative dialogue with a human.
We enable the system to intervene if the user is too focused on their pre-existing opinion.
We report on a user study with 58 participants to test our model and the effect of the intervention mechanism.
arXiv Detail & Related papers (2023-08-17T15:48:23Z) - Perspectives on Large Language Models for Relevance Judgment [56.935731584323996]
Large language models (LLMs) claim that they can assist with relevance judgments.
It is not clear whether automated judgments can reliably be used in evaluations of retrieval systems.
arXiv Detail & Related papers (2023-04-13T13:08:38Z) - Persua: A Visual Interactive System to Enhance the Persuasiveness of
Arguments in Online Discussion [52.49981085431061]
Enhancing people's ability to write persuasive arguments could contribute to the effectiveness and civility in online communication.
We derived four design goals for a tool that helps users improve the persuasiveness of arguments in online discussions.
Persua is an interactive visual system that provides example-based guidance on persuasive strategies to enhance the persuasiveness of arguments.
arXiv Detail & Related papers (2022-04-16T08:07:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.