Related papers: Echoes of Agreement: Argument Driven Opinion Shifts in Large Language Models

Echoes of Agreement: Argument Driven Opinion Shifts in Large Language Models

URL: http://arxiv.org/abs/2508.09759v1
Date: Mon, 11 Aug 2025 20:54:14 GMT
Title: Echoes of Agreement: Argument Driven Opinion Shifts in Large Language Models
Authors: Avneet Kaur,
Abstract summary: We conduct experiments for political bias evaluation in presence of supporting and refuting arguments.<n>Our experiments show that such arguments substantially alter model responses towards the direction of the provided argument.<n>These effects point to a sycophantic tendency in LLMs adapting their stance to align with the presented arguments.
Score: 0.36713387874278247
License: http://creativecommons.org/licenses/by/4.0/
Abstract: There have been numerous studies evaluating bias of LLMs towards political topics. However, how positions towards these topics in model outputs are highly sensitive to the prompt. What happens when the prompt itself is suggestive of certain arguments towards those positions remains underexplored. This is crucial for understanding how robust these bias evaluations are and for understanding model behaviour, as these models frequently interact with opinionated text. To that end, we conduct experiments for political bias evaluation in presence of supporting and refuting arguments. Our experiments show that such arguments substantially alter model responses towards the direction of the provided argument in both single-turn and multi-turn settings. Moreover, we find that the strength of these arguments influences the directional agreement rate of model responses. These effects point to a sycophantic tendency in LLMs adapting their stance to align with the presented arguments which has downstream implications for measuring political bias and developing effective mitigation strategies.

Related papers

SAD: A Large-Scale Strategic Argumentative Dialogue Dataset [60.33125467375306]
In practice, argumentation is often realized as multi-turn dialogue.<n>We present the first large-scale textbfStrategic textbfArgumentative textbfDialogue dataset, consisting of 392,822 examples.
arXiv Detail & Related papers (2026-01-12T11:11:37Z)
MMPersuade: A Dataset and Evaluation Framework for Multimodal Persuasion [73.99171322670772]
Large Vision-Language Models (LVLMs) are increasingly deployed in domains such as shopping, health, and news.<n> MMPersuade provides a unified framework for systematically studying multimodal persuasion dynamics in LVLMs.
arXiv Detail & Related papers (2025-10-26T17:39:21Z)
Disentangling Interaction and Bias Effects in Opinion Dynamics of Large Language Models [0.42481744176244507]
Large Language Models are increasingly used to simulate human opinion dynamics.<n>We present a Bayesian framework to disentangle and quantify three such biases.<n>Applying this framework to multi-step dialogues reveals that opinion trajectories tend to quickly converge to a shared attractor.
arXiv Detail & Related papers (2025-09-08T16:26:45Z)
Diverging Preferences: When do Annotators Disagree and do Models Know? [92.24651142187989]
We develop a taxonomy of disagreement sources spanning 10 categories across four high-level classes. We find that the majority of disagreements are in opposition with standard reward modeling approaches. We develop methods for identifying diverging preferences to mitigate their influence on evaluation and training.
arXiv Detail & Related papers (2024-10-18T17:32:22Z)
Bias in the Mirror: Are LLMs opinions robust to their own adversarial attacks ? [22.0383367888756]
Large language models (LLMs) inherit biases from their training data and alignment processes, influencing their responses in subtle ways. We introduce a novel approach where two instances of an LLM engage in self-debate, arguing opposing viewpoints to persuade a neutral version of the model. We evaluate how firmly biases hold and whether models are susceptible to reinforcing misinformation or shifting to harmful viewpoints.
arXiv Detail & Related papers (2024-10-17T13:06:02Z)
Covert Bias: The Severity of Social Views' Unalignment in Language Models Towards Implicit and Explicit Opinion [0.40964539027092917]
We evaluate the severity of bias toward a view by using a biased model in edge cases of excessive bias scenarios. Our findings reveal a discrepancy in LLM performance in identifying implicit and explicit opinions, with a general tendency of bias toward explicit opinions of opposing stances. The direct, incautious responses of the unaligned models suggest a need for further refinement of decisiveness.
arXiv Detail & Related papers (2024-08-15T15:23:00Z)
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models [61.45529177682614]
We challenge the prevailing constrained evaluation paradigm for values and opinions in large language models. We show that models give substantively different answers when not forced. We distill these findings into recommendations and open challenges in evaluating values and opinions in LLMs.
arXiv Detail & Related papers (2024-02-26T18:00:49Z)
A Theory of Response Sampling in LLMs: Part Descriptive and Part Prescriptive [53.08398658452411]
Large Language Models (LLMs) are increasingly utilized in autonomous decision-making.<n>We show that this sampling behavior resembles that of human decision-making.<n>We show that this deviation of a sample from the statistical norm towards a prescriptive component consistently appears in concepts across diverse real-world domains.
arXiv Detail & Related papers (2024-02-16T18:28:43Z)
Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation. We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process. We find that the observed disparate treatment can at least in part be attributed to confounding and mitigating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
The Role of Pragmatic and Discourse Context in Determining Argument Impact [39.70446357000737]
This paper presents a new dataset to initiate the study of this aspect of argumentation. It consists of a diverse collection of arguments covering 741 controversial topics and comprising over 47,000 claims. We propose predictive models that incorporate the pragmatic and discourse context of argumentative claims and show that they outperform models that rely on claim-specific linguistic features for predicting the perceived impact of individual claims within a particular line of argument.
arXiv Detail & Related papers (2020-04-06T23:00:37Z)
What Changed Your Mind: The Roles of Dynamic Topics and Discourse in Argumentation Process [78.4766663287415]
This paper presents a study that automatically analyzes the key factors in argument persuasiveness. We propose a novel neural model that is able to track the changes of latent topics and discourse in argumentative conversations.
arXiv Detail & Related papers (2020-02-10T04:27:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.