Beyond One-Way Influence: Bidirectional Opinion Dynamics in Multi-Turn Human-LLM Interactions
- URL: http://arxiv.org/abs/2510.20039v1
- Date: Wed, 22 Oct 2025 21:38:10 GMT
- Title: Beyond One-Way Influence: Bidirectional Opinion Dynamics in Multi-Turn Human-LLM Interactions
- Authors: Yuyang Jiang, Longjie Guo, Yuchen Wu, Aylin Caliskan, Tanu Mitra, Hua Shen
- Abstract summary: Large language model (LLM)-powered chatbots are increasingly used for opinion exploration. This study investigates bidirectional opinion dynamics in multi-turn human-LLM conversations, finding that human opinions barely shifted while LLM outputs changed more substantially. Analysis of multi-turn conversations revealed that exchanges involving participants' personal stories were most likely to trigger stance changes for both humans and LLMs.
- Score: 15.551196286270779
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language model (LLM)-powered chatbots are increasingly used for opinion exploration. Prior research has examined how LLMs alter user views, yet little work has extended beyond one-way influence to address how user input can affect LLM responses and how such bidirectional influence manifests throughout multi-turn conversations. This study investigates this dynamic through 50 controversial-topic discussions with participants (N=266) across three conditions: static statements, standard chatbot, and personalized chatbot. Results show that human opinions barely shifted, while LLM outputs changed more substantially, narrowing the gap between human and LLM stance. Personalization amplified these shifts in both directions compared to the standard setting. Analysis of multi-turn conversations further revealed that exchanges involving participants' personal stories were most likely to trigger stance changes for both humans and LLMs. Our work highlights the risk of over-alignment in human-LLM interaction and the need for careful design of personalized chatbots to more thoughtfully and stably align with users.
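To make the abstract's measurements concrete, here is a minimal sketch, not taken from the paper, of how per-side stance shifts and the human-LLM stance gap could be computed. The 1-7 Likert coding, field names, and data are all illustrative assumptions.

```python
import statistics

# Hypothetical stance records: one entry per participant, with human and
# chatbot stances coded on an assumed 1-7 Likert scale before and after
# the dialogue. Field names and values are illustrative, not the paper's.
records = [
    {"human_pre": 2, "human_post": 2, "llm_pre": 6, "llm_post": 4},
    {"human_pre": 5, "human_post": 5, "llm_pre": 2, "llm_post": 3},
    {"human_pre": 3, "human_post": 4, "llm_pre": 7, "llm_post": 5},
]

def mean_abs_shift(records, pre, post):
    """Average absolute stance change between two measurement points."""
    return statistics.mean(abs(r[post] - r[pre]) for r in records)

def mean_gap(records, human, llm):
    """Average absolute human-LLM stance gap at one measurement point."""
    return statistics.mean(abs(r[human] - r[llm]) for r in records)

human_shift = mean_abs_shift(records, "human_pre", "human_post")
llm_shift = mean_abs_shift(records, "llm_pre", "llm_post")
gap_pre = mean_gap(records, "human_pre", "llm_pre")
gap_post = mean_gap(records, "human_post", "llm_post")

print(f"human shift: {human_shift:.2f}, LLM shift: {llm_shift:.2f}")
print(f"stance gap: {gap_pre:.2f} -> {gap_post:.2f}")  # a shrinking gap mirrors the reported convergence
```

On this toy data the LLM shifts more than the humans and the gap narrows, which is the pattern the abstract reports.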
Related papers
- Emulating Aggregate Human Choice Behavior and Biases with GPT Conversational Agents [0.48439699124726004]
Large language models (LLMs) have been shown to reproduce well-known biases. We adapted three well-established decision scenarios into a conversational setting and conducted a human experiment. We found notable differences between models in how well they aligned with human behavior.
arXiv Detail & Related papers (2026-02-05T12:33:05Z)
- DEBATE: A Large-Scale Benchmark for Role-Playing LLM Agents in Multi-Agent, Long-Form Debates [10.609797175227644]
We introduce DEBATE, the first large-scale empirical benchmark for evaluating the authenticity of interactions between multi-agent role-playing LLMs. We systematically evaluate and identify critical discrepancies between simulated and authentic group dynamics.
arXiv Detail & Related papers (2025-10-29T02:21:10Z)
- Mind the Gap: Linguistic Divergence and Adaptation Strategies in Human-LLM Assistant vs. Human-Human Interactions [14.21024646209994]
Large Language Models (LLMs) are increasingly deployed in customer-facing applications. Our study shows significant differences in grammatical fluency, politeness, and lexical diversity in user language between the two settings. To enhance robustness to post-launch communication style changes, we experimented with two strategies.
arXiv Detail & Related papers (2025-10-03T00:45:37Z)
- Arbiters of Ambivalence: Challenges of Using LLMs in No-Consensus Tasks [52.098988739649705]
This study examines the biases and limitations of LLMs in three roles: answer generator, judge, and debater. We develop a "no-consensus" benchmark by curating examples that encompass a variety of a priori ambivalent scenarios. Our results show that while LLMs can provide nuanced assessments when generating open-ended answers, they tend to take a stance on no-consensus topics when employed as judges or debaters.
arXiv Detail & Related papers (2025-05-28T01:31:54Z)
- LLMs syntactically adapt their language use to their conversational partner [58.92470092706263]
It has been frequently observed that human speakers align their language use with each other during conversations. We construct a corpus of conversations between large language models (LLMs) and find that two LLM agents end up making more similar syntactic choices as conversations go on.
arXiv Detail & Related papers (2025-03-10T15:37:07Z)
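As a rough illustration of how such syntactic convergence could be tracked, here is a small sketch (not the paper's actual metric) that measures cosine similarity between two agents' function-word frequency profiles turn by turn; function-word usage is a crude stylometric stand-in for syntactic choices, and the word list and dialogue below are made up.

```python
from collections import Counter
import math

# Small, illustrative set of English function words; treating their usage
# profile as a proxy for syntactic choices is an assumption of this sketch.
FUNCTION_WORDS = {"the", "a", "of", "to", "that", "which", "and", "but",
                  "in", "on", "by", "with", "is", "are", "was", "were"}

def profile(utterance: str) -> Counter:
    """Relative frequency of function words in one utterance."""
    counts = Counter(t for t in utterance.lower().split() if t in FUNCTION_WORDS)
    total = sum(counts.values()) or 1
    return Counter({w: c / total for w, c in counts.items()})

def cosine(p: Counter, q: Counter) -> float:
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(p[w] * q[w] for w in p)
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

# Toy transcript: alternating turns from agents A and B.
turns_a = ["The claim that the policy works is disputed.",
           "The evidence of the effect is weak and contested."]
turns_b = ["I think it works, honestly.",
           "The data of the trial is weak and the sample is small."]

for i, (a, b) in enumerate(zip(turns_a, turns_b), start=1):
    print(f"turn {i}: alignment = {cosine(profile(a), profile(b)):.3f}")
```

On this toy transcript the similarity rises from turn 1 to turn 2, mimicking the convergence effect the abstract describes.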
- Can (A)I Change Your Mind? [0.6990493129893112]
Conducted entirely in Hebrew with 200 participants, the study assessed the persuasive effects of both LLM and human interlocutors on controversial civil policy topics. Results indicated that participants adopted LLM and human perspectives similarly, with significant opinion changes evident across all conditions. These findings demonstrate LLM-based agents' robust persuasive capabilities across diverse sources and settings, highlighting their potential impact on shaping public opinion.
arXiv Detail & Related papers (2025-03-03T18:59:54Z)
- Examining Identity Drift in Conversations of LLM Agents [5.12659586713042]
This study examines identity consistency across nine Large Language Models (LLMs). Experiments involve multi-turn conversations on personal themes, analyzed both qualitatively and quantitatively.
arXiv Detail & Related papers (2024-12-01T13:19:32Z)
- X-TURING: Towards an Enhanced and Efficient Turing Test for Long-Term Dialogue Agents [56.64615470513102]
The Turing test examines whether AIs exhibit human-like behaviour in natural language conversations. The traditional setting limits each participant to one message at a time and requires constant human participation. This paper proposes X-Turing, which enhances the original test with a burst dialogue pattern.
arXiv Detail & Related papers (2024-08-19T09:57:28Z)
- Modulating Language Model Experiences through Frictions [56.17593192325438]
Over-consumption of language model outputs risks propagating unchecked errors in the short term and eroding human capacity for critical thinking in the long term.
We propose selective frictions for language model experiences, inspired by behavioral science interventions, to dampen misuse.
arXiv Detail & Related papers (2024-06-24T16:31:11Z)
- LLM Agents in Interaction: Measuring Personality Consistency and Linguistic Alignment in Interacting Populations of Large Language Models [4.706971067968811]
We create a two-group population of large language model (LLM) agents using a simple variability-inducing sampling algorithm.
We administer personality tests and submit the agents to a collaborative writing task, finding that different profiles exhibit different degrees of personality consistency and linguistic alignment with their conversational partners.
arXiv Detail & Related papers (2024-02-05T11:05:20Z)
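The paper's sampling algorithm is not specified in this summary; as one hypothetical way to induce variability across agent profiles and score their personality consistency, here is a sketch that perturbs Big Five trait vectors with Gaussian noise and measures test-retest drift. Trait names, scales, and noise levels are all assumptions of this sketch.

```python
import random
import statistics

random.seed(0)

TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]

def sample_agent(base: dict, noise: float) -> dict:
    """Perturb a base Big Five profile (assumed 1-5 scale) to induce
    variability across agents."""
    return {t: min(5.0, max(1.0, base[t] + random.gauss(0.0, noise)))
            for t in TRAITS}

def drift(pre: dict, post: dict) -> float:
    """Mean absolute trait change between two test administrations
    (lower = more consistent personality)."""
    return statistics.mean(abs(pre[t] - post[t]) for t in TRAITS)

base = {t: 3.0 for t in TRAITS}
agents = [sample_agent(base, noise=0.8) for _ in range(10)]

# Simulate a hypothetical retest after interaction, with smaller per-trait noise.
retests = [sample_agent(a, noise=0.3) for a in agents]
drifts = [drift(a, r) for a, r in zip(agents, retests)]
print(f"mean personality drift after interaction: {statistics.mean(drifts):.2f}")
```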
- DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback [61.28463542324576]
We present DRESS, a large vision-language model (LVLM) that innovatively exploits natural language feedback (NLF) from large language models.
We propose a novel categorization of NLF into two key types: critique and refinement.
Our experimental results demonstrate that DRESS can generate more helpful (9.76%), honest (11.52%), and harmless (21.03%) responses.
arXiv Detail & Related papers (2023-11-16T18:37:29Z) - BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues [72.65163468440434]
This report provides a preliminary evaluation of existing large language models for human-style multi-turn chatting.
We prompt large language models (LLMs) to generate a full multi-turn dialogue based on the ChatSEED, utterance by utterance.
We find that GPT-4 can generate human-style multi-turn dialogues with impressive quality, significantly outperforming its counterparts.
arXiv Detail & Related papers (2023-10-20T16:53:51Z)
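To illustrate the utterance-by-utterance generation pattern this summary describes, here is a minimal loop sketch; `generate` is a placeholder stand-in for any chat-completion call, and its name and signature are assumptions rather than BotChat's actual API.

```python
# Sketch of growing a full multi-turn dialogue from a seed utterance,
# one utterance at a time, alternating between two simulated speakers.
def generate(history: list[str], speaker: str) -> str:
    """Placeholder for an LLM call: return the next utterance given the
    dialogue so far. A real implementation would prompt a chat model."""
    return f"[{speaker} replies to: {history[-1][:30]}...]"

def extend_dialogue(seed: str, num_turns: int) -> list[str]:
    """Extend a seed utterance (a 'ChatSEED'-style opener) into a
    multi-turn dialogue, utterance by utterance."""
    history = [seed]
    for turn in range(num_turns):
        speaker = "A" if turn % 2 == 0 else "B"
        history.append(generate(history, speaker))
    return history

for line in extend_dialogue("I just moved to the city last week.", 4):
    print(line)
```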
- Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena [76.21004582932268]
We examine the usage and limitations of LLM-as-a-judge, including position, verbosity, and self-enhancement biases.
We then verify the agreement between LLM judges and human preferences by introducing two benchmarks: MT-Bench, a multi-turn question set, and Chatbot Arena, a crowdsourced battle platform.
arXiv Detail & Related papers (2023-06-09T05:55:52Z)
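As a concrete illustration of the two quantities this entry discusses, here is a sketch that computes judge-human agreement and a simple position-bias check (whether verdicts survive swapping the order of the answers). The records are fabricated for illustration and do not come from MT-Bench or Chatbot Arena.

```python
# Hypothetical verdicts: for each question, a human preference plus the LLM
# judge's verdict with the candidate answers shown in both orders.
records = [
    {"human": "A", "judge_ab": "A", "judge_ba": "A"},
    {"human": "B", "judge_ab": "A", "judge_ba": "B"},  # order-sensitive verdict
    {"human": "B", "judge_ab": "B", "judge_ba": "B"},
    {"human": "A", "judge_ab": "A", "judge_ba": "B"},  # order-sensitive verdict
]

# Fraction of questions where the judge (A-first order) matches the human.
agreement = sum(r["human"] == r["judge_ab"] for r in records) / len(records)

# Fraction of verdicts unchanged when the answer order is swapped;
# a low value signals the position bias the abstract mentions.
position_consistent = sum(r["judge_ab"] == r["judge_ba"] for r in records) / len(records)

print(f"judge-human agreement (A-first order): {agreement:.0%}")
print(f"verdicts stable under order swap: {position_consistent:.0%}")
```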