Mind the Gap: Linguistic Divergence and Adaptation Strategies in Human-LLM Assistant vs. Human-Human Interactions
- URL: http://arxiv.org/abs/2510.02645v1
- Date: Fri, 03 Oct 2025 00:45:37 GMT
- Title: Mind the Gap: Linguistic Divergence and Adaptation Strategies in Human-LLM Assistant vs. Human-Human Interactions
- Authors: Fulei Zhang, Zhou Yu
- Abstract summary: Large Language Models (LLMs) are increasingly deployed in customer-facing applications. Our study shows significant differences in grammatical fluency, politeness, and lexical diversity in user language between the two settings. To enhance robustness to post-launch communication style changes, we experimented with two strategies.
- Score: 14.21024646209994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As Large Language Models (LLMs) are increasingly deployed in customer-facing applications, a critical yet underexplored question is how users communicate differently with LLM chatbots compared to human agents. In this study, we present empirical evidence that users adopt distinct communication styles when interacting with chatbots versus human agents. Our analysis reveals significant differences in grammatical fluency, politeness, and lexical diversity in user language between the two settings. These findings suggest that models trained exclusively on human-human interaction data may not adequately accommodate the communication style shift that occurs once an LLM chatbot is deployed. To enhance LLM robustness to post-launch communication style changes, we experimented with two strategies: (1) data augmentation during the post-training phase and (2) inference-time user message reformulation. Our results indicate that models trained on stylistically diverse datasets significantly outperform those trained exclusively on original or stylistically uniform datasets, while inference-time reformulation proved less effective. These insights help us better adapt our models for improved LLM-user interaction experiences.
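The abstract's two strategies can be illustrated with a minimal sketch. The paper does not publish its implementation, so everything below is a hypothetical stand-in: a real system would presumably rewrite messages with an LLM, while here two toy string transforms approximate the chatbot-directed style the paper observes (less politeness, reduced fluency and diversity).

```python
# Hypothetical sketch of the two robustness strategies from the abstract.
# All function names and transforms are illustrative assumptions, not the
# paper's actual method.

def drop_politeness(text: str) -> str:
    """Approximate the less-polite style users adopt with chatbots."""
    for marker in ("please ", "thank you", "thanks", "could you ", "would you "):
        text = text.replace(marker, "")
    return text.strip()

def terser(text: str) -> str:
    """Crude proxy for reduced fluency/diversity: keep only the first clause."""
    return text.split(",")[0].strip()

STYLE_TRANSFORMS = [drop_politeness, terser]

def augment(dialogue: list[str]) -> list[list[str]]:
    """Strategy 1 (post-training): expand one human-human dialogue into
    stylistic variants to diversify the training data."""
    return [dialogue] + [[t(turn) for turn in dialogue] for t in STYLE_TRANSFORMS]

def reformulate(user_msg: str) -> str:
    """Strategy 2 (inference time): rewrite a chatbot-style user message
    back toward the human-human style the model was trained on."""
    msg = user_msg.strip()
    if not msg.lower().startswith(("please", "could")):
        msg = "Please " + msg[0].lower() + msg[1:]
    return msg
```

The asymmetry the paper reports falls out naturally from this framing: `augment` changes the training distribution itself, while `reformulate` only patches inputs at inference, which the authors found less effective.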
Related papers
- MT-PingEval: Evaluating Multi-Turn Collaboration with Private Information Games [70.37904949359938]
We evaluate language models in multi-turn interactions using a suite of collaborative games that require effective communication about private information. We find that language models are unable to use interactive collaboration to improve over the non-interactive baseline scenario. We analyze the linguistic features of these dialogues, assessing the roles of sycophancy, information density, and discourse coherence.
arXiv Detail & Related papers (2026-02-27T17:13:20Z) - Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning [52.07170679746533]
Large Language Models (LLMs) are increasingly used to simulate human users in interactive settings such as therapy, education, and social role-play. We introduce a unified framework for evaluating and improving persona consistency in LLM-generated dialogue. We define three automatic metrics (prompt-to-line, line-to-line, and Q&A consistency) that capture different types of persona drift, and validate each against human annotations.
arXiv Detail & Related papers (2025-10-31T19:40:41Z) - Beyond One-Way Influence: Bidirectional Opinion Dynamics in Multi-Turn Human-LLM Interactions [15.551196286270779]
Large language model (LLM)-powered chatbots are increasingly used for opinion exploration. This study of bidirectional opinion dynamics finds that human opinions barely shifted, while LLM outputs changed more substantially. Analysis of multi-turn conversations revealed that exchanges involving participants' personal stories were most likely to trigger stance changes for both humans and LLMs.
arXiv Detail & Related papers (2025-10-22T21:38:10Z) - The Era of Real-World Human Interaction: RL from User Conversations [45.2392745984914]
We introduce Reinforcement Learning from Human Interaction (RLHI), a paradigm that learns directly from in-the-wild user conversations. We develop two complementary methods: (1) RLHI with User-Guided Rewrites, which revises unsatisfactory model outputs based on users' natural-language follow-up responses, and (2) RLHI with User-Based Rewards, which learns via a reward model conditioned on knowledge of the user's long-term interaction history.
arXiv Detail & Related papers (2025-09-29T17:50:31Z) - FLEXI: Benchmarking Full-duplex Human-LLM Speech Interaction [49.83226596963294]
Full-duplex speech interaction enables real-time spoken dialogue systems. Modeling and benchmarking such systems remains a fundamental challenge. We introduce FLEXI, the first benchmark for full-duplex human-LLM spoken interaction.
arXiv Detail & Related papers (2025-09-26T11:57:42Z) - Human vs. Agent in Task-Oriented Conversations [22.743152820695588]
This work presents the first systematic comparison between large language model (LLM)-simulated users and human users in personalized task-oriented conversations. Our analysis reveals significant behavioral differences between the two user types in problem-solving approaches.
arXiv Detail & Related papers (2025-09-22T11:30:39Z) - DiverseDialogue: A Methodology for Designing Chatbots with Human-Like Diversity [5.388338680646657]
We show that dialogues produced by GPT-4o mini, when used to simulate human participants, systematically differ from those between actual humans across multiple linguistic features.
We propose an approach that automatically generates prompts for user simulations by incorporating features derived from real human interactions.
Our method of prompt optimization, tailored to target specific linguistic features, shows significant improvements.
arXiv Detail & Related papers (2024-08-30T21:33:58Z) - PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, a framework for better data construction and model tuning. To address insufficient data usage, we incorporate strategies such as Chain-of-Thought prompting and anti-induction. To address rigid behavior patterns, we design the tuning process and introduce automated DPO to enhance the specificity and dynamism of the models' personalities.
arXiv Detail & Related papers (2024-07-17T08:13:22Z) - Modulating Language Model Experiences through Frictions [56.17593192325438]
Over-consumption of language model outputs risks propagating unchecked errors in the short-term and damaging human capabilities for critical thinking in the long-term.
We propose selective frictions for language model experiences, inspired by behavioral science interventions, to dampen misuse.
arXiv Detail & Related papers (2024-06-24T16:31:11Z) - LLM Agents in Interaction: Measuring Personality Consistency and Linguistic Alignment in Interacting Populations of Large Language Models [4.706971067968811]
We create a two-group population of large language models (LLMs) agents using a simple variability-inducing sampling algorithm.
We administer personality tests and submit the agents to a collaborative writing task, finding that different profiles exhibit different degrees of personality consistency and linguistic alignment to their conversational partners.
arXiv Detail & Related papers (2024-02-05T11:05:20Z) - On the interaction between supervision and self-play in emergent communication [82.290338507106]
We investigate the relationship between two categories of learning signals with the ultimate goal of improving sample efficiency.
We find that first training agents via supervised learning on human data followed by self-play outperforms the converse.
arXiv Detail & Related papers (2020-02-04T02:35:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.