Persuasion Propagation in LLM Agents
- URL: http://arxiv.org/abs/2602.00851v1
- Date: Sat, 31 Jan 2026 18:33:14 GMT
- Title: Persuasion Propagation in LLM Agents
- Authors: Hyejun Jeong, Amir Houmansadr, Shlomo Zilberstein, Eugene Bagdasarian
- Abstract summary: We study how belief-level intervention can influence downstream task behavior. Across web research and coding tasks, we find that on-the-fly persuasion induces weak and inconsistent behavioral effects. When the belief state is explicitly specified at task time, belief-prefilled agents conduct on average 26.9% fewer searches and visit 16.9% fewer unique sources than neutral-prefilled agents.
- Score: 23.64887423923855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern AI agents increasingly combine conversational interaction with autonomous task execution, such as coding and web research, raising a natural question: what happens when an agent engaged in long-horizon tasks is subjected to user persuasion? We study how belief-level intervention can influence downstream task behavior, a phenomenon we name *persuasion propagation*. We introduce a behavior-centered evaluation framework that distinguishes between persuasion applied during or prior to task execution. Across web research and coding tasks, we find that on-the-fly persuasion induces weak and inconsistent behavioral effects. In contrast, when the belief state is explicitly specified at task time, belief-prefilled agents conduct on average 26.9% fewer searches and visit 16.9% fewer unique sources than neutral-prefilled agents. These results suggest that persuasion, even in prior interaction, can affect the agent's behavior, motivating behavior-level evaluation in agentic systems.
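The abstract's headline comparison (average searches per task, unique sources visited, and the percent reduction between belief-prefilled and neutral-prefilled agents) can be sketched in a few lines. The log format, field names, and numbers below are illustrative assumptions, not taken from the paper:

```python
# Sketch of the behavior-level metrics described in the abstract.
# The per-task log schema ("searches", "visited_urls") is a hypothetical
# assumption; the paper's actual instrumentation may differ.

from statistics import mean

def behavior_metrics(task_logs):
    """Average search count and average unique-source count over task logs."""
    searches = mean(len(log["searches"]) for log in task_logs)
    sources = mean(len(set(log["visited_urls"])) for log in task_logs)
    return searches, sources

def pct_reduction(treated, control):
    """Percent reduction of `treated` relative to `control`."""
    return 100.0 * (control - treated) / control

# Illustrative logs (not real data):
neutral = [{"searches": ["q1", "q2", "q3", "q4"],
            "visited_urls": ["a", "b", "c", "c"]}]
prefilled = [{"searches": ["q1", "q2", "q3"],
              "visited_urls": ["a", "b"]}]

n_searches, n_sources = behavior_metrics(neutral)    # 4 searches, 3 sources
p_searches, p_sources = behavior_metrics(prefilled)  # 3 searches, 2 sources
print(round(pct_reduction(p_searches, n_searches), 1))  # 25.0
print(round(pct_reduction(p_sources, n_sources), 1))    # 33.3
```

With the paper's reported figures, the same computation would yield the 26.9% (searches) and 16.9% (unique sources) reductions quoted above.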
Related papers
- The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution [63.61358761489141]
Large Language Model (LLM)-based agents are widely used in real-world applications such as customer service, web navigation, and software engineering. We propose a novel framework for **general agentic attribution**, designed to identify the internal factors driving agent actions regardless of the task outcome. We validate our framework across a diverse suite of agentic scenarios, including standard tool use and subtle reliability risks like memory-induced bias.
arXiv Detail & Related papers (2026-01-21T15:22:21Z) - It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents [52.81924177620322]
Web-based agents powered by large language models are increasingly used for tasks such as email management or professional networking. Their reliance on dynamic web content makes them vulnerable to prompt injection attacks: adversarial instructions hidden in interface elements that persuade the agent to divert from its original task. We introduce the Task-Redirecting Agent Persuasion Benchmark (TRAP), an evaluation for studying how persuasion techniques misguide autonomous web agents on realistic tasks.
arXiv Detail & Related papers (2025-12-29T01:09:10Z) - Unlocking the Power of Multi-Agent LLM for Reasoning: From Lazy Agents to Deliberation [42.38513187601995]
Large Language Models (LLMs) trained with reinforcement learning and verifiable rewards have achieved strong results on complex reasoning tasks. Recent work extends this paradigm to a multi-agent setting, where a meta-thinking agent proposes plans and monitors progress while a reasoning agent executes subtasks through sequential conversational turns. Despite promising performance, we identify a critical limitation: lazy agent behavior, in which one agent dominates while the other contributes little, undermining collaboration and collapsing the setup to an ineffective single agent. We propose a verifiable reward mechanism that encourages deliberation by allowing the reasoning agent to discard noisy outputs, consolidate instructions, and restart its reasoning process.
arXiv Detail & Related papers (2025-11-04T06:37:31Z) - Completion $\neq$ Collaboration: Scaling Collaborative Effort with Agents [48.95020665909723]
We argue for a shift from building and assessing task completion agents to developing collaborative agents. We introduce collaborative effort scaling, a framework that captures how an agent's utility grows with increasing user involvement.
arXiv Detail & Related papers (2025-10-29T17:47:18Z) - Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm [57.00627691433355]
We frame agent behavior steering as a model editing task, which we term Behavior Editing. We introduce BehaviorBench, a benchmark grounded in psychological moral theories. We demonstrate that Behavior Editing can be used to promote ethical and benevolent behavior or, conversely, to induce harmful or malicious behavior.
arXiv Detail & Related papers (2025-06-25T16:51:51Z) - The Limits of Predicting Agents from Behaviour [16.80911584745046]
We provide a precise answer under the assumption that the agent's behaviour is guided by a world model. Our contribution is the derivation of novel bounds on the agent's behaviour in new (unseen) deployment environments. We discuss the implications of these results for several research areas including fairness and safety.
arXiv Detail & Related papers (2025-06-03T14:24:58Z) - Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena. We find that we can successfully break the latest agents that use black-box frontier LMs, including those that perform reflection and tree search. We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z) - Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations [58.96953392466609]
We take an in-depth look at the causal awareness of modern representations of agent interactions. We show that recent representations are already partially resilient to perturbations of non-causal agents. We introduce a metric learning approach that regularizes latent representations with causal annotations.
arXiv Detail & Related papers (2023-12-07T18:57:03Z) - Resonating Minds -- Emergent Collaboration Through Hierarchical Active Inference [0.0]
We investigate how efficient, automatic coordination processes at the level of mental states (intentions, goals) can lead to collaborative situated problem-solving.
We present a model of hierarchical active inference for collaborative agents (HAICA)
We show that belief resonance and active inference allow for quick and efficient agent coordination, and thus can serve as a building block for collaborative cognitive agents.
arXiv Detail & Related papers (2021-12-02T13:23:44Z) - Mitigating Negative Side Effects via Environment Shaping [27.400267388362654]
Agents operating in unstructured environments often produce negative side effects (NSE).
We present an algorithm to solve this problem and analyze its theoretical properties.
Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE, without affecting the agent's ability to complete its assigned task.
arXiv Detail & Related papers (2021-02-13T22:15:00Z) - Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents [10.248512149493443]
We conduct a study with 77 crowdsourced workers to understand the role of cognitive biases, specifically anchoring bias, when humans are asked to evaluate the output of conversational agents.
We find that increased consistency in ratings across two experimental conditions may be a result of anchoring bias.
arXiv Detail & Related papers (2020-02-18T23:52:39Z) - Intrinsic Motivation for Encouraging Synergistic Behavior [55.10275467562764]
We study the role of intrinsic motivation as an exploration bias for reinforcement learning in sparse-reward synergistic tasks.
Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.
arXiv Detail & Related papers (2020-02-12T19:34:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.