Non-Collaborative User Simulators for Tool Agents
- URL: http://arxiv.org/abs/2509.23124v2
- Date: Mon, 06 Oct 2025 12:23:18 GMT
- Title: Non-Collaborative User Simulators for Tool Agents
- Authors: Jeonghoon Shim, Woojung Song, Cheyon Jin, Seungwon KooK, Yohan Jo
- Abstract summary: We propose a novel user simulator architecture that simulates four categories of non-collaborative behaviors. Our experiments on MultiWOZ and $\tau$-bench reveal significant performance degradation in state-of-the-art tool agents when encountering non-collaborative users.
- Score: 12.294827535425414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tool agents interact with users through multi-turn dialogues to accomplish various tasks. Recent studies have adopted user simulation methods to develop these agents in multi-turn settings. However, existing user simulators tend to be agent-friendly, exhibiting only cooperative behaviors, which fails to train and test agents against non-collaborative users in the real world. To address this, we propose a novel user simulator architecture that simulates four categories of non-collaborative behaviors: requesting unavailable services, digressing into tangential conversations, expressing impatience, and providing incomplete utterances. Our user simulator can simulate challenging and natural non-collaborative behaviors while reliably delivering all intents and information necessary to accomplish the task. Our experiments on MultiWOZ and $\tau$-bench reveal significant performance degradation in state-of-the-art tool agents when encountering non-collaborative users. We provide detailed analyses of agents' weaknesses under each non-collaborative condition, such as escalated hallucinations and dialogue breakdowns. Ultimately, we contribute an easily extensible user simulation framework to help the research community develop tool agents and preemptively diagnose them under challenging real-world conditions within their own services.
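The abstract describes a simulator that injects four categories of non-collaborative behavior while still guaranteeing that every task-critical intent is eventually delivered. A minimal sketch of that control logic is below; all class, method, and parameter names (`NonCollaborativeUser`, `behavior_rate`, etc.) are illustrative assumptions, not taken from the paper's released code.

```python
import random
from enum import Enum, auto

class Behavior(Enum):
    """The four non-collaborative behavior categories from the abstract."""
    UNAVAILABLE_REQUEST = auto()   # request a service the agent cannot provide
    DIGRESSION = auto()            # drift into a tangential conversation
    IMPATIENCE = auto()            # express frustration with the agent
    INCOMPLETE_UTTERANCE = auto()  # omit part of the needed information

class NonCollaborativeUser:
    """Hypothetical wrapper around a cooperative user simulator.

    Each turn it either injects a non-collaborative behavior or delivers
    the next pending task intent, so that all intents and information
    necessary to accomplish the task are reliably communicated."""

    def __init__(self, intents, behavior_rate=0.3, seed=0):
        self.pending_intents = list(intents)  # information still owed to the agent
        self.behavior_rate = behavior_rate
        self.rng = random.Random(seed)

    def next_turn(self):
        # With some probability, act non-collaboratively this turn.
        if self.pending_intents and self.rng.random() < self.behavior_rate:
            return ("non_collaborative", self.rng.choice(list(Behavior)))
        # Otherwise deliver the next pending intent; the loop terminates
        # only after every intent has been communicated.
        if self.pending_intents:
            return ("deliver", self.pending_intents.pop(0))
        return ("done", None)
```

The key design point the abstract emphasizes is the delivery guarantee: non-collaborative turns delay, but never drop, the pending intents.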
Related papers
- UXSim: Towards a Hybrid User Search Simulation [2.50369129460887]
The true dynamism and personalization inherent in human-computer interaction demand a more integrated approach. This work introduces UXSim, a novel framework that integrates both approaches.
arXiv Detail & Related papers (2026-02-27T18:14:34Z) - User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale [5.641245411366927]
We develop a framework for automated task-oriented multi-turn dialogue generation at scale. Our generation pipeline operates as a versatile, plug-and-play module capable of initiating generation from any state. It yields a high-density dataset that reflects the multifaceted demands of real-world human-agent interaction.
arXiv Detail & Related papers (2026-01-13T05:14:09Z) - TongSIM: A General Platform for Simulating Intelligent Machines [59.27575233453533]
Embodied intelligence focuses on training agents within realistic simulated environments. TongSIM is a high-fidelity, general-purpose platform for training and evaluating embodied agents.
arXiv Detail & Related papers (2025-12-23T10:00:43Z) - Agentic Persona Control and Task State Tracking for Realistic User Simulation in Interactive Scenarios [0.0]
We present a novel multi-agent framework for realistic, explainable human user simulation in interactive scenarios. We employ persona control and task state tracking to mirror human cognitive processes during goal-oriented conversations.
arXiv Detail & Related papers (2025-11-30T20:25:56Z) - Agent4FaceForgery: Multi-Agent LLM Framework for Realistic Face Forgery Detection [108.5042835056188]
This work introduces Agent4FaceForgery to address two fundamental problems: how to capture the diverse intents and iterative processes of human forgery creation, and how to model the complex, often adversarial, text-image interactions that accompany forgeries in social media.
arXiv Detail & Related papers (2025-09-16T01:05:01Z) - UserBench: An Interactive Gym Environment for User-Centric Agents [110.77212949007958]
Large Language Model (LLM)-based agents have made impressive progress in reasoning and tool use, but their ability to proactively collaborate with users remains underexplored. We introduce UserBench, a user-centric benchmark designed to evaluate agents in multi-turn, preference-driven interactions.
arXiv Detail & Related papers (2025-07-29T17:34:12Z) - $\tau^2$-Bench: Evaluating Conversational Agents in a Dual-Control Environment [32.345011712015435]
Existing benchmarks for AI agents simulate single-control environments. We introduce $\tau^2$-bench, where both agent and user make use of tools to act in a shared, dynamic environment. In particular, our experiments show significant performance drops when agents shift from no-user to dual-control.
arXiv Detail & Related papers (2025-06-09T17:52:18Z) - LAM SIMULATOR: Advancing Data Generation for Large Action Model Training via Online Exploration and Trajectory Feedback [121.78866929908871]
Large Action Models (LAMs) for AI Agents offer incredible potential but face challenges due to the need for high-quality training data. We present LAM SIMULATOR, a comprehensive framework designed for online exploration of agentic tasks with high-quality feedback. Our framework features a dynamic task query generator, an extensive collection of tools, and an interactive environment where Large Language Model (LLM) Agents can call tools and receive real-time feedback.
arXiv Detail & Related papers (2025-06-02T22:36:02Z) - USimAgent: Large Language Models for Simulating Search Users [33.17004578463697]
We introduce a Large Language Models-based user search behavior simulator, USimAgent.
The simulator can simulate users' querying, clicking, and stopping behaviors during search.
Empirical investigation on a real user behavior dataset shows that the simulator outperforms existing methods in query generation.
arXiv Detail & Related papers (2024-03-14T07:40:54Z) - Learning to Use Tools via Cooperative and Interactive Agents [58.77710337157665]
Tool learning empowers large language models (LLMs) as agents to use external tools and extend their utility.
We propose ConAgents, a Cooperative and interactive Agents framework, which coordinates three specialized agents for tool selection, tool execution, and action calibration separately.
Our experiments on three datasets show that the LLMs, when equipped with ConAgents, outperform baselines with substantial improvement.
arXiv Detail & Related papers (2024-03-05T15:08:16Z) - AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems [112.76941157194544]
We propose AgentCF for simulating user-item interactions in recommender systems through agent-based collaborative filtering.
We creatively consider not only users but also items as agents, and develop a collaborative learning approach that optimizes both kinds of agents together.
Overall, the optimized agents exhibit diverse interaction behaviors within our framework, including user-item, user-user, item-item, and collective interactions.
arXiv Detail & Related papers (2023-10-13T16:37:14Z) - User Behavior Simulation with Large Language Model based Agents [116.74368915420065]
We propose an LLM-based agent framework and design a sandbox environment to simulate real user behaviors.
Based on extensive experiments, we find that the simulated behaviors of our method closely match those of real humans.
arXiv Detail & Related papers (2023-06-05T02:58:35Z) - MUG: Interactive Multimodal Grounding on User Interfaces [12.035123646959669]
We present MUG, a novel interactive task for multimodal grounding where a user and an agent work collaboratively on an interface screen.
Prior works modeled multimodal UI grounding in one round: the user gives a command and the agent responds to the command. MUG allows multiple rounds of interactions such that upon seeing the agent responses, the user can give further commands for the agent to refine or even correct its actions.
arXiv Detail & Related papers (2022-09-29T21:08:18Z) - Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition [64.06167416127386]
We propose Multi-Agent Dialog Policy Learning, which regards both the system and the user as dialog agents.
The two agents interact with each other and are learned jointly.
Results show that our method can successfully build a system policy and a user policy simultaneously.
arXiv Detail & Related papers (2020-04-08T04:51:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.