RECAP: REwriting Conversations for Intent Understanding in Agentic Planning
- URL: http://arxiv.org/abs/2509.04472v1
- Date: Fri, 29 Aug 2025 20:45:37 GMT
- Title: RECAP: REwriting Conversations for Intent Understanding in Agentic Planning
- Authors: Kushan Mitra, Dan Zhang, Hannah Kim, Estevam Hruschka
- Abstract summary: Real-world dialogues are often ambiguous, underspecified, or dynamic. Traditional classification-based approaches struggle to generalize in open-ended settings. We propose RECAP, a new benchmark designed to evaluate and advance intent rewriting.
- Score: 14.28070179801169
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding user intent is essential for effective planning in conversational assistants, particularly those powered by large language models (LLMs) coordinating multiple agents. However, real-world dialogues are often ambiguous, underspecified, or dynamic, making intent detection a persistent challenge. Traditional classification-based approaches struggle to generalize in open-ended settings, leading to brittle interpretations and poor downstream planning. We propose RECAP (REwriting Conversations for Agent Planning), a new benchmark designed to evaluate and advance intent rewriting, reframing user-agent dialogues into concise representations of user goals. RECAP captures diverse challenges such as ambiguity, intent drift, vagueness, and mixed-goal conversations. Alongside the dataset, we introduce an LLM-based evaluator that assesses planning utility given the rewritten intent. Using RECAP, we develop a prompt-based rewriting approach that outperforms baselines. We further demonstrate that fine-tuning two DPO-based rewriters yields additional utility gains. Our results highlight intent rewriting as a critical and tractable component for improving agent planning in open-domain dialogue systems.
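As an illustration of the prompt-based rewriting approach the abstract describes, the sketch below formats a user-agent dialogue into a rewriting prompt for an LLM. The prompt wording and function name are assumptions for illustration, not RECAP's actual prompts:

```python
def build_rewrite_prompt(turns):
    """Format a user-agent dialogue into a prompt that asks an LLM to
    rewrite it as a concise statement of the user's current goal.
    Illustrative only; RECAP's actual prompts are not reproduced here."""
    transcript = "\n".join(f"{role}: {text}" for role, text in turns)
    return (
        "Rewrite the conversation below as a single concise sentence "
        "describing the user's current goal. Resolve ambiguity and "
        "intent drift, keeping only the latest goal.\n\n"
        f"{transcript}\n\nRewritten intent:"
    )

# a toy conversation exhibiting intent drift (Tokyo -> Osaka)
turns = [
    ("user", "I need a flight to Tokyo."),
    ("agent", "For what dates?"),
    ("user", "Actually, make that Osaka, next Friday."),
]
prompt = build_rewrite_prompt(turns)  # pass this to any chat LLM
```

The rewritten intent returned by the model would then be handed to the downstream planner in place of the raw transcript.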
Related papers
- Exploring Plan Space through Conversation: An Agentic Framework for LLM-Mediated Explanations in Planning [10.679298682391817]
We present a multi-agent Large Language Model architecture that is agnostic to the explanation framework and enables user- and context-dependent interactive explanations.
We also describe an instantiation of this framework for goal-conflict explanations, which we use to conduct a user study comparing the LLM-powered interaction with a baseline template-based explanation interface.
arXiv Detail & Related papers (2026-03-02T16:58:18Z) - ReIn: Conversational Error Recovery with Reasoning Inception [43.5498321001366]
This work focuses on error recovery, which necessitates the accurate diagnosis of erroneous dialogue contexts and execution of proper recovery plans.
We propose Reasoning Inception (ReIn), a test-time intervention method that plants an initial reasoning into the agent's decision-making process.
We evaluate ReIn by systematically simulating conversational failure scenarios that directly hinder successful completion of user goals.
arXiv Detail & Related papers (2026-02-19T02:37:29Z) - ChatSOP: An SOP-Guided MCTS Planning Framework for Controllable LLM Dialogue Agents [52.7201882529976]
We propose an SOP-guided Monte Carlo Tree Search (MCTS) planning framework to enhance the controllability of dialogue agents.
To enable this, we curate a dataset comprising SOP-annotated multi-scenario dialogues, generated using a semi-automated role-playing system with GPT-4o.
We also propose a novel method that integrates Chain-of-Thought reasoning with supervised fine-tuning for SOP prediction.
arXiv Detail & Related papers (2024-07-04T12:23:02Z) - Ask-before-Plan: Proactive Language Agents for Real-World Planning [68.08024918064503]
Proactive Agent Planning requires language agents to predict clarification needs based on user-agent conversation and agent-environment interaction.
We propose a novel multi-agent framework, Clarification-Execution-Planning (CEP), which consists of three agents specialized in clarification, execution, and planning.
arXiv Detail & Related papers (2024-06-18T14:07:28Z) - Unsupervised End-to-End Task-Oriented Dialogue with LLMs: The Power of the Noisy Channel [9.082443585886127]
Training task-oriented dialogue systems typically requires turn-level annotations for interacting with their APIs.
Unlabeled data and a schema definition are sufficient for building a working task-oriented dialogue system, completely unsupervised.
We propose an innovative approach using expectation-maximization (EM) that infers turn-level annotations as latent variables.
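The EM idea, treating turn-level annotations as latent variables, can be shown on a classic toy problem: sessions of coin flips from two hidden coins, where EM recovers both the soft assignments and the coin biases. This is an illustrative analogue, not the paper's actual model:

```python
def em_two_coins(head_counts, n_flips, iters=50):
    """Toy EM: each session of n_flips coin flips comes from one of two
    hidden coins (the latent 'annotation'). The E-step computes soft
    assignments; the M-step re-estimates each coin's head probability."""
    p_a, p_b = 0.6, 0.5  # asymmetric initial guesses to break symmetry
    for _ in range(iters):
        heads_a = flips_a = heads_b = flips_b = 0.0
        for h in head_counts:
            t = n_flips - h
            like_a = (p_a ** h) * ((1 - p_a) ** t)
            like_b = (p_b ** h) * ((1 - p_b) ** t)
            resp_a = like_a / (like_a + like_b)  # E-step: P(coin A | session)
            heads_a += resp_a * h
            flips_a += resp_a * n_flips
            heads_b += (1 - resp_a) * h
            flips_b += (1 - resp_a) * n_flips
        p_a, p_b = heads_a / flips_a, heads_b / flips_b  # M-step
    return p_a, p_b

# sessions of 10 flips drawn from a heads-biased and a tails-biased coin
p_a, p_b = em_two_coins([9, 8, 2, 1, 9], n_flips=10)
```

EM separates the sessions without any session-level labels, which is the same principle the paper applies to inferring turn-level dialogue annotations.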
arXiv Detail & Related papers (2024-04-23T16:51:26Z) - Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents [110.25679611755962]
Current language model-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions.
We introduce Intention-in-Interaction (IN3), a novel benchmark designed to inspect users' implicit intentions through explicit queries.
We empirically train Mistral-Interact, a powerful model that proactively assesses task vagueness, inquires user intentions, and refines them into actionable goals.
arXiv Detail & Related papers (2024-02-14T14:36:30Z) - JoTR: A Joint Transformer and Reinforcement Learning Framework for Dialog Policy Learning [53.83063435640911]
Dialogue policy learning (DPL) is a crucial component of dialogue modelling.
We introduce a novel framework, JoTR, to generate flexible dialogue actions.
Unlike traditional methods, JoTR formulates a word-level policy that allows for a more dynamic and adaptable dialogue action generation.
arXiv Detail & Related papers (2023-09-01T03:19:53Z) - Controllable Mixed-Initiative Dialogue Generation through Prompting [50.03458333265885]
Mixed-initiative dialogue tasks involve repeated exchanges of information and conversational control.
Agents gain control by generating responses that follow particular dialogue intents or strategies, prescribed by a policy planner.
The standard approach has been to fine-tune pre-trained language models to perform generation conditioned on these intents.
We instead prompt large language models as a drop-in replacement for fine-tuning on conditional generation.
arXiv Detail & Related papers (2023-05-06T23:11:25Z) - Target-Guided Dialogue Response Generation Using Commonsense and Data Augmentation [32.764356638437214]
We introduce a new technique for target-guided response generation.
We also propose techniques to re-purpose existing dialogue datasets for target-guided generation.
Our work generally enables dialogue system designers to exercise more control over the conversations that their systems produce.
arXiv Detail & Related papers (2022-05-19T04:01:40Z) - Improved Goal Oriented Dialogue via Utterance Generation and Look Ahead [5.062869359266078]
Intent prediction can be improved by training a deep text-to-text neural model to generate successive user utterances from unlabeled dialogue data.
We present a novel look-ahead approach that uses user utterance generation to improve intent prediction ahead of time.
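The look-ahead idea can be sketched as: generate a plausible next user utterance, append it to the dialogue, and only then classify the intent. The generator and classifier below are toy stand-ins (names and behavior are assumptions) that show the control flow:

```python
def lookahead_intent(dialogue, generate_next, predict_intent):
    """Append a generated next user utterance before classifying,
    so the classifier sees one simulated turn into the future."""
    extended = dialogue + [("user", generate_next(dialogue))]
    return predict_intent(extended)

# toy stand-ins for the text-to-text generator and the intent classifier
gen = lambda d: "Yes, and I also want to cancel my old booking."
clf = lambda d: ("cancel_booking" if any("cancel" in text for _, text in d)
                 else "new_booking")

dialogue = [("user", "I'd like to change my trip.")]
result = lookahead_intent(dialogue, gen, clf)
```

On the ambiguous opening turn alone, the toy classifier returns new_booking; with the simulated next turn appended, the prediction shifts to cancel_booking.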
arXiv Detail & Related papers (2021-10-24T11:12:48Z) - Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue-exclusive features.
To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z) - Personalized Query Rewriting in Conversational AI Agents [7.086654234990377]
We propose a query rewriting approach by leveraging users' historically successful interactions as a form of memory.
We present a neural retrieval model and a pointer-generator network with hierarchical attention and show that they perform significantly better at the query rewriting task with the aforementioned user memories than without.
arXiv Detail & Related papers (2020-11-09T20:45:39Z) - Generalizable and Explainable Dialogue Generation via Explicit Action Learning [33.688270031454095]
Conditioned response generation serves as an effective approach to optimize task completion and language quality.
Latent action learning is introduced to map each utterance to a latent representation.
However, this approach is prone to over-dependence on the training data, which restricts its generalization capability.
Our proposed approach outperforms latent action baselines on MultiWOZ, a benchmark multi-domain dataset.
arXiv Detail & Related papers (2020-10-08T04:37:22Z) - Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
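The multi-task recipe in the last entry, a main response-selection loss plus weighted auxiliary self-supervised losses, reduces to a weighted sum; the specific weights and loss values below are placeholders, not the paper's:

```python
def multitask_loss(main_loss, aux_losses, weights):
    """Combine the response-selection loss with auxiliary self-supervised
    losses (next-session prediction, utterance restoration, incoherence
    detection, consistency discrimination) as a weighted sum."""
    return main_loss + sum(w * l for w, l in zip(weights, aux_losses))

# one main loss, four auxiliary losses, uniform placeholder weights
total = multitask_loss(0.9, [0.4, 0.2, 0.1, 0.3], [0.5, 0.5, 0.5, 0.5])
```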
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.