Related papers: Agentic Conversational Search with Contextualized Reasoning via Reinforcement Learning

Agentic Conversational Search with Contextualized Reasoning via Reinforcement Learning

URL: http://arxiv.org/abs/2601.13115v1
Date: Mon, 19 Jan 2026 14:55:54 GMT
Title: Agentic Conversational Search with Contextualized Reasoning via Reinforcement Learning
Authors: Fengran Mo, Yifan Gao, Sha Li, Hansi Zeng, Xin Liu, Zhaoxuan Tan, Xian Li, Jianshu Chen, Dakuo Wang, Meng Jiang,
Abstract summary: We introduce a conversational agent that interleaves search and reasoning across turns, enabling exploratory and adaptive behaviors learned through reinforcement learning (RL) training.<n>The experimental results across four widely used conversational benchmarks demonstrate the effectiveness of our methods.
Score: 66.52010873968383
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have become a popular interface for human-AI interaction, supporting information seeking and task assistance through natural, multi-turn dialogue. To respond to users within multi-turn dialogues, the context-dependent user intent evolves across interactions, requiring contextual interpretation, query reformulation, and dynamic coordination between retrieval and generation. Existing studies usually follow static rewrite, retrieve, and generate pipelines, which optimize different procedures separately and overlook the mixed-initiative action optimization simultaneously. Although the recent developments in deep search agents demonstrate the effectiveness in jointly optimizing retrieval and generation via reasoning, these approaches focus on single-turn scenarios, which might lack the ability to handle multi-turn interactions. We introduce a conversational agent that interleaves search and reasoning across turns, enabling exploratory and adaptive behaviors learned through reinforcement learning (RL) training with tailored rewards towards evolving user goals. The experimental results across four widely used conversational benchmarks demonstrate the effectiveness of our methods by surpassing several existing strong baselines.

Related papers

Towards Context-aware Reasoning-enhanced Generative Searching in E-commerce [61.03081096959132]
We propose a context-aware reasoning-enhanced generative search framework for better textbfunderstanding the complicated context.<n>Our approach achieves superior performance compared with strong baselines, validating its effectiveness for search-based recommendation.
arXiv Detail & Related papers (2025-10-19T16:46:11Z)
UniConv: Unifying Retrieval and Response Generation for Large Language Models in Conversations [71.79210031338464]
We show how to unify dense retrieval and response generation for large language models in conversation.<n>We conduct joint fine-tuning with different objectives and design two mechanisms to reduce the inconsistency risks.<n>The evaluations on five conversational search datasets demonstrate that our unified model can mutually improve both tasks and outperform the existing baselines.
arXiv Detail & Related papers (2025-07-09T17:02:40Z)
Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language Models [8.08979200534563]
Real-world applications demand sophisticated multi-turn interactions.<n>Recent advancements in large language models (LLMs) have revolutionized their ability to handle single-turn tasks.
arXiv Detail & Related papers (2025-04-07T04:00:08Z)
A Survey on Multi-Turn Interaction Capabilities of Large Language Models [47.05742294162551]
Multi-turn interaction in the dialogue system research refers to a system's ability to maintain context across multiple dialogue turns.<n>Recent advancements in large language models (LLMs) have significantly expanded the scope of multi-turn interaction.
arXiv Detail & Related papers (2025-01-17T05:21:49Z)
Enabling Real-Time Conversations with Minimal Training Costs [61.80370154101649]
This paper presents a new duplex decoding approach that enhances large language models with duplex ability, requiring minimal training. Experimental results indicate that our proposed method significantly enhances the naturalness and human-likeness of user-AI interactions with minimal training costs.
arXiv Detail & Related papers (2024-09-18T06:27:26Z)
Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection [57.13665112065285]
Human-Object Interaction (HOI) detection is a challenging computer vision task. We present a framework that enhances HOI detection by incorporating structured text knowledge.
arXiv Detail & Related papers (2023-07-25T14:20:52Z)
Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue [22.103162555263143]
We introduce contrastive learning and multi-task learning to jointly model the problem. Our proposed model achieves state-of-the-art performance on several public datasets.
arXiv Detail & Related papers (2022-03-22T10:13:27Z)
Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination. We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner. Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.