Related papers: Large Reasoning Models in Agent Scenarios: Exploring the Necessity of Reasoning Capabilities

Large Reasoning Models in Agent Scenarios: Exploring the Necessity of Reasoning Capabilities

URL: http://arxiv.org/abs/2503.11074v1
Date: Fri, 14 Mar 2025 04:34:31 GMT
Title: Large Reasoning Models in Agent Scenarios: Exploring the Necessity of Reasoning Capabilities
Authors: Xueyang Zhou, Guiyao Tie, Guowen Zhang, Weidong Wang, Zhigang Zuo, Di Wu, Duanfeng Chu, Pan Zhou, Lichao Sun, Neil Zhenqiang Gong,
Abstract summary: We propose the LaRMA framework, encompassing nine tasks across Tool Usage, Plan Design, and Problem Solving.<n>Our findings address four research questions: LRMs surpass LLMs in reasoning-intensive tasks like Plan Design, leveraging iterative reflection for superior outcomes.<n>LRMs' enhanced reasoning incurs higher computational costs, prolonged processing, and behavioral challenges, including overthinking and fact-ignoring tendencies.
Score: 74.35956310688164
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The rise of Large Reasoning Models (LRMs) signifies a paradigm shift toward advanced computational reasoning. Yet, this progress disrupts traditional agent frameworks, traditionally anchored by execution-oriented Large Language Models (LLMs). To explore this transformation, we propose the LaRMA framework, encompassing nine tasks across Tool Usage, Plan Design, and Problem Solving, assessed with three top LLMs (e.g., Claude3.5-sonnet) and five leading LRMs (e.g., DeepSeek-R1). Our findings address four research questions: LRMs surpass LLMs in reasoning-intensive tasks like Plan Design, leveraging iterative reflection for superior outcomes; LLMs excel in execution-driven tasks such as Tool Usage, prioritizing efficiency; hybrid LLM-LRM configurations, pairing LLMs as actors with LRMs as reflectors, optimize agent performance by blending execution speed with reasoning depth; and LRMs' enhanced reasoning incurs higher computational costs, prolonged processing, and behavioral challenges, including overthinking and fact-ignoring tendencies. This study fosters deeper inquiry into LRMs' balance of deep thinking and overthinking, laying a critical foundation for future agent design advancements.

Related papers

CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models [14.784841713647682]
CoT-RAG is a novel reasoning framework with three key designs. It features Knowledge Graph-driven CoT Generation, Learnable Knowledge Case-aware RAG, and Pseudo-Program Prompting Execution. Compared with the-state-of-the-art methods, CoT-RAG exhibits a significant accuracy improvement, ranging from 4.0% to 23.0%.
arXiv Detail & Related papers (2025-04-18T07:55:09Z)
Revisiting Prompt Optimization with Large Reasoning Models-A Case Study on Event Extraction [8.88001387249786]
Large Reasoning Models (LRMs) such as DeepSeek-R1 and OpenAI o1 have demonstrated remarkable capabilities in various reasoning tasks. Their strong capability to generate and reason over intermediate thoughts has led to arguments that they may no longer require extensive prompt engineering or optimization to interpret human instructions. In this work, we aim to systematically study this open question, using the structured task of event extraction for a case study.
arXiv Detail & Related papers (2025-04-10T00:53:59Z)
ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation [38.64751082999587]
Large Reasoning Models (LRMs) exhibit remarkable reasoning abilities but rely primarily on parametric knowledge, limiting factual accuracy. We propose ReaRAG, a factuality-enhanced reasoning model that explores diverse queries without excessive iterations. Our study enhances LRMs' factuality while effectively integrating robust reasoning for Retrieval-Augmented Generation (RAG)
arXiv Detail & Related papers (2025-03-27T17:44:18Z)
Can Reasoning Models Reason about Hardware? An Agentic HLS Perspective [18.791753740931185]
OpenAI o3-mini and DeepSeek-R1 use enhanced reasoning through Chain-of-Thought (CoT) This paper investigates whether reasoning LLMs can address challenges in High-Level Synthesis (HLS) design space exploration and optimization.
arXiv Detail & Related papers (2025-03-17T01:21:39Z)
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning [54.787341008881036]
We introduce Reinforced Meta-thinking Agents (ReMA), a novel framework that leverages Multi-Agent Reinforcement Learning (MARL) to elicit meta-thinking behaviors.<n>ReMA decouples the reasoning process into two hierarchical agents: a high-level meta-thinking agent responsible for generating strategic oversight and plans, and a low-level reasoning agent for detailed executions.<n> Experimental results demonstrate that ReMA outperforms single-agent RL baselines on complex reasoning tasks.
arXiv Detail & Related papers (2025-03-12T16:05:31Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding LLMs decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
DRDT: Dynamic Reflection with Divergent Thinking for LLM-based Sequential Recommendation [53.62727171363384]
We introduce a novel reasoning principle: Dynamic Reflection with Divergent Thinking. Our methodology is dynamic reflection, a process that emulates human learning through probing, critiquing, and reflecting. We evaluate our approach on three datasets using six pre-trained LLMs.
arXiv Detail & Related papers (2023-12-18T16:41:22Z)
Let's reward step by step: Step-Level reward model as the Navigators for Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase. We propose a greedy search algorithm that employs the step-level feedback from PRM to optimize the reasoning pathways explored by LLMs. To explore the versatility of our approach, we develop a novel method to automatically generate step-level reward dataset for coding tasks and observed similar improved performance in the code generation tasks.
arXiv Detail & Related papers (2023-10-16T05:21:50Z)
Exploring Self-supervised Logic-enhanced Training for Large Language Models [59.227222647741094]
In this paper, we make the first attempt to investigate the feasibility of incorporating logical knowledge through self-supervised post-training. We devise an auto-regressive objective variant of MERIt and integrate it with two LLM series, i.e., FLAN-T5 and LLaMA, with parameter size ranging from 3 billion to 13 billion. The results on two challenging logical reasoning benchmarks demonstrate the effectiveness of LogicLLM.
arXiv Detail & Related papers (2023-05-23T06:13:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.