DynaAct: Large Language Model Reasoning with Dynamic Action Spaces
- URL: http://arxiv.org/abs/2511.08043v1
- Date: Wed, 12 Nov 2025 01:36:27 GMT
- Title: DynaAct: Large Language Model Reasoning with Dynamic Action Spaces
- Authors: Xueliang Zhao, Wei Wu, Jian Guan, Qintong Li, Lingpeng Kong,
- Abstract summary: We propose a novel framework named DynaAct for automatically constructing a compact action space. Our approach significantly improves overall performance, while maintaining efficient inference without introducing substantial latency.
- Score: 58.298135359318024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern sequential decision-making systems, the construction of an optimal candidate action space is critical to efficient inference. However, existing approaches either rely on manually defined action spaces that lack scalability or utilize unstructured spaces that render exhaustive search computationally prohibitive. In this paper, we propose a novel framework named \textsc{DynaAct} for automatically constructing a compact action space to enhance sequential reasoning in complex problem-solving scenarios. Our method first estimates a proxy for the complete action space by extracting general sketches observed in a corpus covering diverse complex reasoning problems using large language models. We then formulate a submodular function that jointly evaluates candidate actions based on their utility to the current state and their diversity, and employ a greedy algorithm to select an optimal candidate set. Extensive experiments on six diverse standard benchmarks demonstrate that our approach significantly improves overall performance, while maintaining efficient inference without introducing substantial latency. The implementation is available at https://github.com/zhaoxlpku/DynaAct.
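The abstract describes selecting a compact action set by greedily maximizing a submodular objective that balances per-action utility against diversity. A minimal sketch of such a greedy step, using an MMR-style redundancy penalty and a token-overlap similarity as stand-ins (the candidate actions, scoring functions, and trade-off weight here are illustrative assumptions, not the paper's actual objective):

```python
def jaccard(a, b):
    """Token-overlap similarity between two action sketches (a simple stand-in)."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def greedy_select(candidates, utility, k, lam=0.5, sim=jaccard):
    """Greedily build a compact action set balancing utility and diversity.

    Each step adds the candidate with the largest marginal gain: its utility
    to the current state minus a redundancy penalty against the actions
    already selected (MMR-style selection).
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def gain(a):
            redundancy = max((sim(a, s) for s in selected), default=0.0)
            return utility[a] - lam * redundancy
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a higher `lam`, near-duplicate actions are passed over in favor of dissimilar ones, which is the diversity effect the abstract describes.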
Related papers
- Neural Nonmyopic Bayesian Optimization in Dynamic Cost Settings [73.44599934855067]
LookaHES is a nonmyopic BO framework designed for dynamic, history-dependent cost environments. LookaHES combines a multi-step variant of $H$-Entropy Search with pathwise sampling and neural policy optimization. Our innovation is the integration of neural policies, including large language models, to effectively navigate structured, domain-specific action spaces.
arXiv Detail & Related papers (2026-01-10T09:49:45Z) - IG-Pruning: Input-Guided Block Pruning for Large Language Models [34.984986323797976]
We propose IG-Pruning, a novel input-aware block-wise pruning method that dynamically selects layer masks at inference time. Experimental results demonstrate that our method consistently outperforms state-of-the-art static depth pruning methods.
arXiv Detail & Related papers (2025-11-04T03:05:54Z) - Search-Based Robot Motion Planning With Distance-Based Adaptive Motion Primitives [1.8874301050354767]
This work proposes a motion planning algorithm for robotic manipulators that combines sampling-based and search-based planning methods. The core contribution of the proposed approach is the usage of burs of free configuration space (C-space) as adaptive motion primitives. Results demonstrate that the bur-based approach outperforms fixed-primitive planning in complex scenarios.
arXiv Detail & Related papers (2025-07-01T21:33:33Z) - ORPP: Self-Optimizing Role-playing Prompts to Enhance Language Model Capabilities [64.24517317344959]
High-quality prompts are crucial for eliciting outstanding performance from large language models on complex tasks. We propose ORPP, a framework that enhances model performance by optimizing and generating role-playing prompts. We show that ORPP not only matches but in most cases surpasses existing mainstream prompt optimization methods in terms of performance.
arXiv Detail & Related papers (2025-06-03T05:51:35Z) - Stop Relying on No-Choice and Do not Repeat the Moves: Optimal, Efficient and Practical Algorithms for Assortment Optimization [38.57171985309975]
We develop efficient algorithms for the problem of regret minimization in assortment selection with Plackett-Luce (PL) based user choices.
Our methods are practical, provably optimal, and devoid of the aforementioned limitations of the existing methods.
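Under the Plackett-Luce (equivalently, multinomial logit) choice model assumed above, each item carries a positive preference weight and the probability of picking an item from an offered assortment is its weight normalized over the assortment. A toy sketch of these choice probabilities (the weights are hypothetical; the paper's algorithms for learning them are not reproduced here):

```python
def pl_choice_probs(weights, assortment):
    """Choice probabilities under the Plackett-Luce / multinomial-logit model.

    Each item i carries a positive preference weight w_i; the probability of
    choosing i from an offered assortment S is w_i / sum of w_j over j in S.
    """
    total = sum(weights[i] for i in assortment)
    return {i: weights[i] / total for i in assortment}
```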
arXiv Detail & Related papers (2024-02-29T07:17:04Z) - Large Language Models to Enhance Bayesian Optimization [57.474613739645605]
We present LLAMBO, a novel approach that integrates the capabilities of Large Language Models (LLM) within Bayesian optimization.
At a high level, we frame the BO problem in natural language, enabling LLMs to iteratively propose and evaluate promising solutions conditioned on historical evaluations.
Our findings illustrate that LLAMBO is effective at zero-shot warmstarting, and enhances surrogate modeling and candidate sampling, especially in the early stages of search when observations are sparse.
arXiv Detail & Related papers (2024-02-06T11:44:06Z) - AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z) - Bayesian Optimization over High-Dimensional Combinatorial Spaces via Dictionary-based Embeddings [36.60636056219264]
We consider the problem of optimizing black-box functions over high-dimensional spaces in science, engineering, and ML applications.
Key idea is to select a number of discrete structures from the input space and use them to define an ordinal embedding for high-dimensional structures.
We develop a principled approach based on binary wavelets to construct dictionaries for binary spaces, and propose a randomized construction method that generalizes to categorical spaces.
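The key idea above, embedding a high-dimensional discrete structure via a small dictionary of selected structures, can be illustrated in a toy form: represent each binary input by its vector of distances to the dictionary atoms. This sketch assumes binary vectors and Hamming distance and does not reproduce the paper's wavelet-based dictionary construction:

```python
def hamming(x, y):
    """Hamming distance between two equal-length binary vectors."""
    return sum(a != b for a, b in zip(x, y))

def dictionary_embedding(x, dictionary):
    """Embed a binary structure as its vector of distances to dictionary atoms.

    A toy stand-in for the dictionary-based embedding described above: each
    coordinate measures proximity to one chosen atom, giving a low-dimensional
    representation of the high-dimensional input space.
    """
    return [hamming(x, atom) for atom in dictionary]
```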
arXiv Detail & Related papers (2023-03-03T08:31:42Z) - Contextual Bandits with Large Action Spaces: Made Practical [48.28690486203131]
We present the first efficient, general-purpose algorithm for contextual bandits with continuous, linearly structured action spaces.
Our algorithm makes use of computational oracles for (i) supervised learning and (ii) optimization over the action space, and achieves sample complexity, runtime, and memory independent of the size of the action space.
arXiv Detail & Related papers (2022-07-12T21:01:48Z) - Learning Salient Boundary Feature for Anchor-free Temporal Action Localization [81.55295042558409]
Temporal action localization is an important yet challenging task in video understanding.
We propose the first purely anchor-free temporal localization method.
Our model includes (i) an end-to-end trainable basic predictor, (ii) a saliency-based refinement module, and (iii) several consistency constraints.
arXiv Detail & Related papers (2021-03-24T12:28:32Z) - Goal Kernel Planning: Linearly-Solvable Non-Markovian Policies for Logical Tasks with Goal-Conditioned Options [54.40780660868349]
We introduce a compositional framework called Linearly-Solvable Goal Kernel Dynamic Programming (LS-GKDP). LS-GKDP combines the Linearly-Solvable Markov Decision Process (LMDP) formalism with the Options Framework of Reinforcement Learning. We show how an LMDP with a goal kernel enables the efficient optimization of meta-policies in a lower-dimensional subspace defined by the task grounding.
arXiv Detail & Related papers (2020-07-06T05:13:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.