Handling Infinite Domain Parameters in Planning Through Best-First Search with Delayed Partial Expansions
- URL: http://arxiv.org/abs/2509.03953v2
- Date: Mon, 22 Sep 2025 11:45:16 GMT
- Title: Handling Infinite Domain Parameters in Planning Through Best-First Search with Delayed Partial Expansions
- Authors: Ángel Aso-Mollar, Diego Aineto, Enrico Scala, Eva Onaindia
- Abstract summary: In automated planning, control parameters extend standard action representations through the introduction of continuous numeric decision variables. Existing state-of-the-art approaches have primarily handled control parameters as embedded constraints alongside other temporal and numeric restrictions. We propose an efficient alternative that explicitly handles control parameters as true decision points within a systematic search scheme.
- Score: 11.009425634308043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In automated planning, control parameters extend standard action representations through the introduction of continuous numeric decision variables. Existing state-of-the-art approaches have primarily handled control parameters as embedded constraints alongside other temporal and numeric restrictions, and thus have implicitly treated them as additional constraints rather than as decision points in the search space. In this paper, we propose an efficient alternative that explicitly handles control parameters as true decision points within a systematic search scheme. We develop a best-first, heuristic search algorithm that operates over infinite decision spaces defined by control parameters and prove a notion of completeness in the limit under certain conditions. Our algorithm leverages the concept of delayed partial expansion, where a state is not fully expanded but instead incrementally expands a subset of its successors. Our results demonstrate that this novel search algorithm is a competitive alternative to existing approaches for solving planning problems involving control parameters.
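The abstract's central mechanism, delayed partial expansion in a best-first search over infinite successor sets, can be illustrated with a minimal sketch. This is an illustrative assumption of how such a scheme might look, not the paper's actual algorithm: function names, the `batch_size` parameter, and the toy refinement scheme are all hypothetical.

```python
# Hedged sketch: best-first search with delayed partial expansion.
# A state whose continuous control parameter induces infinitely many
# successors is never fully expanded; each pop generates only a small
# batch and re-inserts the state so the rest are produced later.
import heapq
import itertools

def delayed_partial_best_first(start, h, successors, is_goal,
                               batch_size=3, max_pops=100_000):
    """`successors(state)` returns a (possibly infinite) iterator of
    child states; `h` is the heuristic used for best-first ordering."""
    counter = itertools.count()  # unique tie-breaker for the heap
    open_list = [(h(start), next(counter), start, successors(start))]
    for _ in range(max_pops):
        if not open_list:
            return None
        f, _, state, succ_iter = heapq.heappop(open_list)
        if is_goal(state):
            return state
        # Partial expansion: take only the next batch of successors.
        batch = list(itertools.islice(succ_iter, batch_size))
        for child in batch:
            heapq.heappush(open_list,
                           (h(child), next(counter), child, successors(child)))
        if batch:  # successors remain: delay the rest of the expansion
            heapq.heappush(open_list, (f, next(counter), state, succ_iter))
    return None

# Toy usage (illustrative): each state x has an infinite successor
# sequence of ever-finer refinements x + 10/2, x + 10/4, x + 10/8, ...
def refine(x):
    step = 10.0
    while True:
        step /= 2.0
        yield x + step

best = delayed_partial_best_first(
    0.0, lambda x: abs(x - 7.5), refine, lambda x: abs(x - 7.5) < 1e-3)
```

In this sketch the heuristic drives the search toward promising refinements while the re-inserted parent guarantees that no successor of any state is postponed forever, which is the intuition behind the paper's completeness-in-the-limit claim.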
Related papers
- Context-Sensitive Abstractions for Reinforcement Learning with Parameterized Actions [22.730282038941382]
This paper extends the scope of reinforcement learning algorithms to long-horizon, sparse-reward settings with parameterized actions. We introduce algorithms that progressively refine these abstractions during learning, increasing fine-grained detail in the critical regions of the state-action space. Across several continuous-state, parameterized-action domains, our abstraction-driven approach enables TD($\lambda$) to achieve markedly higher sample efficiency than state-of-the-art baselines.
arXiv Detail & Related papers (2025-12-23T23:12:53Z) - Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes [59.27926064817273]
We introduce an exploration-agnostic algorithm, called C-PG, which enjoys global last-iterate convergence guarantees under domination assumptions. We empirically validate both the action-based (C-PGAE) and parameter-based (C-PGPE) variants of C-PG on constrained control tasks.
arXiv Detail & Related papers (2025-06-06T10:29:05Z) - Single-loop Algorithms for Stochastic Non-convex Optimization with Weakly-Convex Constraints [49.76332265680669]
This paper examines a crucial subset of problems where both the objective and constraint functions are weakly convex. Existing methods often face limitations, including slow convergence rates or reliance on double-loop designs. We introduce a novel single-loop penalty-based algorithm to overcome these challenges.
arXiv Detail & Related papers (2025-04-21T17:15:48Z) - Online Cluster-Based Parameter Control for Metaheuristic [0.0]
The present work proposes a general-purpose online parameter-tuning method called Cluster-Based Parameter Adaptation (CPA) for population-based metaheuristics. The main idea lies in the identification of promising areas within the parameter search space and in the generation of new parameters around these areas. The obtained results are statistically analyzed and compared with state-of-the-art algorithms, including advanced auto-tuning approaches.
arXiv Detail & Related papers (2025-04-07T14:48:30Z) - Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs [82.34567890576423]
We develop a deterministic policy gradient primal-dual method to find an optimal deterministic policy with non-asymptotic convergence. We prove that the primal-dual iterates of D-PGPD converge at a sub-linear rate to an optimal regularized primal-dual pair. This appears to be the first work that proposes a deterministic policy search method for continuous-space constrained MDPs.
arXiv Detail & Related papers (2024-08-19T14:11:04Z) - Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning [62.81324245896717]
We introduce an exploration-agnostic algorithm, called C-PG, which exhibits global last-iterate convergence guarantees under (weak) gradient domination assumptions.
We numerically validate our algorithms on constrained control problems, and compare them with state-of-the-art baselines.
arXiv Detail & Related papers (2024-07-15T14:54:57Z) - Belief-State Query Policies for User-Aligned POMDPs [18.821166966365315]
We present a novel framework for expressing users' constraints and preferences about agent behavior in a partially observable setting. We present the first formal analysis of such constraints and prove that while the expected cost function of a parameterized BSQ policy w.r.t. its parameters is not convex, it is piecewise constant. This theoretical result leads to novel algorithms that optimize gPOMDP agent behavior with guaranteed user alignment.
arXiv Detail & Related papers (2024-05-24T20:04:51Z) - On the convergence of adaptive first order methods: proximal gradient and alternating minimization algorithms [4.307128674848627]
AdaPG$^{q,r}$ is a framework that unifies and extends existing results by providing larger stepsize policies and improved lower bounds.
Different choices of the parameters $q$ and $r$ are discussed and the efficacy of the resulting methods is demonstrated through numerical simulations.
arXiv Detail & Related papers (2023-11-30T10:29:43Z) - Online Constraint Tightening in Stochastic Model Predictive Control: A Regression Approach [49.056933332667114]
No analytical solutions exist for chance-constrained optimal control problems.
We propose a data-driven approach for learning the constraint-tightening parameters online during control.
Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints.
arXiv Detail & Related papers (2023-10-04T16:22:02Z) - Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits [55.03293214439741]
In contextual bandits, an agent sequentially makes actions from a time-dependent action set based on past experience.
We propose the first online continuous hyperparameter tuning framework for contextual bandits.
We show that it achieves sublinear regret in theory and performs consistently better than all existing methods on both synthetic and real datasets.
arXiv Detail & Related papers (2023-02-18T23:31:20Z) - Stochastic Direct Search Method for Blind Resource Allocation [6.574808513848414]
We study direct search (also known as pattern search) methods for linearly constrained and derivative-free optimization.
We show that direct search methods achieve finite regret in the deterministic and unconstrained case.
We propose a simple extension of direct search that achieves a regret upper bound of the order of $T^{2/3}$.
arXiv Detail & Related papers (2022-10-11T07:40:45Z) - Goal Kernel Planning: Linearly-Solvable Non-Markovian Policies for Logical Tasks with Goal-Conditioned Options [54.40780660868349]
We introduce a compositional framework called Linearly-Solvable Goal Kernel Dynamic Programming (LS-GKDP). LS-GKDP combines the Linearly-Solvable Markov Decision Process (LMDP) formalism with the Options Framework of Reinforcement Learning. We show how an LMDP with a goal kernel enables the efficient optimization of meta-policies in a lower-dimensional subspace defined by the task grounding.
arXiv Detail & Related papers (2020-07-06T05:13:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.