Jump Operator Planning: Goal-Conditioned Policy Ensembles and Zero-Shot
Transfer
- URL: http://arxiv.org/abs/2007.02527v1
- Date: Mon, 6 Jul 2020 05:13:20 GMT
- Title: Jump Operator Planning: Goal-Conditioned Policy Ensembles and Zero-Shot
Transfer
- Authors: Thomas J. Ringstrom, Mohammadhosein Hasanbeig, Alessandro Abate
- Abstract summary: We propose a novel framework called Jump-Operator Dynamic Programming for quickly computing solutions within a super-exponential space of sequential sub-goal tasks.
This approach involves controlling over an ensemble of reusable goal-conditioned policies functioning as temporally extended actions.
We then identify classes of objective functions on this subspace whose solutions are invariant to the grounding, resulting in optimal zero-shot transfer.
- Score: 71.44215606325005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Hierarchical Control, compositionality, abstraction, and task-transfer are
crucial for designing versatile algorithms which can solve a variety of
problems with maximal representational reuse. We propose a novel hierarchical
and compositional framework called Jump-Operator Dynamic Programming for
quickly computing solutions within a super-exponential space of sequential
sub-goal tasks with ordering constraints, while also providing a fast
linearly-solvable algorithm as an implementation. This approach involves
controlling over an ensemble of reusable goal-conditioned policies functioning
as temporally extended actions, and utilizes transition operators called
feasibility functions, which are used to summarize initial-to-final state
dynamics of the policies. Consequently, the added complexity of grounding a
high-level task space onto a larger ambient state-space can be mitigated by
optimizing in a lower-dimensional subspace defined by the grounding,
substantially improving the scalability of the algorithm while effecting
transferable solutions. We then identify classes of objective functions on this
subspace whose solutions are invariant to the grounding, resulting in optimal
zero-shot transfer.
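To make the mechanism concrete, below is a minimal sketch of the planning pattern the abstract describes: dynamic programming over (state, remaining sub-goals) pairs, where the low-level dynamics enter only through precomputed feasibility summaries. This is a toy, deterministic instance under stated assumptions, not the paper's linearly-solvable algorithm; the names `feasibility`, `cost`, and `plan` are illustrative.
```python
from functools import lru_cache

# Toy low-level state space and sub-goals. Everything here is illustrative;
# the paper's construction (stochastic feasibility operators, a
# linearly-solvable formulation) is richer.
STATES = range(6)
SUBGOALS = ("a", "b", "c")

# Assumed precomputed: feasibility[g][s] = final state reached when the
# goal-conditioned policy for sub-goal g is launched from state s
# (deterministic toy case; in general this is a distribution).
feasibility = {
    "a": {s: 2 for s in STATES},
    "b": {s: 4 for s in STATES},
    "c": {s: 5 for s in STATES},
}

# Assumed jump costs: cost[g][s] = cost of running policy g from state s.
cost = {g: {s: abs(feasibility[g][s] - s) + 1 for s in STATES}
        for g in feasibility}

# Ordering constraint (illustrative): "a" must be completed before "c".
precedes = {("a", "c")}

def admissible(done, g):
    """Sub-goal g may be attempted only once all its predecessors are done."""
    return all(p in done for (p, q) in precedes if q == g)

@lru_cache(maxsize=None)
def plan(s, remaining):
    """DP over (low-level state, remaining sub-goals).

    The low-level dynamics enter only through the feasibility summaries,
    i.e., initial-to-final state jumps, so sequences are never rolled out.
    """
    if not remaining:
        return 0.0, ()
    done = frozenset(SUBGOALS) - frozenset(remaining)
    best = (float("inf"), ())
    for g in remaining:
        if not admissible(done, g):
            continue
        s_next = feasibility[g][s]                 # jump operator
        rest = tuple(x for x in remaining if x != g)
        tail_cost, tail_order = plan(s_next, rest)
        total = cost[g][s] + tail_cost
        if total < best[0]:
            best = (total, (g,) + tail_order)
    return best

total_cost, order = plan(0, SUBGOALS)
print(f"best order: {order}, cost: {total_cost}")   # ('a', 'b', 'c'), 8.0
```
Because the recursion jumps directly from launch states to termination states, the super-exponential set of orderings is explored through subset recursion rather than enumerated sequence by sequence.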
Related papers
- A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints [66.61399765513383]
We develop a BLOCC algorithm to tackle BiLevel Optimization problems with Coupled Constraints.
We demonstrate its effectiveness on two well-known real-world applications.
arXiv Detail & Related papers (2024-06-14T15:59:36Z)
- Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning [16.844525262228103]
In cooperative multi-agent reinforcement learning, a potentially large number of agents jointly optimize a global reward function, which leads to an exponential blow-up of the joint action space in the number of agents.
As a minimal requirement, we assume access to an argmax oracle that efficiently computes the greedy policy for any Q-function in the model class.
We propose efficient algorithms for this setting that lead to polynomial compute and query complexity in all relevant problem parameters.
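As a toy illustration of why such an oracle can be cheap, the sketch below assumes the Q-function decomposes additively across agents (a value-factorization assumption of ours, not necessarily the paper's model class), in which case the greedy joint action is found agent by agent rather than by enumerating all k^n joint actions.
```python
import numpy as np

n_agents, n_actions = 8, 5          # joint action space has 5**8 elements
rng = np.random.default_rng(0)

def local_q_tables(state):
    """Stand-in for learned per-agent utilities q_i(state, a_i).
    (The state argument is ignored in this toy.)"""
    return rng.normal(size=(n_agents, n_actions))

def argmax_oracle(state):
    """Greedy joint action in O(n * k) time instead of O(k**n):
    argmax_a sum_i q_i(s, a_i) splits into per-agent argmaxes."""
    q = local_q_tables(state)
    return tuple(int(a) for a in q.argmax(axis=1))

print(argmax_oracle(state=0))       # one local action index per agent
```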
arXiv Detail & Related papers (2023-02-08T23:42:49Z)
- Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions, which leads to sparse, local approximations of the feasible set.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
- Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new policy gradient algorithm for topological Markov decision processes (TMDPs), obtained as a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multi-objective navigation problem with an arbitrary ordering of objectives, both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z)
- A Globally Convergent Evolutionary Strategy for Stochastic Constrained Optimization with Applications to Reinforcement Learning [0.6445605125467573]
Evolutionary strategies have been shown to achieve competitive levels of performance on complex optimization problems in reinforcement learning.
Convergence guarantees for evolutionary strategies on constrained problems are, however, lacking in the literature.
arXiv Detail & Related papers (2022-02-21T17:04:51Z)
- On Constraints in First-Order Optimization: A View from Non-Smooth Dynamical Systems [99.59934203759754]
We introduce a class of first-order methods for smooth constrained optimization.
A distinctive feature of our approach is that projections or optimizations over the entire feasible set are avoided.
The resulting algorithmic procedure is simple to implement even when constraints are nonlinear.
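A minimal sketch of the flavor of such a method, under our own simplifying assumptions (a single smooth constraint and a closed-form half-space correction; the paper's method is more general): instead of projecting the iterate onto the feasible set, the update corrects the velocity so it does not push further into violation.
```python
import numpy as np

c = np.array([2.0, 1.0])            # unconstrained minimizer of f

def grad_f(x):
    return x - c                    # f(x) = 0.5 * ||x - c||^2

def g(x):
    return x[0] + x[1] - 1.0        # feasible set: x0 + x1 <= 1

def grad_g(x):
    return np.array([1.0, 1.0])

def step(x, lr=0.1, alpha=1.0):
    v = -grad_f(x)                  # desired (unconstrained) velocity
    n = grad_g(x)
    # Velocity condition <n, v> <= -alpha * g(x): near or beyond the boundary
    # the velocity must point back toward feasibility. When violated, project
    # v onto that half-space in closed form; no projection of x is needed.
    slack = np.dot(n, v) + alpha * g(x)
    if slack > 0.0:
        v = v - (slack / np.dot(n, n)) * n
    return x + lr * v

x = np.array([2.0, 2.0])            # infeasible start
for _ in range(300):
    x = step(x)
print(x, g(x))                      # -> approx [1, 0], g approx 0
```
The iterate may start (and briefly travel) outside the feasible set; feasibility is restored asymptotically through the velocity condition rather than enforced by per-step projection.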
arXiv Detail & Related papers (2021-07-17T11:45:13Z)
- MPC-MPNet: Model-Predictive Motion Planning Networks for Fast, Near-Optimal Planning under Kinodynamic Constraints [15.608546987158613]
Kinodynamic Motion Planning (KMP) is the problem of finding a robot motion subject to concurrent kinematic and dynamics constraints.
We present a scalable, imitation learning-based, Model-Predictive Motion Planning Networks framework that finds near-optimal path solutions.
We evaluate our algorithms on a range of cluttered, kinodynamically constrained, and underactuated planning problems, with results indicating significant improvements in computation times, path quality, and success rates over existing methods.
arXiv Detail & Related papers (2021-01-17T23:07:04Z)
- Planning with Submodular Objective Functions [118.0376288522372]
We study planning with submodular objective functions, where instead of maximizing cumulative reward, the goal is to maximize the value induced by a submodular function.
Our framework subsumes standard planning and submodular maximization with cardinality constraints as special cases.
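For the cardinality-constrained special case mentioned above, the classic greedy rule already illustrates the objective type; the coverage function below is a standard monotone submodular example of ours, not the paper's planning setting.
```python
# Coverage: f(S) = |union of sets covered by S| is monotone and submodular,
# so greedy selection enjoys the classic (1 - 1/e) approximation guarantee
# under a cardinality constraint.
universe_covered_by = {
    "s1": {1, 2, 3},
    "s2": {3, 4},
    "s3": {4, 5, 6},
    "s4": {1, 6},
}

def coverage(selected):
    if not selected:
        return 0
    return len(set().union(*(universe_covered_by[s] for s in selected)))

def greedy(budget):
    selected = []
    for _ in range(budget):
        # Pick the element with the largest marginal gain f(S + e) - f(S).
        gains = {s: coverage(selected + [s]) - coverage(selected)
                 for s in universe_covered_by if s not in selected}
        best = max(gains, key=gains.get)
        if gains[best] == 0:
            break
        selected.append(best)
    return selected

chosen = greedy(budget=2)
print(chosen, coverage(chosen))     # ['s1', 's3'] covers 6 elements
```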
arXiv Detail & Related papers (2020-10-22T16:55:12Z)
- Constrained Combinatorial Optimization with Reinforcement Learning [0.30938904602244344]
This paper presents a framework to tackle constrained optimization problems using deep Reinforcement Learning (RL).
We extend the Neural Combinatorial Optimization (NCO) theory in order to deal with constraints in its formulation.
In that context, the solution is iteratively constructed based on interactions with the environment.
arXiv Detail & Related papers (2020-06-22T03:13:07Z)
- A Novel Multi-Agent System for Complex Scheduling Problems [2.294014185517203]
This paper presents the conception and implementation of a multi-agent system that is applicable to various problem domains.
We simulate an NP-hard scheduling problem to demonstrate the validity of our approach.
This paper highlights the advantages of the agent-based approach, such as reduced layout complexity, improved control of complicated systems, and extensibility.
arXiv Detail & Related papers (2020-04-20T14:04:58Z)