Learning Space Partitions for Path Planning
- URL: http://arxiv.org/abs/2106.10544v1
- Date: Sat, 19 Jun 2021 18:06:11 GMT
- Title: Learning Space Partitions for Path Planning
- Authors: Kevin Yang, Tianjun Zhang, Chris Cummins, Brandon Cui, Benoit Steiner,
Linnan Wang, Joseph E. Gonzalez, Dan Klein, Yuandong Tian
- Abstract summary: PlaLaM outperforms existing path planning methods in 2D navigation tasks, especially in the presence of difficult-to-escape local optima.
These gains transfer to highly multimodal real-world tasks, where we outperform strong baselines in compiler phase ordering by up to 245% and in molecular design by up to 0.4 on properties measured on a 0-1 scale.
- Score: 54.475949279050596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Path planning, the problem of efficiently discovering high-reward
trajectories, often requires optimizing a high-dimensional and multimodal
reward function. Popular approaches like CEM and CMA-ES greedily focus on
promising regions of the search space and may get trapped in local maxima. DOO
and VOOT balance exploration and exploitation, but use space partitioning
strategies independent of the reward function to be optimized. Recently, LaMCTS
empirically learns to partition the search space in a reward-sensitive manner
for black-box optimization. In this paper, we develop a novel formal regret
analysis for when and why such an adaptive region partitioning scheme works. We
also propose a new path planning method, PlaLaM, which improves function value
estimation within each sub-region and uses a latent representation of
the search space. Empirically, PlaLaM outperforms existing path planning
methods in 2D navigation tasks, especially in the presence of
difficult-to-escape local optima, and shows benefits when plugged into
model-based RL with planning components such as PETS. These gains transfer to
highly multimodal real-world tasks, where we outperform strong baselines in
compiler phase ordering by up to 245% and in molecular design by up to 0.4 on
properties measured on a 0-1 scale.
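For intuition, here is a minimal sketch (not the authors' PlaLaM code) of how a reward-sensitive partition of a search region can be learned from observed samples, in the spirit of LaMCTS-style methods; the helper name split_region, the use of scikit-learn's KMeans and SVC, and the feature construction are illustrative assumptions.
```python
# Illustrative sketch of reward-sensitive space partitioning (LaMCTS-style).
# Not the PlaLaM implementation; names and library choices are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def split_region(xs, rewards):
    """Split sampled points in a region into a 'good' and a 'bad' sub-region.

    xs:      (n, d) array of sampled points in the current region
    rewards: (n,)   array of their observed rewards
    Returns a classifier whose decision boundary defines the split, plus
    boolean masks for the good and bad sub-regions.
    """
    # Cluster on (point, reward) so the split tracks the reward landscape,
    # not just the geometry of the samples.
    feats = np.hstack([xs, rewards[:, None]])
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(feats)

    # Relabel so that cluster 0 is the higher-reward ("good") side.
    if rewards[labels == 0].mean() < rewards[labels == 1].mean():
        labels = 1 - labels

    # Learn a boundary in input space so new candidates can be routed to a
    # sub-region without evaluating the reward function.
    boundary = SVC(kernel="rbf").fit(xs, labels)
    return boundary, labels == 0, labels == 1
```
A tree search would then preferentially sample and recursively split the good sub-region, which is roughly the adaptive-partitioning setting the paper's regret analysis addresses.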
Related papers
- LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning [91.95362946266577]
Path planning is a fundamental scientific problem in robotics and autonomous navigation.
Traditional algorithms like A* and its variants are capable of ensuring path validity but suffer from significant computational and memory inefficiencies as the state space grows.
We propose a new LLM-based route planning method that synergistically combines the precise pathfinding capabilities of A* with the global reasoning capability of LLMs.
This hybrid approach aims to enhance pathfinding efficiency in terms of time and space complexity while maintaining the integrity of path validity, especially in large-scale scenarios.
arXiv Detail & Related papers (2024-06-20T01:24:30Z)
- NNPP: A Learning-Based Heuristic Model for Accelerating Optimal Path Planning on Uneven Terrain [5.337162499594818]
We propose the NNPP model for computing a reduced search region, enabling foundation algorithms like A* to find the optimal path solely within this reduced search space.
The NNPP model learns information about start and goal locations, as well as map representations, from numerous pre-annotated optimal path demonstrations.
It is able to accelerate path planning on novel maps.
arXiv Detail & Related papers (2023-08-09T08:31:05Z)
- Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
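As a rough illustration (not BALLET's actual procedure), region-of-interest filtering of this kind is often realized with Gaussian-process confidence bounds: keep only candidates whose optimistic value still reaches the best pessimistic value. The function name filter_roi and the parameter beta below are hypothetical.
```python
# Hypothetical sketch of confidence-bound ROI filtering; not BALLET's code.
import numpy as np

def filter_roi(mu, sigma, beta=2.0):
    """Keep candidates whose upper confidence bound reaches the best lower
    confidence bound; mu/sigma are a GP posterior mean/std over candidates."""
    ucb = mu + beta * sigma   # optimistic value of each candidate
    lcb = mu - beta * sigma   # pessimistic value of each candidate
    threshold = lcb.max()     # best value we are confident is achievable
    return ucb >= threshold   # boolean mask: high-confidence region of interest
```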
arXiv Detail & Related papers (2023-07-25T09:45:47Z)
- Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration [87.53543137162488]
We propose an easy-to-implement online reinforcement learning (online RL) framework called MEX.
MEX integrates estimation and planning components while balancing exploration and exploitation automatically.
It can outperform baselines by a stable margin in various MuJoCo environments with sparse rewards.
arXiv Detail & Related papers (2023-05-29T17:25:26Z)
- DDPEN: Trajectory Optimisation With Sub Goal Generation Model [70.36888514074022]
In this paper, we present a novel approach, Differential Dynamic Programming with Escape Network (DDPEN).
We propose to utilize a deep model that takes as input a map of the environment in the form of a costmap, together with the desired position.
The model produces possible future directions that lead to the goal while avoiding local minima, and can run in real-time conditions.
arXiv Detail & Related papers (2023-01-18T11:02:06Z)
- Adaptive Discretization using Voronoi Trees for Continuous-Action POMDPs [7.713622698801596]
We propose a new sampling-based online POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT).
ADVT uses Monte Carlo Tree Search in combination with an adaptive discretization of the action space as well as optimistic optimization.
Experiments on simulations of four types of benchmark problems indicate that ADVT outperforms baseline solvers and scales substantially better to high-dimensional continuous action spaces.
arXiv Detail & Related papers (2022-09-13T05:04:49Z)
- Multi-objective Optimization by Learning Space Partitions [34.84370438997276]
We propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focuses on promising regions.
LaMOO substantially outperforms strong baselines on multiple real-world MOO tasks.
arXiv Detail & Related papers (2021-10-07T03:56:19Z)
- Localized active learning of Gaussian process state space models [63.97366815968177]
A globally accurate model is not required to achieve good performance in many common control applications.
We propose an active learning strategy for Gaussian process state space models that aims to obtain an accurate model on a bounded subset of the state-action space.
By employing model predictive control, the proposed technique integrates information collected during exploration and adaptively improves its exploration strategy.
arXiv Detail & Related papers (2020-05-04T05:35:02Z)