Hierarchical Width-Based Planning and Learning
- URL: http://arxiv.org/abs/2101.06177v2
- Date: Tue, 23 Mar 2021 15:42:37 GMT
- Title: Hierarchical Width-Based Planning and Learning
- Authors: Miquel Junyent, Vicen\c{c} G\'omez, Anders Jonsson
- Abstract summary: Width-based search methods have demonstrated state-of-the-art performance in a wide range of testbeds.
We present a hierarchical algorithm that plans at two levels of abstraction.
We show how in combination with a learned policy and a learned value function, the proposed hierarchical IW can outperform current flat IW-based planners in Atari games with sparse rewards.
- Score: 8.776765645845012
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Width-based search methods have demonstrated state-of-the-art performance in
a wide range of testbeds, from classical planning problems to image-based
simulators such as Atari games. These methods scale independently of the size
of the state-space, but exponentially in the problem width. In practice,
running the algorithm with a width larger than 1 is computationally
intractable, prohibiting IW from solving higher width problems. In this paper,
we present a hierarchical algorithm that plans at two levels of abstraction. A
high-level planner uses abstract features that are incrementally discovered
from low-level pruning decisions. We illustrate this algorithm in classical
planning PDDL domains as well as in pixel-based simulator domains. In classical
planning, we show how IW(1) at two levels of abstraction can solve problems of
width 2. For pixel-based domains, we show how in combination with a learned
policy and a learned value function, the proposed hierarchical IW can
outperform current flat IW-based planners in Atari games with sparse rewards.
Related papers
- N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields [112.02885337510716]
Nested Neural Feature Fields (N2F2) is a novel approach that employs hierarchical supervision to learn a single feature field.
We leverage a 2D class-agnostic segmentation model to provide semantically meaningful pixel groupings at arbitrary scales in the image space.
Our approach outperforms the state-of-the-art feature field distillation methods on tasks such as open-vocabulary 3D segmentation and localization.
arXiv Detail & Related papers (2024-03-16T18:50:44Z) - Look at the Neighbor: Distortion-aware Unsupervised Domain Adaptation
for Panoramic Semantic Segmentation [5.352137021024213]
The aim is to tackle the domain gaps caused by the style disparities and distortion problem from the non-uniformly distributed pixels of equirectangular projection (ERP)
We propose a novel UDA framework that can effectively address the distortion problems for panoramic semantic segmentation.
arXiv Detail & Related papers (2023-08-10T10:47:12Z) - Representation Learning for General-sum Low-rank Markov Games [63.119870889883224]
We study multi-agent general-sum Markov games with nonlinear function approximation.
We focus on low-rank Markov games whose transition matrix admits a hidden low-rank structure on top of an unknown non-linear representation.
arXiv Detail & Related papers (2022-10-30T22:58:22Z) - Pruning-as-Search: Efficient Neural Architecture Search via Channel
Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method to search out desired sub-network automatically and efficiently.
Our proposed architecture outperforms prior arts by around $1.0%$ top-1 accuracy on ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z) - Multi-level Second-order Few-shot Learning [111.0648869396828]
We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition.
We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction.
We demonstrate respectable results on standard datasets such as Omniglot, mini-ImageNet, tiered-ImageNet, Open MIC, fine-grained datasets such as CUB Birds, Stanford Dogs and Cars, and action recognition datasets such as HMDB51, UCF101, and mini-MIT.
arXiv Detail & Related papers (2022-01-15T19:49:00Z) - Classical Planning in Deep Latent Space [33.06766829037679]
Latplan is an unsupervised architecture combining deep learning and classical planning.
Latplan finds a plan to the goal state in a symbolic latent space and returns a visualized plan execution.
arXiv Detail & Related papers (2021-06-30T21:31:21Z) - Width-based Lookaheads with Learnt Base Policies and Heuristics Over the
Atari-2600 Benchmark [4.559353193715442]
We show that RIW$_C$+CPV outperforms $pi$-IW, $pi$-IW(1)+ and $pi$-HIW(n, 1).
We also present a taxonomy of the set of Atari- 2600 games according to some of their defining characteristics.
arXiv Detail & Related papers (2021-06-23T04:27:55Z) - Planning for Novelty: Width-Based Algorithms for Common Problems in
Control, Planning and Reinforcement Learning [6.053629733936546]
Width-based algorithms search for solutions through a general definition of state novelty.
These algorithms have been shown to result in state-of-the-art performance in classical planning.
arXiv Detail & Related papers (2021-06-09T07:46:19Z) - Waypoint Planning Networks [66.72790309889432]
We propose a hybrid algorithm based on LSTMs with a local kernel - a classic algorithm such as A*, and a global kernel using a learned algorithm.
We compare WPN against A*, as well as related works including motion planning networks (MPNet) and value networks (VIN)
It is shown that WPN's search space is considerably less than A*, while being able to generate near optimal results.
arXiv Detail & Related papers (2021-05-01T18:02:01Z) - General Policies, Serializations, and Planning Width [22.112881443209726]
We show that bounded width is a property of planning domains that admit optimal general policies in terms of features that are explicitly or implicitly represented in the domain encoding.
The study leads also to a new simple, meaningful, and expressive language for specifying domain serializations in the form of policy sketches.
arXiv Detail & Related papers (2020-12-15T01:33:59Z) - MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical
Models [96.1052289276254]
This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle.
Surprisingly, by making a small change to the low-performing solver, we derive the new solver MPLP++ that significantly outperforms all existing solvers by a large margin.
arXiv Detail & Related papers (2020-04-16T16:20:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.