Coarse-to-Fine Q-attention with Learned Path Ranking
- URL: http://arxiv.org/abs/2204.01571v1
- Date: Mon, 4 Apr 2022 15:23:14 GMT
- Title: Coarse-to-Fine Q-attention with Learned Path Ranking
- Authors: Stephen James and Pieter Abbeel
- Abstract summary: We propose Learned Path Ranking (LPR), a method that accepts an end-effector goal pose, and learns to rank a set of goal-reaching paths.
In addition to benchmarking our approach across 16 RLBench tasks, we also learn real-world tasks, tabula rasa, in 10-15 minutes, with only 3 demonstrations.
- Score: 95.00518278458908
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose Learned Path Ranking (LPR), a method that accepts an end-effector
goal pose and learns to rank a set of goal-reaching paths generated from an
array of path-generating methods, including path planning, Bézier curve
sampling, and a learned policy. The core idea is that each path-generation
module will be useful in different tasks, or at different stages of a task.
When LPR is added as an extension to C2F-ARM, our new system, C2F-ARM+LPR,
retains the sample efficiency of its predecessor while also being able to
accomplish a larger set of tasks; in particular, tasks that require very
specific motions (e.g., opening a toilet seat) that need to be inferred from
both demonstrations and exploration data. In addition to benchmarking our
approach across 16 RLBench tasks, we also learn real-world tasks, tabula rasa,
in 10-15 minutes, with only 3 demonstrations.
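To make the generate-then-rank idea concrete, here is a minimal sketch of the loop described above. All names (`plan_path`, `sample_bezier_paths`, `score_paths`) are illustrative assumptions, and the scorer is only a placeholder for the learned ranking network trained from demonstrations and exploration data:

```python
import numpy as np

# Hypothetical path generators. Each returns candidate paths as (T, D)
# arrays of end-effector positions (D = 3 here for simplicity). The paper
# combines path planning, Bezier curve sampling, and a learned policy;
# these stubs only mimic that interface.
def plan_path(start, goal, waypoints=10):
    """Straight-line interpolation standing in for a motion planner."""
    return np.linspace(start, goal, waypoints)

def sample_bezier_paths(start, goal, n=8, waypoints=10):
    """Sample quadratic Bezier curves through random control points."""
    t = np.linspace(0.0, 1.0, waypoints)[:, None]
    paths = []
    for _ in range(n):
        ctrl = (start + goal) / 2 + np.random.randn(*start.shape)
        paths.append((1 - t) ** 2 * start + 2 * (1 - t) * t * ctrl + t ** 2 * goal)
    return paths

def score_paths(paths):
    """Placeholder for the learned ranker; here it simply prefers
    short paths (negative total path length)."""
    return [-np.sum(np.linalg.norm(np.diff(p, axis=0), axis=1)) for p in paths]

def rank_and_select(start, goal):
    candidates = [plan_path(start, goal)] + sample_bezier_paths(start, goal)
    return candidates[int(np.argmax(score_paths(candidates)))]

best_path = rank_and_select(np.zeros(3), np.ones(3))
```

In the actual system the top-ranked path would then be executed by the arm, and the ranker is trained alongside C2F-ARM's Q-attention rather than hand-coded as above.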
Related papers
- Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
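The summary above does not spell out ARC's mechanism, so the following is only a generic, hypothetical illustration of correcting a classification layer's recency bias by damping the logits of the newest task's classes at inference; it is not ARC's actual procedure:

```python
import torch

def correct_recency_bias(logits, recent_class_ids, temperature=1.5):
    """Damp the logits of classes from the most recent task, which
    continual learners tend to over-predict. Hypothetical illustration
    only; this is not ARC's actual procedure."""
    adjusted = logits.clone()
    adjusted[:, recent_class_ids] /= temperature
    return adjusted.softmax(dim=-1)

# e.g. classes 90-99 were learned in the latest task
probs = correct_recency_bias(torch.randn(4, 100), list(range(90, 100)))
```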
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the order of tasks that provide the fastest learning for agents.
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
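As a loose sketch of success-induced prioritization (the paper's exact scoring rule is not given in this summary), one could pick the task whose success rate has improved most over a recent window; the class below is an assumed illustration, not SITP itself:

```python
from collections import deque

class SuccessCurriculum:
    """Pick the task whose success rate is improving fastest over a
    sliding window. A loose sketch only; SITP's exact scoring rule
    may differ."""
    def __init__(self, task_ids, window=20):
        self.history = {t: deque(maxlen=window) for t in task_ids}

    def record(self, task, success):
        self.history[task].append(float(success))

    def next_task(self):
        def progress(task):
            h = list(self.history[task])
            if len(h) < 4:
                return float("inf")  # sample under-explored tasks first
            half = len(h) // 2
            recent = sum(h[half:]) / (len(h) - half)
            earlier = sum(h[:half]) / half
            return recent - earlier
        return max(self.history, key=progress)
```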
arXiv Detail & Related papers (2022-12-30T12:32:43Z)
- CLUTR: Curriculum Learning via Unsupervised Task Representation Learning [130.79246770546413]
CLUTR is a novel curriculum learning algorithm that decouples task representation and curriculum learning into a two-stage optimization.
We show CLUTR outperforms PAIRED, a principled and popular unsupervised environment design (UED) method, in terms of generalization and sample efficiency in the challenging CarRacing and navigation environments.
arXiv Detail & Related papers (2022-10-19T01:45:29Z)
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose to solve distant goal-reaching tasks by using search at training time to automatically generate intermediate states.
Framed as expectation-maximization, the E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints.
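A schematic version of that EM loop, with a transition graph assumed to be built from replay-buffer states and a hypothetical `policy.update` interface:

```python
from collections import deque

def e_step_plan_waypoints(graph, start, goal):
    """E-step: find a shortest waypoint sequence with breadth-first
    search over a state graph (e.g. built from replay-buffer transitions)."""
    parents, frontier = {start: None}, deque([start])
    while frontier:
        state = frontier.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in graph.get(state, ()):
            if nxt not in parents:
                parents[nxt] = state
                frontier.append(nxt)
    return None  # goal unreachable in the current graph

def m_step_update(policy, waypoints):
    """M-step: push the goal-conditioned policy toward reaching each
    consecutive waypoint (`policy.update` is a hypothetical interface)."""
    for state, subgoal in zip(waypoints, waypoints[1:]):
        policy.update(state=state, goal=subgoal)

graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(e_step_plan_waypoints(graph, 0, 3))  # [0, 1, 2, 3]
```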
arXiv Detail & Related papers (2021-10-22T22:05:31Z)
- C-Learning: Horizon-Aware Cumulative Accessibility Estimation [29.588146016880284]
We introduce the concept of cumulative accessibility functions, which measure the reachability of a goal from a given state within a specified horizon.
We show that these functions obey a recurrence relation, which enables learning from offline interactions.
We evaluate our approach on a set of multi-goal discrete and continuous control tasks.
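The summary does not state the recurrence itself. One plausible form, writing C_h(s, g) for the optimal probability of reaching goal g from state s within h steps (notation assumed here, not taken from the paper), is:

```latex
C_0(s, g) = \mathbb{1}[s = g], \qquad
C_h(s, g) = \max_{a} \; \mathbb{E}_{s' \sim p(\cdot \mid s, a)}
  \left[ \max\!\left( \mathbb{1}[s' = g],\; C_{h-1}(s', g) \right) \right]
```

Monotonicity in the horizon (C_h <= C_{h+1}) is what would make such a function "cumulative", and the one-step recurrence is what allows it to be fit from offline transitions.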
arXiv Detail & Related papers (2020-11-24T20:34:31Z)
- Attentive Feature Reuse for Multi Task Meta learning [17.8055398673228]
We develop new algorithms for simultaneous learning of multiple tasks.
We propose an attention mechanism to dynamically specialize the network, at runtime, for each task.
Our method improves performance on new, previously unseen environments.
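One common way to realize runtime task specialization, offered here only as an assumed sketch rather than the paper's architecture, is to gate shared features with task-conditioned attention weights:

```python
import torch
import torch.nn as nn

class TaskAttention(nn.Module):
    """Reweight shared features with task-conditioned channel gates.
    An assumed sketch of runtime specialization, not necessarily the
    paper's architecture."""
    def __init__(self, n_tasks, channels):
        super().__init__()
        self.task_embed = nn.Embedding(n_tasks, channels)

    def forward(self, features, task_id):
        # features: (batch, channels); sigmoid gates in (0, 1) per task
        gates = torch.sigmoid(self.task_embed(task_id))
        return features * gates

attn = TaskAttention(n_tasks=5, channels=64)
out = attn(torch.randn(8, 64), torch.full((8,), 2))
```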
arXiv Detail & Related papers (2020-06-12T19:33:11Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)