STAP: Sequencing Task-Agnostic Policies
- URL: http://arxiv.org/abs/2210.12250v3
- Date: Wed, 31 May 2023 10:53:34 GMT
- Title: STAP: Sequencing Task-Agnostic Policies
- Authors: Christopher Agia and Toki Migimatsu and Jiajun Wu and Jeannette Bohg
- Abstract summary: We present Sequencing Task-Agnostic Policies (STAP) for training manipulation skills and coordinating their geometric dependencies at planning time to solve long-horizon tasks.
Our experiments indicate that this objective function approximates ground truth plan feasibility.
We demonstrate how STAP can be used for task and motion planning by estimating the geometric feasibility of skill sequences provided by a task planner.
- Score: 22.25415946972336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advances in robotic skill acquisition have made it possible to build
general-purpose libraries of learned skills for downstream manipulation tasks.
However, naively executing these skills one after the other is unlikely to
succeed without accounting for dependencies between actions prevalent in
long-horizon plans. We present Sequencing Task-Agnostic Policies (STAP), a
scalable framework for training manipulation skills and coordinating their
geometric dependencies at planning time to solve long-horizon tasks never seen
by any skill during training. Given that Q-functions encode a measure of skill
feasibility, we formulate an optimization problem to maximize the joint success
of all skills sequenced in a plan, which we estimate by the product of their
Q-values. Our experiments indicate that this objective function approximates
ground truth plan feasibility and, when used as a planning objective, reduces
myopic behavior and thereby promotes long-horizon task success. We further
demonstrate how STAP can be used for task and motion planning by estimating the
geometric feasibility of skill sequences provided by a task planner. We
evaluate our approach in simulation and on a real robot. Qualitative results
and code are made available at https://sites.google.com/stanford.edu/stap.
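As a worked illustration of the planning objective described in the abstract, the sketch below scores a skill sequence by the product of its Q-values while a per-skill dynamics model rolls the state forward, and optimizes via random shooting. This is a minimal sketch under assumed interfaces (`q_functions`, `dynamics`, `sample_actions` are all hypothetical names), not the authors' implementation.

```python
import numpy as np

def plan_feasibility(q_functions, dynamics, s0, actions):
    """Score a candidate action sequence by the product of per-skill
    Q-values, mirroring the STAP planning objective (a sketch, not the
    authors' code). q_functions[k](s, a) returns the learned Q-value of
    skill k; dynamics[k](s, a) predicts the next state."""
    score, s = 1.0, s0
    for q, f, a in zip(q_functions, dynamics, actions):
        score *= q(s, a)   # Q-value as a proxy for skill feasibility
        s = f(s, a)        # roll the learned dynamics forward
    return score

def plan(q_functions, dynamics, s0, sample_actions, n_samples=1024):
    """Shooting-style optimization: sample action sequences and keep
    the one that maximizes the product of Q-values."""
    candidates = [sample_actions() for _ in range(n_samples)]
    scores = [plan_feasibility(q_functions, dynamics, s0, a)
              for a in candidates]
    return candidates[int(np.argmax(scores))]
```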
Related papers
- Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos [48.15438373870542]
VidAssist is an integrated framework designed for zero/few-shot goal-oriented planning in instructional videos.
It employs a breadth-first search algorithm for optimal plan generation.
Experiments demonstrate that VidAssist offers a unified framework for different goal-oriented planning setups.
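The summary above mentions breadth-first search over candidate plans; the sketch below shows a generic BFS plan search under assumed `is_goal` and `successors` interfaces. None of these names come from VidAssist itself.

```python
from collections import deque

def bfs_plan(start, is_goal, successors, max_depth=6):
    """Generic breadth-first search over partial plans.
    successors(state) yields (action, next_state) pairs; states must
    be hashable. All names are illustrative, not VidAssist's API."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if is_goal(state):
            return plan
        if len(plan) >= max_depth:
            continue
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [action]))
    return None  # no plan found within the depth budget
```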
arXiv Detail & Related papers (2024-09-30T17:57:28Z)
- Spatial Reasoning and Planning for Deep Embodied Agents [2.7195102129095003]
This thesis explores the development of data-driven techniques for spatial reasoning and planning tasks.
It focuses on enhancing learning efficiency, interpretability, and transferability across novel scenarios.
arXiv Detail & Related papers (2024-09-28T23:05:56Z)
- Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks [50.27313829438866]
Plan-Seq-Learn (PSL) is a modular approach that uses motion planning to bridge the gap between abstract language and learned low-level control.
PSL achieves success rates of over 85%, outperforming language-based, classical, and end-to-end approaches.
arXiv Detail & Related papers (2024-05-02T17:59:31Z)
- Efficient Learning of High Level Plans from Play [57.29562823883257]
We present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL.
We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks.
arXiv Detail & Related papers (2023-03-16T20:09:47Z)
- Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the order of tasks that provides the fastest learning for agents.
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
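The summary does not specify SITP's scoring rule; one common way to instantiate success-induced prioritization is to pick the task whose success rate has changed most over recent episodes, as in this hypothetical sketch (all names are assumptions).

```python
def next_task(success_history, window=10):
    """Pick the task whose success rate changed the most over the last
    `window` episodes -- a generic learning-progress curriculum offered
    only as an illustration of success-induced prioritization.
    success_history[task] is a list of 0/1 episode outcomes."""
    def progress(outcomes):
        recent = outcomes[-window:]
        past = outcomes[-2 * window:-window]
        if not recent or not past:
            return float("inf")  # explore tasks with little data first
        return abs(sum(recent) / len(recent) - sum(past) / len(past))
    return max(success_history, key=lambda t: progress(success_history[t]))
```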
arXiv Detail & Related papers (2022-12-30T12:32:43Z)
- POMRL: No-Regret Learning-to-Plan with Increasing Horizons [43.693739167594295]
We study the problem of planning under model uncertainty in an online meta-reinforcement learning setting.
We propose an algorithm to meta-learn the underlying structure across tasks, utilize it to plan in each task, and upper-bound the regret of the planning loss.
arXiv Detail & Related papers (2022-12-30T03:09:45Z)
- Active Task Randomization: Learning Robust Skills via Unsupervised Generation of Diverse and Feasible Tasks [37.73239471412444]
We introduce Active Task Randomization (ATR), an approach that learns robust skills through the unsupervised generation of training tasks.
ATR selects suitable training tasks, each consisting of an initial environment state and a manipulation goal, by balancing the diversity and feasibility of the tasks.
We demonstrate that the learned skills can be composed by a task planner to solve unseen sequential manipulation problems based on visual inputs.
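A minimal sketch of the diversity/feasibility trade-off described above, assuming user-supplied `feasibility` and `diversity` scoring functions; both names and the linear weighting are illustrative assumptions, not ATR's actual criteria.

```python
def score_task(task, task_archive, feasibility, diversity, alpha=0.5):
    """Rank a candidate task by a weighted combination of predicted
    feasibility and diversity relative to previously selected tasks --
    a sketch of the trade-off described in the summary above."""
    # Diversity: distance to the nearest previously selected task.
    d = min((diversity(task, prev) for prev in task_archive), default=1.0)
    return alpha * feasibility(task) + (1.0 - alpha) * d
```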
arXiv Detail & Related papers (2022-11-11T11:24:55Z)
- Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning [2.642698101441705]
Problems which require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents.
We introduce a novel hierarchical reinforcement learning agent which links temporally extended skills for continuous control with a forward model in a symbolic abstraction of the environment's state for planning.
arXiv Detail & Related papers (2022-07-11T17:13:10Z)
- Hierarchical Few-Shot Imitation with Skill Transition Models [66.81252581083199]
Few-shot Imitation with Skill Transition Models (FIST) is an algorithm that extracts skills from offline data and utilizes them to generalize to unseen tasks.
We show that FIST is capable of generalizing to new tasks and substantially outperforms prior baselines in navigation experiments.
arXiv Detail & Related papers (2021-07-19T15:56:01Z)
- Hierarchical Reinforcement Learning as a Model of Human Task Interleaving [60.95424607008241]
We develop a hierarchical model of supervisory control driven by reinforcement learning.
The model reproduces known empirical effects of task interleaving.
The results support hierarchical RL as a plausible model of task interleaving.
arXiv Detail & Related papers (2020-01-04T17:53:28Z)