Skill Reinforcement Learning and Planning for Open-World Long-Horizon
Tasks
- URL: http://arxiv.org/abs/2303.16563v2
- Date: Mon, 4 Dec 2023 14:53:15 GMT
- Title: Skill Reinforcement Learning and Planning for Open-World Long-Horizon
Tasks
- Authors: Haoqi Yuan, Chi Zhang, Hongcheng Wang, Feiyang Xie, Penglin Cai, Hao
Dong, Zongqing Lu
- Abstract summary: We study building multi-task agents in open-world environments.
We convert the multi-task learning problem into learning basic skills and planning over the skills.
Our method accomplishes 40 diverse Minecraft tasks, many of which require sequentially executing more than 10 skills.
- Score: 31.084848672383185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study building multi-task agents in open-world environments. Without human
demonstrations, learning to accomplish long-horizon tasks in a large open-world
environment with reinforcement learning (RL) is extremely inefficient. To
tackle this challenge, we convert the multi-task learning problem into learning
basic skills and planning over the skills. Using the popular open-world game
Minecraft as the testbed, we propose three types of fine-grained basic skills,
and use RL with intrinsic rewards to acquire skills. A novel Finding-skill that
performs exploration to find diverse items provides better initialization for
other skills, improving the sample efficiency for skill learning. In skill
planning, we leverage the prior knowledge in Large Language Models to find the
relationships between skills and build a skill graph. When the agent is solving
a task, our skill search algorithm walks on the skill graph and generates the
proper skill plans for the agent. In experiments, our method accomplishes 40
diverse Minecraft tasks, many of which require sequentially executing more
than 10 skills. Our method outperforms baselines by a large margin and is
the most sample-efficient demonstration-free RL method to solve Minecraft Tech
Tree tasks. The project's website and code can be found at
https://sites.google.com/view/plan4mc.
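To make the skill-planning idea in the abstract concrete, the following is a minimal sketch of a depth-first search over a small skill graph. It is not the authors' implementation: the skill names, item quantities, the `SKILL_GRAPH` table, and the `plan_skills` function are all hypothetical, the graph is assumed to be acyclic, and tools such as a crafting table are left out for brevity.

```python
from typing import Dict, List, Tuple

# Hypothetical skill graph (an assumption for illustration only).
# skill name -> (item produced, quantity produced, {required item: quantity consumed})
SKILL_GRAPH: Dict[str, Tuple[str, int, Dict[str, int]]] = {
    "harvest_log": ("log", 1, {}),
    "craft_planks": ("planks", 4, {"log": 1}),
    "craft_stick": ("stick", 4, {"planks": 2}),
    "craft_wooden_pickaxe": ("wooden_pickaxe", 1, {"planks": 3, "stick": 2}),
}


def plan_skills(target_item: str, inventory: Dict[str, int]) -> List[str]:
    """Return an ordered list of skills that yields one more `target_item`.

    The search simulates the inventory while walking the graph depth-first,
    so items consumed by an earlier skill are re-acquired when needed again.
    """
    inv = dict(inventory)  # work on a copy; the caller's dict is not modified
    producers = {item: (skill, qty, needs)
                 for skill, (item, qty, needs) in SKILL_GRAPH.items()}
    plan: List[str] = []

    def obtain(item: str, amount: int) -> None:
        # Repeat the producing skill until the simulated inventory holds `amount`.
        while inv.get(item, 0) < amount:
            if item not in producers:
                raise ValueError(f"no skill produces {item!r}")
            skill, qty, needs = producers[item]
            # Satisfy all prerequisites at once; obtaining one prerequisite
            # may consume another, so re-check until every need is met.
            while any(inv.get(n, 0) < c for n, c in needs.items()):
                for n, c in needs.items():
                    obtain(n, c)
            for n, c in needs.items():
                inv[n] -= c
            inv[item] = inv.get(item, 0) + qty
            plan.append(skill)

    obtain(target_item, inv.get(target_item, 0) + 1)
    return plan


if __name__ == "__main__":
    print(plan_skills("wooden_pickaxe", {}))
```

Starting from an empty inventory, the sketch harvests a second log because crafting sticks uses up planks that the pickaxe also needs. This kind of bookkeeping is the reason to simulate the inventory during search rather than just listing prerequisites once.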
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose SPIRE, a system that first decomposes tasks into smaller learning subproblems and then combines imitation and reinforcement learning to maximize their strengths.
We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z)
- Agentic Skill Discovery [19.5703917813767]
Language-conditioned robotic skills make it possible to apply the high-level reasoning of Large Language Models (LLMs) to low-level robotic control.
A remaining challenge is to acquire a diverse set of fundamental skills.
We introduce a novel framework for skill discovery that is entirely driven by LLMs.
arXiv Detail & Related papers (2024-05-23T19:44:03Z)
- Choreographer: Learning and Adapting Skills in Imagination [60.09911483010824]
We present Choreographer, a model-based agent that exploits its world model to learn and adapt skills in imagination.
Our method decouples the exploration and skill learning processes, which lets it discover skills in the latent state space of the model.
Choreographer is able to learn skills both from offline data, and by collecting data simultaneously with an exploration policy.
arXiv Detail & Related papers (2022-11-23T23:31:14Z)
- Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics [18.546688182454236]
Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning.
We propose accelerating exploration in the skill space using state-conditioned generative models.
We validate our approach across four challenging manipulation tasks, demonstrating our ability to learn across task variations.
arXiv Detail & Related papers (2022-11-04T02:42:17Z)
- Lipschitz-constrained Unsupervised Skill Discovery [91.51219447057817]
Lipschitz-constrained Skill Discovery (LSD) encourages the agent to discover more diverse, dynamic, and far-reaching skills.
LSD outperforms previous approaches in terms of skill diversity, state space coverage, and performance on seven downstream tasks.
arXiv Detail & Related papers (2022-02-02T08:29:04Z)
- Example-Driven Model-Based Reinforcement Learning for Solving Long-Horizon Visuomotor Tasks [85.56153200251713]
We introduce EMBR, a model-based RL method for learning primitive skills that are suitable for completing long-horizon visuomotor tasks.
On a Franka Emika robot arm, we find that EMBR enables the robot to complete three long-horizon visuomotor tasks at an 85% success rate.
arXiv Detail & Related papers (2021-09-21T16:48:07Z)
- Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft [18.845438529816004]
We explore curriculum learning in a complex, visual domain with many hard exploration challenges: Minecraft.
We find that learning progress is a reliable measure of learnability for automatically constructing an effective curriculum.
arXiv Detail & Related papers (2021-06-28T17:50:40Z)
- Discovering Generalizable Skills via Automated Generation of Diverse Tasks [82.16392072211337]
We propose a method to discover generalizable skills via automated generation of a diverse set of tasks.
As opposed to prior work on unsupervised discovery of skills, our method pairs each skill with a unique task produced by a trainable task generator.
A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective.
The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks.
arXiv Detail & Related papers (2021-06-26T03:41:51Z)
- Accelerating Reinforcement Learning with Learned Skill Priors [20.268358783821487]
Most modern reinforcement learning approaches learn every task from scratch.
One approach for leveraging prior knowledge is to transfer skills learned on prior tasks to the new task.
We show that learned skill priors are essential for effective skill transfer from rich datasets.
arXiv Detail & Related papers (2020-10-22T17:59:51Z)
- Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.