Planning Immediate Landmarks of Targets for Model-Free Skill Transfer
across Agents
- URL: http://arxiv.org/abs/2212.09033v1
- Date: Sun, 18 Dec 2022 08:03:21 GMT
- Title: Planning Immediate Landmarks of Targets for Model-Free Skill Transfer
across Agents
- Authors: Minghuan Liu, Zhengbang Zhu, Menghui Zhu, Yuzheng Zhuang, Weinan
Zhang, Jianye Hao
- Abstract summary: We propose PILoT, i.e., Planning Immediate Landmarks of Targets.
PILoT learns a goal-conditioned state planner and distills it into a goal-planner that plans immediate landmarks in a model-free style.
We show the power of PILoT on various transferring challenges, including few-shot transferring across action spaces and dynamics.
- Score: 34.56191646231944
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In reinforcement learning applications like robotics, agents usually need to deal with various input/output features when their developers or physical restrictions specify different state/action spaces. This leads to unnecessary re-training from scratch and considerable sample inefficiency, especially when agents follow similar solution steps to achieve tasks. In this paper, we aim to transfer similar high-level goal-transition knowledge to alleviate this challenge. Specifically, we propose PILoT, i.e., Planning Immediate Landmarks of Targets. PILoT utilizes universal decoupled policy optimization to learn a goal-conditioned state planner, then distills it into a goal-planner that plans immediate landmarks in a model-free style and can be shared among different agents. In our experiments, we show the power of PILoT on various transfer challenges, including few-shot transfer across action spaces and dynamics, from low-dimensional vector states to image inputs, and from a simple robot to a complicated morphology; we also illustrate a zero-shot transfer solution from a simple 2D navigation task to the harder Ant-Maze task.
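To make the high-level idea above concrete, here is a minimal sketch of how a shared, model-free landmark planner and an agent-specific low-level goal-reaching policy could be wired together. The module and function names (LandmarkPlanner, LowLevelPolicy, distill_planner), the network sizes, and the squared-error distillation target are illustrative assumptions, not the paper's implementation; in particular, the universal decoupled policy optimization step used to train the teacher planner is not reproduced here.

```python
# Hypothetical sketch of the PILoT idea summarized above: a shared, model-free
# landmark planner f(s, g) -> next landmark, distilled from a goal-conditioned
# state planner, plus an agent-specific low-level policy that only has to reach
# the next landmark. Names, sizes, and the loss are illustrative assumptions.
import torch
import torch.nn as nn


class LandmarkPlanner(nn.Module):
    """Predicts the immediate landmark (next subgoal) toward a final goal."""

    def __init__(self, state_dim: int, goal_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, goal_dim),  # landmarks live in the shared goal space
        )

    def forward(self, state: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, goal], dim=-1))


class LowLevelPolicy(nn.Module):
    """Agent-specific goal-reaching policy pi(a | s, landmark); only this part
    depends on the agent's own action space and dynamics."""

    def __init__(self, state_dim: int, goal_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),
        )

    def forward(self, state: torch.Tensor, landmark: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, landmark], dim=-1))


def distill_planner(planner: LandmarkPlanner, batch: dict, optimizer) -> float:
    """One distillation step: regress the planner onto 'immediate landmark'
    targets, e.g. goal-space projections of states reached a few steps later
    in successful (hindsight-relabelled) trajectories."""
    pred = planner(batch["state"], batch["goal"])
    loss = ((pred - batch["landmark"]) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    state_dim, goal_dim, action_dim = 10, 2, 4
    planner = LandmarkPlanner(state_dim, goal_dim)
    policy = LowLevelPolicy(state_dim, goal_dim, action_dim)
    opt = torch.optim.Adam(planner.parameters(), lr=3e-4)

    # Fake batch standing in for relabelled transitions from the source agent.
    batch = {
        "state": torch.randn(32, state_dim),
        "goal": torch.randn(32, goal_dim),
        "landmark": torch.randn(32, goal_dim),
    }
    print("distillation loss:", distill_planner(planner, batch, opt))

    # The distilled planner proposes the next landmark; the (re-trainable,
    # agent-specific) low-level policy acts toward it.
    landmark = planner(batch["state"], batch["goal"])
    action = policy(batch["state"], landmark)
    print("action shape:", tuple(action.shape))
```

In this reading, only the agent-specific LowLevelPolicy would be retrained for a new action space or dynamics, while the distilled landmark planner is reused as-is, which mirrors the few-shot and zero-shot transfer setting described in the abstract.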
Related papers
- A Meta-Engine Framework for Interleaved Task and Motion Planning using Topological Refinements [51.54559117314768]
Task And Motion Planning (TAMP) is the problem of finding a solution to an automated planning problem in which discrete task-level actions must be realized by continuous robot motions.
We propose a general and open-source framework for modeling and benchmarking TAMP problems.
We introduce an innovative meta-technique to solve TAMP problems involving moving agents and multiple task-state-dependent obstacles.
arXiv Detail & Related papers (2024-08-11T14:57:57Z)
- RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches [74.300116260004]
Generalization remains one of the most important desiderata for robust robot learning systems.
We propose a policy conditioning method using rough trajectory sketches.
We show that RT-Trajectory is able to perform a wider range of tasks compared to language-conditioned and goal-conditioned policies.
arXiv Detail & Related papers (2023-11-03T15:31:51Z)
- Efficient Skill Acquisition for Complex Manipulation Tasks in Obstructed Environments [18.348489257164356]
We propose a system for efficient skill acquisition that leverages an object-centric generative model (OCGM) for versatile goal identification.
OCGM enables one-shot target object identification and re-identification in new scenes, allowing motion planning (MP) to guide the robot to the target object while avoiding obstacles.
arXiv Detail & Related papers (2023-03-06T18:49:59Z)
- Long-HOT: A Modular Hierarchical Approach for Long-Horizon Object Transport [83.06265788137443]
We address key challenges in long-horizon embodied exploration and navigation by proposing a new object transport task and a novel modular framework for temporally extended navigation.
Our first contribution is the design of a novel Long-HOT environment focused on deep exploration and long-horizon planning.
We propose a modular hierarchical transport policy (HTP) that builds a topological graph of the scene to perform exploration with the help of weighted frontiers.
arXiv Detail & Related papers (2022-10-28T05:30:49Z)
- Learning Efficient Abstract Planning Models that Choose What to Predict [28.013014215441505]
We show that existing symbolic operator learning approaches fall short in many robotics domains.
This is primarily because they attempt to learn operators that exactly predict all observed changes in the abstract state.
We propose to learn operators that 'choose what to predict' by only modelling changes necessary for abstract planning to achieve specified goals.
arXiv Detail & Related papers (2022-08-16T13:12:59Z)
- Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay [8.259694128526112]
We propose diversity-based trajectory and goal selection with HER (DTGSH).
We show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
arXiv Detail & Related papers (2021-08-17T21:34:24Z)
- Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge of motion generation using robot learning from demonstration techniques is that human demonstrations follow a distribution with multiple modes for one task query.
Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories.
We propose a motion generation model with extrapolation ability to overcome this problem.
arXiv Detail & Related papers (2021-02-24T09:07:52Z)
- ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation [99.2543521972137]
ReLMoGen is a framework that combines a learned policy to predict subgoals and a motion generator to plan and execute the motion needed to reach these subgoals.
Our method is benchmarked on a diverse set of seven robotics tasks in photo-realistic simulation environments.
ReLMoGen shows outstanding transferability between different motion generators at test time, indicating a great potential to transfer to real robots.
arXiv Detail & Related papers (2020-08-18T08:05:15Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods; a minimal sketch of the goal-selection idea appears after this list.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
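As referenced in the last entry above, here is a minimal sketch of curriculum-goal selection by value disagreement, under the assumption that a small ensemble of goal-conditioned value heads is trained alongside the agent; the names (ValueEnsemble, sample_curriculum_goals) and the proportional sampling rule are illustrative guesses at the mechanism, not the paper's code.

```python
# Hypothetical sketch of curriculum-goal selection by value disagreement
# (last related entry above): score candidate goals by the spread of an
# ensemble of goal-conditioned value estimates and sample training goals
# proportionally to that spread. All names are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn


class ValueEnsemble(nn.Module):
    """K independent value heads V_k(s, g); their disagreement is the signal."""

    def __init__(self, state_dim: int, goal_dim: int, k: int = 5, hidden: int = 128):
        super().__init__()
        self.heads = nn.ModuleList([
            nn.Sequential(
                nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )
            for _ in range(k)
        ])

    def forward(self, state: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
        x = torch.cat([state, goal], dim=-1)
        return torch.cat([head(x) for head in self.heads], dim=-1)  # (batch, K)


def sample_curriculum_goals(ensemble, state, candidate_goals, n_goals):
    """Sample goals with probability proportional to ensemble disagreement."""
    with torch.no_grad():
        values = ensemble(state.expand(len(candidate_goals), -1), candidate_goals)
    disagreement = values.std(dim=-1).cpu().numpy().astype(np.float64)
    probs = disagreement / disagreement.sum()
    idx = np.random.choice(len(candidate_goals), size=n_goals, p=probs)
    return candidate_goals[idx]


if __name__ == "__main__":
    state_dim, goal_dim = 6, 2
    ensemble = ValueEnsemble(state_dim, goal_dim)      # trained alongside the agent
    state = torch.zeros(1, state_dim)                  # current start state
    candidates = torch.rand(100, goal_dim)             # candidate goals in goal space
    goals = sample_curriculum_goals(ensemble, state, candidates, n_goals=8)
    print("selected goals:", tuple(goals.shape))
```

The intuition is that goals on which the ensemble disagrees most lie near the frontier of the agent's current competence, so sampling them more often yields an automatic curriculum.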
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.