Deep Hierarchical Planning from Pixels
- URL: http://arxiv.org/abs/2206.04114v1
- Date: Wed, 8 Jun 2022 18:20:15 GMT
- Title: Deep Hierarchical Planning from Pixels
- Authors: Danijar Hafner, Kuang-Huei Lee, Ian Fischer, Pieter Abbeel
- Abstract summary: Director is a method for learning hierarchical behaviors directly from pixels by planning inside the latent space of a learned world model.
Despite operating in latent space, the decisions are interpretable because the world model can decode goals into images for visualization.
Director also learns successful behaviors across a wide range of environments, including visual control, Atari games, and DMLab levels.
- Score: 86.14687388689204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligent agents need to select long sequences of actions to solve complex
tasks. While humans easily break down tasks into subgoals and reach them
through millions of muscle commands, current artificial intelligence is limited
to tasks with horizons of a few hundred decisions, despite large compute
budgets. Research on hierarchical reinforcement learning aims to overcome this
limitation but has proven challenging: current methods rely on manually
specified goal spaces or subtasks, and no general solution exists. We introduce
Director, a practical method for learning hierarchical behaviors directly from
pixels by planning inside the latent space of a learned world model. The
high-level policy maximizes task and exploration rewards by selecting latent
goals and the low-level policy learns to achieve the goals. Despite operating
in latent space, the decisions are interpretable because the world model can
decode goals into images for visualization. Director outperforms exploration
methods on tasks with sparse rewards, including 3D maze traversal with a
quadruped robot from an egocentric camera and proprioception, without access to
the global position or top-down view that was used by prior work. Director also
learns successful behaviors across a wide range of environments, including
visual control, Atari games, and DMLab levels.
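Concretely, the division of labor described in the abstract can be written down as a short control loop. The sketch below is a minimal illustration, assuming hypothetical `world_model`, `manager`, and `worker` interfaces and an assumed goal-update interval `K`; it is not the paper's implementation.
```python
# Illustrative sketch only, not the authors' implementation.
# `world_model`, `manager`, and `worker` are hypothetical stand-ins for
# the learned world model and the high- and low-level policies.
K = 16  # steps between high-level goal updates (assumed hyperparameter)

def run_episode(env, world_model, manager, worker, max_steps=1000):
    obs = env.reset()
    goal = None
    for t in range(max_steps):
        state = world_model.encode(obs)      # pixels -> latent state
        if t % K == 0:
            # High level: choose a latent goal to maximize task and
            # exploration rewards.
            goal = manager.select_goal(state)
            # Decoding the latent goal into an image is what makes the
            # high-level decisions interpretable.
            goal_image = world_model.decode(goal)
        # Low level: act so as to reach the current latent goal.
        action = worker.act(state, goal)
        obs, _, done, _ = env.step(action)
        if done:
            break
```
The important property is that the manager never emits motor commands directly; it steers the worker purely through latent goals, which the world model can render as images for inspection.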
Related papers
- Embodied Instruction Following in Unknown Environments [66.60163202450954]
We propose an embodied instruction following (EIF) method for complex tasks in unknown environments.
We build a hierarchical embodied instruction following framework comprising a high-level task planner and a low-level exploration controller.
The task planner generates feasible step-by-step plans for accomplishing the human's goal, based on the task completion process and the known visual clues.
arXiv Detail & Related papers (2024-06-17T17:55:40Z)
- MENTOR: Guiding Hierarchical Reinforcement Learning with Human Feedback and Dynamic Distance Constraint [40.3872201560003]
Hierarchical reinforcement learning (HRL) uses a hierarchical framework that divides tasks into subgoals and completes them sequentially.
Current methods struggle to find subgoals that keep the learning process stable.
We propose a general hierarchical reinforcement learning framework incorporating human feedback and dynamic distance constraints.
arXiv Detail & Related papers (2024-02-22T03:11:09Z)
- Universal Visual Decomposer: Long-Horizon Manipulation Made Easy [54.93745986073738]
Real-world robotic tasks stretch over extended horizons and encompass multiple stages.
Prior task decomposition methods require task-specific knowledge, are computationally intensive, and cannot readily be applied to new tasks.
We propose Universal Visual Decomposer (UVD), an off-the-shelf task decomposition method for visual long-horizon manipulation.
We extensively evaluate UVD on both simulation and real-world tasks, and in all cases, UVD substantially outperforms baselines across imitation and reinforcement learning settings.
arXiv Detail & Related papers (2023-10-12T17:59:41Z)
- Learning Hierarchical Interactive Multi-Object Search for Mobile Manipulation [10.21450780640562]
We introduce a novel interactive multi-object search task in which a robot has to open doors to navigate rooms and search inside cabinets and drawers to find target objects.
These new challenges require combining manipulation and navigation skills in unexplored environments.
We present HIMOS, a hierarchical reinforcement learning approach that learns to compose exploration, navigation, and manipulation skills.
arXiv Detail & Related papers (2023-07-12T12:25:33Z)
- Efficient Learning of High-Level Plans from Play [57.29562823883257]
We present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL.
We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks.
arXiv Detail & Related papers (2023-03-16T20:09:47Z)
- Discovering and Achieving Goals via World Models [61.95437238374288]
We introduce Latent Explorer Achiever (LEXA), a unified solution to the problem of discovering and achieving goals.
LEXA learns a world model from image inputs and uses it to train an explorer and an achiever policy from imagined rollouts.
After the unsupervised phase, LEXA solves tasks specified as goal images zero-shot, without any additional learning (a sketch of this two-policy scheme follows below).
arXiv Detail & Related papers (2021-10-18T17:59:58Z)
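The explorer/achiever split in the LEXA summary above can be sketched as follows. The `world_model`, `explorer`, and `achiever` interfaces are hypothetical placeholders rather than LEXA's actual API, and the reward choices noted in comments are assumptions.
```python
# Hedged sketch of LEXA-style unsupervised pretraining followed by
# zero-shot goal reaching. All interfaces are hypothetical placeholders.

def pretrain(env, world_model, explorer, achiever, num_steps):
    obs = env.reset()
    for _ in range(num_steps):
        state = world_model.encode(obs)
        action = explorer.act(state)                 # explorer collects novel data
        next_obs, _, done, _ = env.step(action)
        world_model.observe(obs, action, next_obs)   # learn dynamics from pixels
        # Both policies learn from imagined rollouts inside the model,
        # not from the real transitions directly.
        imagined = world_model.imagine(state, horizon=15)
        explorer.update(imagined)   # reward: novelty (assumed)
        achiever.update(imagined)   # reward: reaching goals from replay (assumed)
        obs = env.reset() if done else next_obs

def reach_goal_image(env, world_model, achiever, goal_image, max_steps=500):
    # Zero-shot at test time: encode the goal image once, then act.
    goal = world_model.encode(goal_image)
    obs = env.reset()
    for _ in range(max_steps):
        action = achiever.act(world_model.encode(obs), goal)
        obs, _, done, _ = env.step(action)
        if done:
            break
```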
- Efficient Robotic Object Search via HIEM: Hierarchical Policy Learning with Intrinsic-Extrinsic Modeling [33.89793938441333]
We present a novel policy learning paradigm for the object search task, based on hierarchical and interpretable modeling with an intrinsic-extrinsic reward setting.
Experiments on the House3D environment show that a robot trained with our model performs the object search task in a more efficient and interpretable way.
arXiv Detail & Related papers (2020-10-16T19:21:38Z)
- GRIMGEP: Learning Progress for Robust Goal Sampling in Visual Deep Reinforcement Learning [21.661530291654692]
We propose a framework that allows agents to autonomously identify and ignore noisy distracting regions.
Our framework can be combined with any state-of-the-art novelty-seeking goal exploration approach (a sketch of the learning-progress idea behind it follows below).
arXiv Detail & Related papers (2020-08-10T19:50:06Z)
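One way to "identify and ignore noisy distracting regions", as the GRIMGEP summary puts it, is to track learning progress per goal region: distractors can stay novel forever yet show no competence change. The estimator and sampling rule below are assumptions for illustration, not the paper's exact formulation.
```python
import numpy as np

def learning_progress(competence_history, window=10):
    """Absolute recent change in competence for one goal region.

    Noisy, distracting regions can stay novel forever yet show no
    competence change, so they score near zero here. (This estimator
    is an assumption; the paper's exact measure may differ.)
    """
    h = np.asarray(competence_history, dtype=float)
    if h.size < 2 * window:
        return 0.0
    return abs(h[-window:].mean() - h[-2 * window:-window].mean())

def sample_goal_region(regions, histories, eps=0.2, rng=None):
    """Sample a region to draw goals from, favoring learning progress."""
    rng = rng or np.random.default_rng()
    scores = np.array([learning_progress(histories[r]) for r in regions])
    if rng.random() < eps or scores.sum() == 0.0:
        return regions[rng.integers(len(regions))]  # uniform exploration
    return rng.choice(regions, p=scores / scores.sum())
```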
- Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals [8.98526174345299]
This paper introduces a notion of imaginary object goals.
For a given manipulation task, an object policy is first trained to move the object of interest to a desired target position on its own.
The object policy is then leveraged to build a predictive model of plausible object trajectories.
The proposed algorithm, Follow the Object, has been evaluated on 7 MuJoCo environments.
arXiv Detail & Related papers (2020-08-05T12:19:14Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, demonstrating performance gains over current state-of-the-art methods (a sketch of value-disagreement goal sampling follows below).
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
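The title's mechanism lends itself to a short sketch: prefer training goals on which an ensemble of goal-conditioned value functions disagrees, since disagreement marks the frontier between mastered and currently unreachable goals. The scoring rule below is an assumption, not the paper's exact objective.
```python
import numpy as np

def sample_training_goal(candidate_goals, value_ensemble, state,
                         temperature=1.0, rng=None):
    """Pick the next training goal by value-ensemble disagreement.

    `value_ensemble` is a hypothetical list of goal-conditioned value
    functions mapping (state, goal) -> estimated return. High spread
    across members marks goals at the edge of the agent's competence.
    """
    rng = rng or np.random.default_rng()
    values = np.array([[v(state, g) for v in value_ensemble]
                       for g in candidate_goals])   # shape: (goals, members)
    disagreement = values.std(axis=1)
    logits = disagreement / temperature
    probs = np.exp(logits - logits.max())           # stable softmax
    probs /= probs.sum()
    return candidate_goals[rng.choice(len(candidate_goals), p=probs)]
```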
This list is automatically generated from the titles and abstracts of the papers in this site.