NEARL: Non-Explicit Action Reinforcement Learning for Robotic Control
- URL: http://arxiv.org/abs/2011.01046v1
- Date: Mon, 2 Nov 2020 15:28:19 GMT
- Title: NEARL: Non-Explicit Action Reinforcement Learning for Robotic Control
- Authors: Nan Lin, Yuxuan Li, Yujun Zhu, Ruolin Wang, Xiayu Zhang, Jianmin Ji,
Keke Tang, Xiaoping Chen, Xinming Zhang
- Abstract summary: In this paper, we propose a novel hierarchical reinforcement learning framework without explicit action.
Our meta policy predicts the next optimal state, and the actual action is produced by an inverse dynamics model.
Under our framework, widely available state-only demonstrations can be exploited effectively for imitation learning.
- Score: 15.720231070808696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditionally, reinforcement learning methods predict the next action based
on the current state. However, in many situations, directly applying actions to
control systems or robots is dangerous and may lead to unexpected behaviors,
because actions are rather low-level. In this paper, we propose a novel
hierarchical reinforcement learning framework without explicit action. Our meta
policy predicts the next optimal state, and the actual action is produced by an
inverse dynamics model. To stabilize the training process, we integrate
adversarial learning and an information bottleneck into our framework. Under our
framework, widely available state-only demonstrations can be exploited
effectively for imitation learning. Also, prior knowledge and constraints can
be applied to the meta policy. We test our algorithm on simulated tasks, both on
its own and in combination with imitation learning. The experimental results
show the reliability and robustness of our algorithm.
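A minimal sketch of this control loop, assuming a state-delta meta policy and a concatenation-based inverse dynamics network (the class names and architectures below are illustrative, not the authors' code):

```python
import torch
import torch.nn as nn

class MetaPolicy(nn.Module):
    """Proposes the next desired state instead of a raw action."""
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state):
        # Predict a state delta and add it to the current state.
        return state + self.net(state)

class InverseDynamics(nn.Module):
    """Recovers the action that moves the system from s_t to s_{t+1}."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, next_state):
        return self.net(torch.cat([state, next_state], dim=-1))

# One control step: the meta policy never emits an action directly.
state_dim, action_dim = 8, 2
meta, idm = MetaPolicy(state_dim), InverseDynamics(state_dim, action_dim)
s_t = torch.randn(1, state_dim)
s_target = meta(s_t)      # high-level decision: "where should I be next?"
a_t = idm(s_t, s_target)  # low-level action recovered by inverse dynamics
print(a_t.shape)          # torch.Size([1, 2])
```

Because the meta policy acts purely in state space, state-only demonstrations can supervise it directly, and the inverse dynamics model can be trained separately on the agent's own transitions.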
Related papers
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
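The reward relabeling at the heart of this idea is easy to sketch: transitions where the expert intervenes get a negative reward, everything else gets zero, and the result feeds a standard off-policy learner. A toy version, with a made-up transition schema:

```python
def relabel_with_interventions(transitions):
    """Assign reward -1 to transitions where the expert intervened, 0 otherwise.

    Each transition is a dict with keys 'state', 'action', 'next_state',
    and a boolean 'intervened' flag; the result feeds an off-policy RL buffer.
    """
    relabeled = []
    for t in transitions:
        r = -1.0 if t["intervened"] else 0.0
        relabeled.append({**t, "reward": r})
    return relabeled

demo = [
    {"state": 0, "action": 1, "next_state": 1, "intervened": False},
    {"state": 1, "action": 0, "next_state": 1, "intervened": True},
]
print([t["reward"] for t in relabel_with_interventions(demo)])  # [0.0, -1.0]
```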
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve challenging simulated tasks such as humanoid locomotion and stand-up with unprecedented sample efficiency.
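One reading of the sequential bias is to cut the single demonstration into an ordered chain of sub-goals and reward the agent for reaching them in sequence; the sketch below illustrates that goal-chaining idea under our own simplified assumptions, not DCIL-II itself:

```python
import numpy as np

def extract_goal_chain(demo_states, n_goals=5):
    """Pick evenly spaced states from one demonstration to serve as sub-goals."""
    idx = np.linspace(0, len(demo_states) - 1, n_goals).astype(int)
    return [demo_states[i] for i in idx]

def chained_reward(state, goals, current_goal, tol=0.1):
    """Reward 1 for reaching the active sub-goal, then advance along the chain."""
    if np.linalg.norm(state - goals[current_goal]) < tol:
        return 1.0, min(current_goal + 1, len(goals) - 1)
    return 0.0, current_goal

demo = [np.array([t / 10.0, 0.0]) for t in range(11)]  # a straight-line demo
goals = extract_goal_chain(demo, n_goals=3)
r, g = chained_reward(np.array([0.5, 0.0]), goals, current_goal=1)
print(r, g)  # 1.0 2 -> mid-demo sub-goal reached, chain advances
```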
arXiv Detail & Related papers (2022-11-09T10:28:40Z)
- Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation [118.27432851053335]
This paper presents an overview and comparative analysis of our systems designed for two tracks of the SAPIEN ManiSkill Challenge 2021, including the No Interaction track.
The No Interaction track targets learning policies from pre-collected demonstration trajectories.
In this track, we design a Heuristic Rule-based Method (HRM) to trigger high-quality object manipulation by decomposing the task into a series of sub-tasks.
For each sub-task, simple rule-based control strategies are adopted to predict actions that can be applied to the robotic arms.
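Such a decomposition maps naturally onto a small state machine that dispatches one rule-based controller per sub-task; a hypothetical sketch (the paper's actual rules are task-specific):

```python
def reach(obs):   return {"arm_vel": obs["object_pos"] - obs["gripper_pos"]}
def grasp(obs):   return {"gripper": "close"}
def move(obs):    return {"arm_vel": obs["target_pos"] - obs["gripper_pos"]}

# Each sub-task pairs a rule-based controller with a completion test.
SUBTASKS = [
    ("reach", reach, lambda o: abs(o["object_pos"] - o["gripper_pos"]) < 0.01),
    ("grasp", grasp, lambda o: o["holding"]),
    ("move",  move,  lambda o: abs(o["target_pos"] - o["gripper_pos"]) < 0.01),
]

def heuristic_policy(obs, stage):
    """Run the current sub-task's rule; advance when its completion test passes."""
    name, rule, done = SUBTASKS[stage]
    if done(obs) and stage < len(SUBTASKS) - 1:
        stage += 1
        name, rule, done = SUBTASKS[stage]
    return rule(obs), stage

obs = {"object_pos": 0.5, "gripper_pos": 0.5, "target_pos": 1.0, "holding": False}
action, stage = heuristic_policy(obs, stage=0)
print(action, stage)  # {'gripper': 'close'} 1
```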
arXiv Detail & Related papers (2022-06-13T16:20:42Z)
- Action-Conditioned Contrastive Policy Pretraining [39.13710045468429]
Deep visuomotor policy learning achieves promising results in control tasks such as robotic manipulation and autonomous driving.
However, it requires a huge number of online interactions with the training environment, which limits its real-world application.
In this work, we aim to pretrain policy representations for driving tasks using hours-long uncurated YouTube videos.
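Pretraining of this kind typically builds on a contrastive objective such as InfoNCE; the snippet below shows the generic loss, not the paper's exact action-conditioned variant:

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.1):
    """Standard InfoNCE: each anchor's positive is the matching row;
    all other rows in the batch act as negatives."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature  # (B, B) similarity matrix
    labels = torch.arange(anchor.size(0))         # diagonal = positives
    return F.cross_entropy(logits, labels)

# Embeddings of two matched views (e.g. action-matched frames) of each clip.
z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
print(info_nce(z1, z2).item())
```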
arXiv Detail & Related papers (2022-04-05T17:58:22Z)
- Robust Learning from Observation with Model Misspecification [33.92371002674386]
Imitation learning (IL) is a popular paradigm for training policies in robotic systems.
We propose a robust IL algorithm to learn policies that can effectively transfer to the real environment without fine-tuning.
arXiv Detail & Related papers (2022-02-12T07:04:06Z)
- Real-World Dexterous Object Manipulation based Deep Reinforcement Learning [3.4493195428573613]
We show how to use deep reinforcement learning to control a robot.
Our framework mitigates the low sample efficiency of deep reinforcement learning.
Our algorithm is trained in simulation and transferred to the real world without fine-tuning.
arXiv Detail & Related papers (2021-11-22T02:48:05Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
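NDPs embed a second-order dynamical system, in the spirit of dynamic movement primitives, into the policy's output, so the network predicts trajectory parameters rather than raw actions. A minimal rollout of such a system, with an illustrative radial-basis forcing term:

```python
import numpy as np

def dmp_rollout(x0, goal, weights, T=100, alpha=25.0, beta=6.25, dt=0.01):
    """Integrate a dynamic-movement-primitive-style system: a spring-damper
    pulled toward `goal`, shaped by a learned forcing term."""
    x, v = x0, 0.0
    s = 1.0                                        # phase variable, decays to 0
    centers = np.linspace(0, 1, len(weights))
    traj = []
    for _ in range(T):
        psi = np.exp(-50.0 * (s - centers) ** 2)   # RBF basis over the phase
        f = s * psi @ weights / (psi.sum() + 1e-8) # forcing term fades with s
        a = alpha * (beta * (goal - x) - v) + f    # attractor dynamics
        v += a * dt
        x += v * dt
        s += -2.0 * s * dt                         # canonical system
        traj.append(x)
    return np.array(traj)

traj = dmp_rollout(x0=0.0, goal=1.0, weights=np.random.randn(10))
print(traj[-1])  # converges near the goal as the forcing term vanishes
```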
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
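One way to picture the combination is uncertainty-gated switching: follow the model-based controller while the perception estimate is confident, and hand over to the learned policy otherwise. A toy sketch under that assumption (the gating rule and threshold here are ours):

```python
import numpy as np

def hybrid_controller(state_estimate, uncertainty, model_based, learned,
                      threshold=0.05):
    """Use the model-based controller when perception is confident,
    otherwise fall back to the learned RL policy."""
    if uncertainty < threshold:
        return model_based(state_estimate)
    return learned(state_estimate)

model_based = lambda s: -0.5 * s   # e.g. a simple feedback law toward the goal
learned = lambda s: np.tanh(-s)    # stand-in for a trained RL policy
s = np.array([0.2])
print(hybrid_controller(s, 0.01, model_based, learned))  # confident: model-based
print(hybrid_controller(s, 0.20, model_based, learned))  # uncertain: learned
```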
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
- State-Only Imitation Learning for Dexterous Manipulation [63.03621861920732]
In this paper, we explore state-only imitation learning.
We train an inverse dynamics model and use it to predict actions for state-only demonstrations.
Our method performs on par with state-action approaches and considerably outperforms RL alone.
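The recipe is concrete enough to sketch: fit an inverse dynamics model on the agent's own transitions, then label each consecutive state pair in the demonstrations with an inferred action for behavior cloning. An illustrative version (architecture and dimensions are ours):

```python
import torch
import torch.nn as nn

# Inverse dynamics model: (s_t, s_{t+1}) -> a_t, trained on agent transitions.
state_dim, action_dim = 4, 2
idm = nn.Sequential(nn.Linear(2 * state_dim, 64), nn.ReLU(),
                    nn.Linear(64, action_dim))
opt = torch.optim.Adam(idm.parameters(), lr=1e-3)

# Self-collected transitions with known actions supervise the model.
s, s_next = torch.randn(256, state_dim), torch.randn(256, state_dim)
a = torch.randn(256, action_dim)
for _ in range(200):
    pred = idm(torch.cat([s, s_next], dim=-1))
    loss = ((pred - a) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Label a state-only demonstration with inferred actions.
demo_states = torch.randn(50, state_dim)
with torch.no_grad():
    inferred = idm(torch.cat([demo_states[:-1], demo_states[1:]], dim=-1))
print(inferred.shape)  # torch.Size([49, 2]) -> (state, action) pairs for BC
```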
arXiv Detail & Related papers (2020-04-07T17:57:20Z)
- Learning Whole-body Motor Skills for Humanoids [25.443880385966114]
This paper presents a hierarchical framework for Deep Reinforcement Learning that acquires motor skills for a variety of push recovery and balancing behaviors.
The policy is trained in a physics simulator with a realistic robot model and low-level impedance control, making the learned skills easy to transfer to real robots.
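That hierarchy maps cleanly to code: the learned policy outputs desired joint positions, and a low-level impedance (PD) law turns them into torques at the control rate. A schematic sketch with made-up gains:

```python
import numpy as np

def impedance_torque(q, qd, q_target, kp=100.0, kd=10.0):
    """Low-level impedance control: a spring-damper toward the policy's
    joint targets, so the policy never emits raw torques."""
    return kp * (q_target - q) - kd * qd

# The high-level policy (a stand-in here) picks targets at a slower rate.
policy = lambda obs: obs["q"] + 0.1 * np.sign(obs["goal"] - obs["q"])

obs = {"q": np.zeros(3), "qd": np.zeros(3), "goal": np.ones(3)}
q_target = policy(obs)                                 # slow, learned decision
tau = impedance_torque(obs["q"], obs["qd"], q_target)  # fast, compliant tracking
print(tau)  # [10. 10. 10.]
```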
arXiv Detail & Related papers (2020-02-07T19:40:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.