Active Hierarchical Imitation and Reinforcement Learning
- URL: http://arxiv.org/abs/2012.07330v1
- Date: Mon, 14 Dec 2020 08:27:27 GMT
- Title: Active Hierarchical Imitation and Reinforcement Learning
- Authors: Yaru Niu, Yijun Gu
- Abstract summary: In this project, we explored different imitation learning algorithms and designed active learning algorithms on top of the hierarchical imitation and reinforcement learning framework we developed.
Our experimental results showed that combining DAgger with a reward-based active-learning method achieves better performance while reducing the physical and mental effort required of human trainers.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans can leverage hierarchical structures to split a task into sub-tasks
and solve problems efficiently. Both imitation and reinforcement learning or a
combination of them with hierarchical structures have been proven to be an
efficient way for robots to learn complex tasks with sparse rewards. However,
in previous work on hierarchical imitation and reinforcement learning, the
tested environments are relatively simple 2D games with discrete action
spaces. Furthermore, many imitation learning works focus on improving
policies learned from expert policies that are hard-coded or trained by
reinforcement learning algorithms, rather than from human experts. In
human-robot interaction scenarios, humans may be required to provide
demonstrations to teach the robot, so it is crucial to improve learning
efficiency to reduce expert effort, and to understand humans' perception of
the learning/training
process. In this project, we explored different imitation learning algorithms
and designed active learning algorithms on top of the hierarchical imitation
and reinforcement learning framework we developed. We performed an experiment
where five participants were asked to guide a randomly initialized agent to a
random goal in a maze. Our experimental results showed that combining DAgger
with a reward-based active-learning method achieves better performance while
reducing the physical and mental effort required of human trainers.
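The DAgger loop underlying the abstract (roll out the current policy, query the human for action labels on the states the learner actually visited, aggregate, retrain) can be sketched on a toy task. The 1D corridor environment, the lookup-table policy, and all names below are illustrative stand-ins for the paper's maze setup, not its implementation.

```python
import random

# Toy stand-in for the maze task: a 1D corridor with states 0..9 and the
# goal at state 9. The "expert" simulates a human who always steps right.
N_STATES, GOAL = 10, 9

def expert_action(state):
    """Simulated human expert: step toward the goal, stop once there."""
    return 1 if state < GOAL else 0

def rollout(policy, start=0, horizon=15):
    """Run the policy from `start` and return the list of visited states."""
    s, visited = start, []
    for _ in range(horizon):
        visited.append(s)
        s = max(0, min(N_STATES - 1, s + policy(s)))
    return visited

def dagger(iterations=10):
    dataset = {}                                  # state -> expert label, aggregated
    policy = lambda s: random.choice([-1, 0, 1])  # initial (random) policy
    for _ in range(iterations):
        # 1. Roll out the *current* learner policy, not the expert.
        for s in rollout(policy):
            # 2. Query the expert on every state the learner visited.
            dataset[s] = expert_action(s)
        # 3. Retrain on the aggregated dataset (here: a lookup table).
        policy = lambda s, d=dict(dataset): d.get(s, 0)
    return policy

learned = dagger()
```

Because the expert labels states drawn from the learner's own state distribution, errors the learner would actually make get corrected, which is the key difference from plain behavioral cloning.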
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose SPIRE, a system that first decomposes tasks into smaller learning subproblems and then combines imitation and reinforcement learning to maximize their strengths.
We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potentially suboptimal human expert.
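A minimal sketch of the core RLIF idea, interventions as the only reward signal, might look like the following tabular Q-learning loop. The corridor environment, the simulated intervention rule, and the hyperparameters are assumptions for illustration, not the paper's setup.

```python
import random

random.seed(0)  # deterministic run for illustration

# Toy 1D task: states 0..4, goal at 4. The agent never sees a task reward;
# the only learning signal is whether a (simulated) human intervened.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, 1]

def simulated_intervention(state, action):
    """Stand-in for a human overseer: intervene when the agent steps away from the goal."""
    return action == -1 and state < GOAL

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.2):
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        for _ in range(20):
            if random.random() < eps:                       # epsilon-greedy exploration
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            # RLIF-style reward: -1 iff the expert intervened, 0 otherwise.
            r = -1.0 if simulated_intervention(s, a) else 0.0
            s2 = max(0, min(N_STATES - 1, s + a))
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in ACTIONS) - q[(s, a)])
            s = s2
    return q

q = train()
greedy = lambda s: max(ACTIONS, key=lambda a: q[(s, a)])
```

Penalizing only the event "the human stepped in" makes avoiding interventions optimal, so the agent moves toward the goal without ever observing a task reward.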
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Human Decision Makings on Curriculum Reinforcement Learning with
Difficulty Adjustment [52.07473934146584]
We guide the curriculum reinforcement learning results towards a preferred performance level that is neither too hard nor too easy via learning from the human decision process.
Our system is highly parallelizable, making it possible for a human to train large-scale reinforcement learning applications.
It shows that reinforcement learning performance can successfully adjust in sync with the human-desired difficulty level.
arXiv Detail & Related papers (2022-08-04T23:53:51Z) - Physics-Guided Hierarchical Reward Mechanism for Learning-Based Robotic
Grasping [10.424363966870775]
We develop a Physics-Guided Deep Reinforcement Learning with a Hierarchical Reward Mechanism to improve learning efficiency and generalizability for learning-based autonomous grasping.
Our method is validated in robotic grasping tasks with a 3-finger MICO robot arm.
arXiv Detail & Related papers (2022-05-26T18:01:56Z) - Divide & Conquer Imitation Learning [75.31752559017978]
Imitation Learning can be a powerful approach to bootstrap the learning process.
We present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory.
We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.
arXiv Detail & Related papers (2022-04-15T09:56:50Z) - Prioritized Experience-based Reinforcement Learning with Human Guidance:
Methodology and Application to Autonomous Driving [2.5895890901896124]
Reinforcement learning requires skillful problem definition and substantial computational effort to solve optimization and control problems.
In this paper, a comprehensive human guidance-based reinforcement learning framework is established.
A novel prioritized experience replay mechanism that adapts to human guidance is proposed to boost the efficiency and performance of the reinforcement learning algorithm.
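One plausible realization of a guidance-adaptive prioritized replay (an assumption for illustration, not the paper's exact mechanism) is to add a priority bonus to transitions that came from human guidance, so they are sampled more frequently than ordinary agent experience:

```python
import random

random.seed(0)  # deterministic sampling for illustration

class GuidedReplayBuffer:
    """Prioritized replay sketch: priority = |TD error| + bonus for human-guided
    transitions. The bonus scheme and API are illustrative assumptions."""

    def __init__(self, human_bonus=2.0, eps=1e-3):
        self.items = []                    # list of (transition, priority)
        self.human_bonus = human_bonus
        self.eps = eps                     # keeps every priority strictly positive

    def add(self, transition, td_error, from_human):
        priority = abs(td_error) + self.eps
        if from_human:
            priority += self.human_bonus   # human-guided data replayed more often
        self.items.append((transition, priority))

    def sample(self, k):
        transitions, priorities = zip(*self.items)
        return random.choices(transitions, weights=priorities, k=k)

buf = GuidedReplayBuffer()
buf.add(("s0", "left", 0.0, "s1"), td_error=0.1, from_human=False)
buf.add(("s0", "right", 0.0, "s2"), td_error=0.1, from_human=True)
batch = buf.sample(1000)
n_human = sum(1 for t in batch if t[1] == "right")
```

With equal TD errors, the human-guided transition dominates the sampling distribution (weight 2.101 vs. 0.101 here), which is one simple way human guidance could "boost the efficiency" of the underlying algorithm.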
arXiv Detail & Related papers (2021-09-26T07:19:26Z) - Human-in-the-Loop Methods for Data-Driven and Reinforcement Learning
Systems [0.8223798883838329]
This research investigates how to integrate human interaction modalities into the reinforcement learning loop.
Results show that the reward signal that is learned based upon human interaction accelerates the rate of learning of reinforcement learning algorithms.
arXiv Detail & Related papers (2020-08-30T17:28:18Z) - Emergent Real-World Robotic Skills via Unsupervised Off-Policy
Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z) - Learn Task First or Learn Human Partner First: A Hierarchical Task
Decomposition Method for Human-Robot Cooperation [11.387868752604986]
This work proposes a novel task decomposition method with a hierarchical reward mechanism that enables the robot to learn the hierarchical dynamic control task separately from learning the human partner's behavior.
The results show that the robot should learn the task first to achieve higher team performance and learn the human first to achieve higher learning efficiency.
arXiv Detail & Related papers (2020-03-01T04:41:49Z) - Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.