Learning to Guide Multiple Heterogeneous Actors from a Single Human
Demonstration via Automatic Curriculum Learning in StarCraft II
- URL: http://arxiv.org/abs/2205.05784v1
- Date: Wed, 11 May 2022 21:53:11 GMT
- Title: Learning to Guide Multiple Heterogeneous Actors from a Single Human
Demonstration via Automatic Curriculum Learning in StarCraft II
- Authors: Nicholas Waytowich, James Hare, Vinicius G. Goecks, Mark Mittrick,
John Richardson, Anjon Basak, Derrik E. Asher
- Abstract summary: In this work, we aim to train deep reinforcement learning agents that can command multiple heterogeneous actors.
Our results show that an agent trained via automated curriculum learning can outperform state-of-the-art deep reinforcement learning baselines.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditionally, learning from human demonstrations via direct behavior cloning
can lead to high-performance policies, provided that the algorithm has access to
large amounts of high-quality data covering the most likely scenarios to be
encountered when the agent is operating. However, in real-world scenarios,
expert data is limited, and it is desirable to train an agent whose behavior
policy is general enough to handle situations that were not demonstrated by
the human expert. An alternative is to learn these policies without supervision
via deep reinforcement learning; however, these algorithms require substantial
computing time to perform well on complex tasks with high-dimensional state and
action spaces, such as those found in StarCraft II.
Automatic curriculum learning is a recent mechanism comprising techniques
designed to speed up deep reinforcement learning by adjusting the difficulty of
the current task according to the agent's current capabilities.
Designing a proper curriculum, however, can be challenging for sufficiently
complex tasks, and thus we leverage human demonstrations as a way to guide
agent exploration during training. In this work, we aim to train deep
reinforcement learning agents that can command multiple heterogeneous actors,
where the starting positions and overall difficulty of the task are controlled
by an automatically generated curriculum derived from a single human
demonstration. Our results show that an agent trained via automatic curriculum
learning can outperform state-of-the-art deep reinforcement learning baselines
and match the performance of the human expert in a simulated command-and-control
task in StarCraft II modeled on a real military scenario.
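The abstract does not spell out the curriculum rule itself, but a common way to build an automatic curriculum from a single demonstration is to start episodes near the end of the demonstrated trajectory and move the start point earlier as the agent's success rate improves. The Python sketch below illustrates that idea only; all names (DemoCurriculum, demo_states, the promotion/demotion thresholds) are hypothetical assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch: a reverse curriculum built from one demonstration.
# Episodes begin at states taken from the demo; the start index moves
# earlier (harder) or later (easier) based on a rolling success rate.
from collections import deque

class DemoCurriculum:
    def __init__(self, demo_states, window=50, promote_at=0.7, demote_at=0.3):
        self.demo_states = demo_states       # states recorded from the human demo
        self.idx = len(demo_states) - 1      # start near the demo's end (easiest)
        self.results = deque(maxlen=window)  # rolling record of episode outcomes
        self.promote_at = promote_at         # success rate that raises difficulty
        self.demote_at = demote_at           # success rate that lowers difficulty

    def start_state(self):
        """State the environment should be reset to for the next episode."""
        return self.demo_states[self.idx]

    def report(self, success: bool):
        """Record an episode outcome and adjust the start index if needed."""
        self.results.append(success)
        if len(self.results) < self.results.maxlen:
            return
        rate = sum(self.results) / len(self.results)
        if rate > self.promote_at and self.idx > 0:
            self.idx -= 1                    # start earlier: task gets harder
            self.results.clear()
        elif rate < self.demote_at and self.idx < len(self.demo_states) - 1:
            self.idx += 1                    # start later: task gets easier
            self.results.clear()
```

In a training loop, the trainer would reset the environment to start_state(), run an episode, and call report() with the outcome; the same success-rate rule could in principle also scale other scenario parameters, such as the starting positions of the heterogeneous actors.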
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose SPIRE, a system that first decomposes tasks into smaller learning subproblems and then combines imitation and reinforcement learning to maximize their strengths.
We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z)
- Reinforcement Learning for UAV control with Policy and Reward Shaping [0.7127008801193563]
This study teaches an RL agent to control a drone using reward-shaping and policy-shaping techniques simultaneously.
The results show that an agent trained with both techniques simultaneously obtains a lower reward than an agent trained using only a policy-based approach (a generic sketch of the two shaping techniques follows this entry).
arXiv Detail & Related papers (2022-12-06T14:46:13Z)
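As a rough illustration only: this summary does not give the paper's exact shaping functions, so the sketch below shows the generic forms of the two techniques named above, namely potential-based reward shaping and a simple policy-shaping blend of the agent's action distribution with an external advice distribution. The function names and parameters are assumptions.

```python
# Illustrative sketch, not the paper's implementation.
import numpy as np

def shaped_reward(reward, s, s_next, potential, gamma=0.99):
    """Potential-based reward shaping: adding F = gamma * phi(s') - phi(s)
    preserves the optimal policy (Ng et al., 1999)."""
    return reward + gamma * potential(s_next) - potential(s)

def shaped_policy(agent_probs, advice_probs, weight=0.5):
    """Policy shaping: blend the agent's action probabilities with an
    external advice distribution, then renormalize."""
    mixed = (1.0 - weight) * np.asarray(agent_probs) + weight * np.asarray(advice_probs)
    return mixed / mixed.sum()
```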
- Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment [25.186525630548356]
Sirius is a principled framework for humans and robots to collaborate through a division of work.
Partially autonomous robots are tasked with handling a major portion of decision-making where they work reliably.
We introduce a new learning algorithm that improves the policy's performance using the data collected from task executions.
arXiv Detail & Related papers (2022-11-15T18:53:39Z)
- Human Decision Makings on Curriculum Reinforcement Learning with Difficulty Adjustment [52.07473934146584]
We guide curriculum reinforcement learning toward a preferred performance level that is neither too hard nor too easy by learning from the human decision process.
Our system is highly parallelizable, making it possible for a human to train large-scale reinforcement learning applications.
Results show that reinforcement learning performance can adjust in sync with the human-desired difficulty level (a minimal sketch of such an adjustment rule follows this entry).
arXiv Detail & Related papers (2022-08-04T23:53:51Z)
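The summary above does not specify how difficulty tracks the preferred level, so the following is a minimal, hypothetical adjustment rule: nudge a scalar difficulty parameter toward whatever success rate the human has chosen as the target. All names and constants are assumptions.

```python
# Hypothetical sketch: keep task difficulty near a human-chosen success rate.
def adjust_difficulty(difficulty, success_rate, target=0.5, tolerance=0.05,
                      step=0.05, lo=0.0, hi=1.0):
    """Raise difficulty when the agent succeeds more often than the preferred
    level; lower it when the task is proving too hard."""
    if success_rate > target + tolerance:
        difficulty += step
    elif success_rate < target - tolerance:
        difficulty -= step
    return max(lo, min(hi, difficulty))  # clamp to the valid range
```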
- Divide & Conquer Imitation Learning [75.31752559017978]
Imitation Learning can be a powerful approach to bootstrap the learning process.
We present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory.
We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.
arXiv Detail & Related papers (2022-04-15T09:56:50Z)
- Active Hierarchical Imitation and Reinforcement Learning [0.0]
In this project, we explored different imitation learning algorithms and designed active learning algorithms on top of the hierarchical imitation and reinforcement learning framework we developed.
Our experimental results showed that using DAgger and a reward-based active learning method achieves better performance while reducing the physical and mental effort required of the human during training.
arXiv Detail & Related papers (2020-12-14T08:27:27Z)
- Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
arXiv Detail & Related papers (2020-12-12T05:30:35Z)
- Continual Learning of Control Primitives: Skill Discovery via Reset-Games [128.36174682118488]
We show how a single method can allow an agent to acquire skills with minimal supervision.
We do this by exploiting the insight that the need to "reset" an agent to a broad set of initial states for a learning task provides a natural setting to learn a diverse set of "reset-skills".
arXiv Detail & Related papers (2020-11-10T18:07:44Z) - Planning to Explore via Self-Supervised World Models [120.31359262226758]
Plan2Explore is a self-supervised reinforcement learning agent.
We present a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
arXiv Detail & Related papers (2020-05-12T17:59:45Z)