Learning to Guide Multiple Heterogeneous Actors from a Single Human
Demonstration via Automatic Curriculum Learning in StarCraft II
- URL: http://arxiv.org/abs/2205.05784v1
- Date: Wed, 11 May 2022 21:53:11 GMT
- Title: Learning to Guide Multiple Heterogeneous Actors from a Single Human
Demonstration via Automatic Curriculum Learning in StarCraft II
- Authors: Nicholas Waytowich, James Hare, Vinicius G. Goecks, Mark Mittrick,
John Richardson, Anjon Basak, Derrik E. Asher
- Abstract summary: In this work, we aim to train deep reinforcement learning agents that can command multiple heterogeneous actors.
Our results show that an agent trained via automated curriculum learning can outperform state-of-the-art deep reinforcement learning baselines.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditionally, learning from human demonstrations via direct behavior cloning
can lead to high-performance policies, provided that the algorithm has access to
large amounts of high-quality data covering the most likely scenarios to be
encountered when the agent is operating. However, in real-world scenarios,
expert data is limited, and it is desirable to train an agent whose behavior
policy is general enough to handle situations that were not demonstrated by
the human expert. An alternative is to learn these policies without supervision
via deep reinforcement learning; however, these algorithms require substantial
computing time to perform well on complex tasks with high-dimensional state and
action spaces, such as those found in StarCraft II.
Automatic curriculum learning is a recent mechanism comprising techniques
designed to speed up deep reinforcement learning by adjusting the difficulty of
the current task according to the agent's current capabilities.
Designing a proper curriculum, however, can be challenging for sufficiently
complex tasks, and thus we leverage human demonstrations as a way to guide
agent exploration during training. In this work, we aim to train deep
reinforcement learning agents that can command multiple heterogeneous actors,
where the starting positions and overall difficulty of the task are controlled
by an automatically generated curriculum derived from a single human
demonstration. Our results show that an agent trained via automatic curriculum
learning can outperform state-of-the-art deep reinforcement learning baselines
and match the performance of the human expert in a simulated command-and-control
task in StarCraft II modeled on a real military scenario.
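The abstract does not spell out the curriculum rule itself, but a common way to build an automatic curriculum from a single demonstration is to start episodes near the end of the demonstrated trajectory and move the start point earlier as the agent's success rate improves. The Python sketch below illustrates that idea only; all names (DemoCurriculum, demo_states, the promotion/demotion thresholds) are hypothetical assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch: a reverse curriculum built from one demonstration.
# Episodes begin at states taken from the demo; the start index moves
# earlier (harder) or later (easier) based on a rolling success rate.
from collections import deque

class DemoCurriculum:
    def __init__(self, demo_states, window=50, promote_at=0.7, demote_at=0.3):
        self.demo_states = demo_states       # states recorded from the human demo
        self.idx = len(demo_states) - 1      # start near the demo's end (easiest)
        self.results = deque(maxlen=window)  # rolling record of episode outcomes
        self.promote_at = promote_at         # success rate that raises difficulty
        self.demote_at = demote_at           # success rate that lowers difficulty

    def start_state(self):
        """State the environment should be reset to for the next episode."""
        return self.demo_states[self.idx]

    def report(self, success: bool):
        """Record an episode outcome and adjust the start index if needed."""
        self.results.append(success)
        if len(self.results) < self.results.maxlen:
            return
        rate = sum(self.results) / len(self.results)
        if rate > self.promote_at and self.idx > 0:
            self.idx -= 1                    # start earlier: task gets harder
            self.results.clear()
        elif rate < self.demote_at and self.idx < len(self.demo_states) - 1:
            self.idx += 1                    # start later: task gets easier
            self.results.clear()
```

In a training loop, the trainer would reset the environment to start_state(), run an episode, and call report() with the outcome; the same success-rate rule could in principle also scale other scenario parameters, such as the starting positions of the heterogeneous actors.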
Related papers
- SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation [58.14969377419633]
We propose SPIRE, a system that first decomposes tasks into smaller learning subproblems and then combines imitation and reinforcement learning to maximize their strengths.
We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance.
arXiv Detail & Related papers (2024-10-23T17:42:07Z)
- Reinforcement Learning for UAV control with Policy and Reward Shaping [0.7127008801193563]
This study teaches an RL agent to control a drone using reward-shaping and policy-shaping techniques simultaneously.
The results show that an agent trained with both techniques simultaneously obtains a lower reward than an agent trained using only a policy-based approach (a generic sketch of the two shaping techniques follows this entry).
arXiv Detail & Related papers (2022-12-06T14:46:13Z)
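As a rough illustration only: this summary does not give the paper's exact shaping functions, so the sketch below shows the generic forms of the two techniques named above, namely potential-based reward shaping and a simple policy-shaping blend of the agent's action distribution with an external advice distribution. The function names and parameters are assumptions.

```python
# Illustrative sketch, not the paper's implementation.
import numpy as np

def shaped_reward(reward, s, s_next, potential, gamma=0.99):
    """Potential-based reward shaping: adding F = gamma * phi(s') - phi(s)
    preserves the optimal policy (Ng et al., 1999)."""
    return reward + gamma * potential(s_next) - potential(s)

def shaped_policy(agent_probs, advice_probs, weight=0.5):
    """Policy shaping: blend the agent's action probabilities with an
    external advice distribution, then renormalize."""
    mixed = (1.0 - weight) * np.asarray(agent_probs) + weight * np.asarray(advice_probs)
    return mixed / mixed.sum()
```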
- Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment [25.186525630548356]
Sirius is a principled framework for humans and robots to collaborate through a division of work.
Partially autonomous robots are tasked with handling a major portion of decision-making where they work reliably.
We introduce a new learning algorithm that improves the policy's performance using the data collected from task executions.
arXiv Detail & Related papers (2022-11-15T18:53:39Z)
- Human Decision Makings on Curriculum Reinforcement Learning with Difficulty Adjustment [52.07473934146584]
We guide curriculum reinforcement learning toward a preferred performance level that is neither too hard nor too easy by learning from the human decision process.
Our system is highly parallelizable, making it possible for a human to train large-scale reinforcement learning applications.
Results show that reinforcement learning performance can adjust in sync with the human-desired difficulty level (a minimal sketch of such an adjustment rule follows this entry).
arXiv Detail & Related papers (2022-08-04T23:53:51Z)
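The summary above does not specify how difficulty tracks the preferred level, so the following is a minimal, hypothetical adjustment rule: nudge a scalar difficulty parameter toward whatever success rate the human has chosen as the target. All names and constants are assumptions.

```python
# Hypothetical sketch: keep task difficulty near a human-chosen success rate.
def adjust_difficulty(difficulty, success_rate, target=0.5, tolerance=0.05,
                      step=0.05, lo=0.0, hi=1.0):
    """Raise difficulty when the agent succeeds more often than the preferred
    level; lower it when the task is proving too hard."""
    if success_rate > target + tolerance:
        difficulty += step
    elif success_rate < target - tolerance:
        difficulty -= step
    return max(lo, min(hi, difficulty))  # clamp to the valid range
```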
- Divide & Conquer Imitation Learning [75.31752559017978]
Imitation Learning can be a powerful approach to bootstrap the learning process.
We present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory.
We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.
arXiv Detail & Related papers (2022-04-15T09:56:50Z)
- Active Hierarchical Imitation and Reinforcement Learning [0.0]
In this project, we explored different imitation learning algorithms and designed active learning algorithms on top of the hierarchical imitation and reinforcement learning framework we developed.
Our experimental results showed that using DAgger and a reward-based active learning method achieves better performance while reducing the physical and mental effort required of the human during training.
arXiv Detail & Related papers (2020-12-14T08:27:27Z)
- Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
arXiv Detail & Related papers (2020-12-12T05:30:35Z)
- Continual Learning of Control Primitives: Skill Discovery via Reset-Games [128.36174682118488]
We show how a single method can allow an agent to acquire skills with minimal supervision.
We do this by exploiting the insight that the need to "reset" an agent to a broad set of initial states for a learning task provides a natural setting to learn a diverse set of "reset-skills".
arXiv Detail & Related papers (2020-11-10T18:07:44Z) - Planning to Explore via Self-Supervised World Models [120.31359262226758]
Plan2Explore is a self-supervised reinforcement learning agent.
We present a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
arXiv Detail & Related papers (2020-05-12T17:59:45Z)