Adaptive action supervision in reinforcement learning from real-world
multi-agent demonstrations
- URL: http://arxiv.org/abs/2305.13030v4
- Date: Tue, 19 Dec 2023 13:29:33 GMT
- Title: Adaptive action supervision in reinforcement learning from real-world
multi-agent demonstrations
- Authors: Keisuke Fujii, Kazushi Tsutsui, Atom Scott, Hiroshi Nakahara, Naoya
Takeishi, Yoshinobu Kawahara
- Abstract summary: We propose a method for adaptive action supervision in RL from real-world demonstrations in multi-agent scenarios.
In experiments using chase-and-escape and football tasks with different dynamics between the unknown source and target environments, we show that our approach achieves a balance between reproducibility and generalization ability compared with the baselines.
- Score: 10.174009792409928
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Modeling real-world biological multi-agent systems is a fundamental problem in various scientific and engineering fields. Reinforcement learning (RL) is a powerful framework for generating flexible and diverse behaviors in cyberspace; however, when modeling real-world biological multi-agents, there is a domain gap between behaviors in the source (i.e., real-world data) and the target (i.e., cyberspace for RL), and the source environment parameters are usually unknown. In this paper, we propose a method for adaptive action supervision in RL from real-world demonstrations in multi-agent scenarios. Our approach combines RL and supervised learning: during RL, it selects the actions of the demonstration with the minimum dynamic-time-warping distance, thereby exploiting information about the unknown source dynamics. The approach can be easily applied to many existing neural network architectures and yields an RL model that balances reproducibility (imitating the demonstrations) and generalization (the ability to obtain rewards in cyberspace). In experiments on chase-and-escape and football tasks with different dynamics between the unknown source and target environments, we show that our approach achieves a better balance between reproducibility and generalization than the baselines. In particular, using tracking data of professional football players as expert demonstrations, it performs well despite a larger gap between source and target behaviors than in the chase-and-escape task.
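
The selection step above is, at its core, a nearest-demonstration match under dynamic time warping (DTW). Below is a minimal sketch of that step, assuming demonstrations are stored as (states, actions) array pairs; the function names and storage format are illustrative rather than taken from the paper.

```python
import numpy as np

def dtw_distance(traj_a, traj_b):
    """Dynamic-time-warping distance between two state trajectories
    (arrays of shape [T, state_dim]) with Euclidean frame cost."""
    n, m = len(traj_a), len(traj_b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(traj_a[i - 1] - traj_b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j],       # insertion
                                   acc[i, j - 1],       # deletion
                                   acc[i - 1, j - 1])   # match
    return acc[n, m]

def select_supervision(rollout_states, demos):
    """Return the actions of the demonstration whose state trajectory
    is closest to the agent's rollout under DTW.  `demos` is a list of
    (states, actions) pairs -- a hypothetical storage format."""
    distances = [dtw_distance(rollout_states, states) for states, _ in demos]
    return demos[int(np.argmin(distances))][1]
```

In training, the returned actions would enter a supervised loss term added to the usual RL objective; how that supervision is weighted and adapted over training is the adaptive part of the method and is not reproduced in this sketch.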
Related papers
- Online Decision MetaMorphFormer: A Causal Transformer-Based Reinforcement Learning Framework of Universal Embodied Intelligence [2.890656584329591]
Online Decision MetaMorphFormer (ODM) aims to achieve self-awareness, environment recognition, and action planning.
ODM can be applied to an arbitrary agent with a multi-joint body, located in different environments, and trained on different types of tasks using large-scale pre-training datasets.
arXiv Detail & Related papers (2024-09-11T15:22:43Z)
- An Interactive Agent Foundation Model [49.77861810045509]
We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents.
Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction.
We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare.
arXiv Detail & Related papers (2024-02-08T18:58:02Z)
- The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions [14.778024171498208]
Reinforcement learning algorithms have proven transformative in a range of domains.
Much theory of RL has focused on discrete state spaces or worst-case analysis.
We propose a solvable high-dimensional model of RL that can capture a variety of learning protocols.
arXiv Detail & Related papers (2023-06-17T18:16:51Z)
- Predictive Experience Replay for Continual Visual Control and Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
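As a rough illustration of a mixture-of-Gaussians dynamics prior of the kind described above (a generic sketch under common world-model conventions, not the paper's architecture; all names are hypothetical):

```python
import torch
import torch.nn as nn

class MixtureDynamics(nn.Module):
    """Toy mixture world model: p(s' | s, a) is a K-component diagonal
    Gaussian mixture, so each component can specialize to one task's
    dynamics.  Purely illustrative."""
    def __init__(self, state_dim, action_dim, k=4, hidden=128):
        super().__init__()
        self.k, self.state_dim = k, state_dim
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            # k mixture logits + k means + k log-stds over the next state
            nn.Linear(hidden, k * (1 + 2 * state_dim)),
        )

    def forward(self, state, action):
        out = self.net(torch.cat([state, action], dim=-1))
        logits, rest = out[..., :self.k], out[..., self.k:]
        means, log_stds = rest.chunk(2, dim=-1)
        shape = state.shape[:-1] + (self.k, self.state_dim)
        return logits, means.view(shape), log_stds.view(shape)

    def log_prob(self, state, action, next_state):
        """Mixture log-likelihood of the observed next state."""
        logits, means, log_stds = self(state, action)
        comp = torch.distributions.Normal(means, log_stds.exp())
        log_pc = comp.log_prob(next_state.unsqueeze(-2)).sum(-1)  # [..., k]
        return torch.logsumexp(torch.log_softmax(logits, -1) + log_pc, -1)
```

Training would maximize `log_prob` on observed transitions; a separate mechanism (such as the paper's training strategy) is still needed to prevent catastrophic forgetting across tasks.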
arXiv Detail & Related papers (2023-03-12T05:08:03Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that addresses this sample inefficiency by using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resilience to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting promising trajectories that solve prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Non-Markovian Reinforcement Learning using Fractional Dynamics [3.000697999889031]
Reinforcement learning (RL) is a technique to learn the control policy for an agent that interacts with an environment.
In this paper, we propose a model-based RL technique for a system that has non-Markovian dynamics.
Such environments are common in many real-world applications such as in human physiology, biological systems, material science, and population dynamics.
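Fractional-order dynamics are one standard way to obtain such long-memory, non-Markovian behavior: each new state depends on the entire history through slowly decaying Grünwald-Letnikov coefficients. A generic sketch of a linear fractional-difference system follows (illustrative of the idea, not the paper's specific model):

```python
import numpy as np

def gl_coeffs(alpha, n):
    """First n Grunwald-Letnikov coefficients c_j = (-1)^j * C(alpha, j),
    via the recurrence c_0 = 1, c_j = c_{j-1} * (1 - (alpha + 1) / j)."""
    c = np.empty(n)
    c[0] = 1.0
    for j in range(1, n):
        c[j] = c[j - 1] * (1.0 - (alpha + 1.0) / j)
    return c

def simulate_fractional(A, B, u, x0, alpha=0.7):
    """Simulate x[k+1] = A x[k] + B u[k] - sum_{j=1..k+1} c_j x[k+1-j],
    i.e. a linear system whose fractional-difference memory term makes
    every past state influence the next one (non-Markovian)."""
    T = len(u)
    xs = np.zeros((T + 1, x0.shape[0]))
    xs[0] = x0
    c = gl_coeffs(alpha, T + 1)
    for k in range(T):
        memory = sum(c[j] * xs[k + 1 - j] for j in range(1, k + 2))
        xs[k + 1] = A @ xs[k] + B @ u[k] - memory
    return xs
```

Because the memory term never truncates, a model that conditions only on the last state would systematically misfit such a system, which motivates a model-based treatment of the history.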
arXiv Detail & Related papers (2021-07-29T07:35:13Z)
- Scenic4RL: Programmatic Modeling and Generation of Reinforcement Learning Environments [89.04823188871906]
Generation of diverse realistic scenarios is challenging for real-time strategy (RTS) environments.
Most existing simulators rely on randomly generated environments.
We demonstrate the benefits of adopting an existing formal scenario specification language, SCENIC, to assist researchers.
arXiv Detail & Related papers (2021-06-18T21:49:46Z)
- The AI Arena: A Framework for Distributed Multi-Agent Reinforcement Learning [0.3437656066916039]
We introduce the AI Arena: a scalable framework with flexible abstractions for distributed multi-agent reinforcement learning.
We show performance gains due to a distributed multi-agent learning approach over commonly-used RL techniques in several different learning environments.
arXiv Detail & Related papers (2021-03-09T22:16:19Z)
- Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning [137.39196753245105]
We present a new model-based reinforcement learning algorithm that learns a multi-headed dynamics model for dynamics generalization.
We incorporate context learning, which encodes dynamics-specific information from past experiences into the context latent vector.
Our method exhibits superior zero-shot generalization performance across a variety of control tasks, compared to state-of-the-art RL methods.
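Context learning of this kind is commonly implemented by encoding a short window of recent transitions into a latent vector that conditions each dynamics head; a generic sketch under that assumption (names and architecture are illustrative, not the paper's code):

```python
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Embed a window of past transitions (s, a, s') and mean-pool them
    into one context latent that summarizes the current dynamics."""
    def __init__(self, state_dim, action_dim, ctx_dim=32, hidden=64):
        super().__init__()
        self.embed = nn.Sequential(
            nn.Linear(2 * state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, ctx_dim),
        )

    def forward(self, states, actions, next_states):
        # states, next_states: [B, W, state_dim]; actions: [B, W, action_dim]
        transitions = torch.cat([states, actions, next_states], dim=-1)
        return self.embed(transitions).mean(dim=1)  # [B, ctx_dim]
```

The pooled context vector would then be concatenated with (state, action) at the input of each dynamics head, letting different heads specialize to different dynamics regimes.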
arXiv Detail & Related papers (2020-10-26T03:20:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.