Human-Inspired Multi-Agent Navigation using Knowledge Distillation
- URL: http://arxiv.org/abs/2103.10000v5
- Date: Tue, 29 Aug 2023 00:09:24 GMT
- Title: Human-Inspired Multi-Agent Navigation using Knowledge Distillation
- Authors: Pei Xu and Ioannis Karamouzas
- Abstract summary: We propose a framework for learning a human-like general collision avoidance policy for agent-agent interactions.
Our approach uses knowledge distillation with reinforcement learning to shape the reward function.
We show that agents trained with our approach can take human-like trajectories in collision avoidance and goal-directed steering tasks.
- Score: 4.659427498118277
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite significant advancements in the field of multi-agent navigation,
agents still lack the sophistication and intelligence that humans exhibit in
multi-agent settings. In this paper, we propose a framework for learning a
human-like general collision avoidance policy for agent-agent interactions in
fully decentralized, multi-agent environments. Our approach uses knowledge
distillation with reinforcement learning to shape the reward function based on
expert policies extracted from human trajectory demonstrations through behavior
cloning. We show that agents trained with our approach can take human-like
trajectories in collision avoidance and goal-directed steering tasks not
provided by the demonstrations, outperforming the experts as well as
learning-based agents trained without knowledge distillation.
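The approach described above can be pictured with a short sketch: a behavior-cloned expert, trained on human trajectory demonstrations, supplies a distillation term that is folded into the reinforcement-learning reward. This is a minimal illustration only; the names (bc_expert, task_reward, distill_weight) and the Gym-style environment interface are assumptions made for the sketch, and the linear combination below is one plausible instantiation rather than the paper's exact formulation.

```python
import numpy as np

def shaped_reward(obs, action, task_reward, bc_expert, distill_weight=0.5):
    """Combine the environment's task reward with a knowledge-distillation
    term that rewards staying close to the behavior-cloned expert's action.

    bc_expert(obs) is assumed to return the expert's predicted action
    (e.g., a preferred 2D velocity) for the current observation.
    """
    expert_action = np.asarray(bc_expert(obs))
    # Distillation term: negative distance to the expert's action, so
    # imitating the human-derived expert is rewarded rather than enforced.
    distill_term = -np.linalg.norm(np.asarray(action) - expert_action)
    return task_reward + distill_weight * distill_term

def rollout(env, policy, bc_expert, horizon=200):
    """Collect one episode with the shaped reward (Gym-style env assumed)."""
    obs = env.reset()
    trajectory = []
    for _ in range(horizon):
        action = policy(obs)                      # the agent's own policy
        next_obs, task_reward, done, _ = env.step(action)
        reward = shaped_reward(obs, action, task_reward, bc_expert)
        trajectory.append((obs, action, reward))
        obs = next_obs
        if done:
            break
    return trajectory
```

Any policy-gradient learner can consume the resulting trajectories unchanged; only the reward signal is altered, keeping the rest of the training loop standard.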
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z) - Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis [14.656957226255628]
We introduce a model-agnostic method for discovery of behavior clusters in multiagent domains.
Our framework makes no assumption about agents' underlying learning algorithms, does not require access to their latent states or models, and can be trained using entirely offline observational data.
arXiv Detail & Related papers (2022-06-17T23:07:33Z) - Explaining Reinforcement Learning Policies through Counterfactual Trajectories [147.7246109100945]
A human developer must validate that an RL agent will perform well at test-time.
Our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution.
In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.
arXiv Detail & Related papers (2022-01-29T00:52:37Z) - Relative Distributed Formation and Obstacle Avoidance with Multi-agent Reinforcement Learning [20.401609420707867]
We propose a distributed formation and obstacle avoidance method based on multi-agent reinforcement learning (MARL).
Compared with baselines, our method achieves lower formation error, faster formation convergence, and an on-par success rate for obstacle avoidance.
arXiv Detail & Related papers (2021-11-14T13:02:45Z) - Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z) - PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z) - Learning Latent Representations to Influence Multi-Agent Interaction [65.44092264843538]
We propose a reinforcement learning-based framework for learning latent representations of an agent's policy.
We show that our approach outperforms the alternatives and learns to influence the other agent.
arXiv Detail & Related papers (2020-11-12T19:04:26Z) - Learning to Incentivize Other Learning Agents [73.03133692589532]
We show how to equip RL agents with the ability to give rewards directly to other agents, using a learned incentive function.
Such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games.
Our work highlights further opportunities and challenges on the path to ensuring the common good in a multi-agent future.
arXiv Detail & Related papers (2020-06-10T20:12:38Z) - Skill Discovery of Coordination in Multi-agent Reinforcement Learning [41.67943127631515]
We propose "Multi-agent Skill Discovery"(MASD), a method for discovering skills for coordination patterns of multiple agents.
We show the emergence of various skills on the level of coordination in a general particle multi-agent environment.
We also reveal that the "bottleneck" prevents skills from collapsing to a single agent and enhances the diversity of learned skills.
arXiv Detail & Related papers (2020-06-07T02:04:15Z) - Towards Learning Multi-agent Negotiations via Self-Play [2.28438857884398]
We show how an iterative procedure of self-play can create progressively more diverse environments (a minimal sketch of such a loop follows this list).
This leads to the learning of sophisticated and robust multi-agent policies.
We show a dramatic improvement in the success rate of merging maneuvers from 63% to over 98%.
arXiv Detail & Related papers (2020-01-28T08:37:33Z)
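The iterative self-play procedure mentioned in the last entry can be sketched as a loop in which the learner repeatedly trains against earlier versions of itself, so the environments it faces become progressively more diverse. This is a minimal sketch under stated assumptions: env_factory, train, and init_policy are hypothetical placeholders, not identifiers from that paper's code.

```python
import random

def self_play_training(env_factory, init_policy, train, n_iterations=10):
    """Iterative self-play: train against a growing pool of past selves.

    `train(env, policy)` is assumed to run one round of RL and return an
    improved policy; `env_factory(opponent)` is assumed to build a
    multi-agent environment whose other agents follow `opponent`.
    """
    policy = init_policy
    opponent_pool = [init_policy]
    for _ in range(n_iterations):
        # Sample a past version of the policy to control the other agents.
        opponent = random.choice(opponent_pool)
        env = env_factory(opponent)
        policy = train(env, policy)
        # Grow the pool so later iterations see a wider mix of behaviors.
        opponent_pool.append(policy)
    return policy
```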
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented here and is not responsible for any consequences of its use.