Diversity in Action: General-Sum Multi-Agent Continuous Inverse Optimal Control
- URL: http://arxiv.org/abs/2004.12678v1
- Date: Mon, 27 Apr 2020 09:53:52 GMT
- Authors: Christian Muench, Frans A. Oliehoek, Dariu M. Gavrila
- Abstract summary: We present a game-theoretic method that addresses all of the points above.
Compared to many existing methods used for AVs, our approach 1) does not require perfect communication, and 2) allows for individual rewards per agent.
Our experiments demonstrate that these more realistic assumptions lead to qualitatively and quantitatively different reward inference and prediction of future actions that match better with expected real-world behaviour.
- Score: 12.771478252695399
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traffic scenarios are inherently interactive. Multiple decision-makers
predict the actions of others and choose strategies that maximize their
rewards. We view these interactions from the perspective of game theory which
introduces various challenges. Humans are not entirely rational, their rewards
need to be inferred from real-world data, and any prediction algorithm needs to
be real-time capable so that we can use it in an autonomous vehicle (AV). In
this work, we present a game-theoretic method that addresses all of the points
above. Compared to many existing methods used for AVs, our approach 1) does not
require perfect communication, and 2) allows for individual rewards per agent.
Our experiments demonstrate that these more realistic assumptions lead to
qualitatively and quantitatively different reward inference and prediction of
future actions that match better with expected real-world behaviour.
Related papers
- Auto-Encoding Bayesian Inverse Games [36.06617326128679]
We consider the inverse game problem, in which some properties of the game are unknown a priori.
Existing maximum likelihood estimation approaches to solve inverse games provide only point estimates of unknown parameters.
We take a Bayesian perspective and construct posterior distributions of game parameters.
This structured VAE can be trained from an unlabeled dataset of observed interactions.
arXiv Detail & Related papers (2024-02-14T02:17:37Z)
- A Minimaximalist Approach to Reinforcement Learning from Human Feedback [49.45285664482369]
We present Self-Play Preference Optimization (SPO), an algorithm for reinforcement learning from human feedback.
Our approach is minimalist in that it does not require training a reward model nor unstable adversarial training.
We demonstrate that on a suite of continuous control tasks, we are able to learn significantly more efficiently than reward-model based approaches.
arXiv Detail & Related papers (2024-01-08T17:55:02Z)
- Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences [53.353022588751585]
We present Promptable Behaviors, a novel framework that facilitates efficient personalization of robotic agents to diverse human preferences.
We introduce three distinct methods to infer human preferences by leveraging different types of interactions.
We evaluate the proposed method in personalized object-goal navigation and flee navigation tasks in ProcTHOR and RoboTHOR.
arXiv Detail & Related papers (2023-12-14T21:00:56Z)
- Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data [18.750834997334664]
We argue that humans are bounded rational and have different intelligence levels when reasoning about others' decision-making process.
We propose a new multi-agent Inverse Reinforcement Learning framework that reasons about humans' latent intelligence levels during learning.
arXiv Detail & Related papers (2021-03-07T07:48:31Z)
- Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
arXiv Detail & Related papers (2021-01-14T22:21:25Z)
- Learning to Anticipate Egocentric Actions by Imagination [60.21323541219304]
We study the egocentric action anticipation task, which predicts a future action seconds before it is performed in egocentric videos.
Our method significantly outperforms previous methods on both the seen test set and the unseen test set of the EPIC Kitchens Action Anticipation Challenge.
arXiv Detail & Related papers (2021-01-13T08:04:10Z)
- Pedestrian Behavior Prediction via Multitask Learning and Categorical Interaction Modeling [13.936894582450734]
We propose a multitask learning framework that simultaneously predicts trajectories and actions of pedestrians by relying on multimodal data.
We show that our model achieves state-of-the-art performance and improves trajectory and action prediction by up to 22% and 6% respectively.
arXiv Detail & Related papers (2020-12-06T15:57:11Z)
- Future Frame Prediction of a Video Sequence [5.660207256468971]
The ability to predict, anticipate and reason about future events is the essence of intelligence.
arXiv Detail & Related papers (2020-08-31T15:31:02Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
- Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding [46.52703817997932]
Multi-agent trajectory forecasting in autonomous driving requires an agent to accurately anticipate the behaviors of the surrounding vehicles and pedestrians.
We propose a model that synthesizes multiple input signals from the multimodal world.
We show a significant performance improvement over previous state-of-the-art methods.
arXiv Detail & Related papers (2020-03-06T13:59:39Z)
- Never Give Up: Learning Directed Exploration Strategies [63.19616370038824]
We propose a reinforcement learning agent to solve hard exploration games by learning a range of directed exploratory policies.
We construct an episodic memory-based intrinsic reward using k-nearest neighbors over the agent's recent experience to train the directed exploratory policies.
A self-supervised inverse dynamics model is used to train the embeddings of the nearest neighbour lookup, biasing the novelty signal towards what the agent can control.
arXiv Detail & Related papers (2020-02-14T13:57:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.