Multi-Agent Inverse Reinforcement Learning: Suboptimal Demonstrations
and Alternative Solution Concepts
- URL: http://arxiv.org/abs/2109.01178v1
- Date: Thu, 2 Sep 2021 19:15:29 GMT
- Title: Multi-Agent Inverse Reinforcement Learning: Suboptimal Demonstrations
and Alternative Solution Concepts
- Authors: Sage Bergerson
- Abstract summary: Multi-agent inverse reinforcement learning can be used to learn reward functions from agents in social environments.
To model realistic social dynamics, MIRL methods must account for suboptimal human reasoning and behavior.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent inverse reinforcement learning (MIRL) can be used to learn reward
functions from agents in social environments. To model realistic social
dynamics, MIRL methods must account for suboptimal human reasoning and
behavior. Traditional formalisms of game theory provide computationally
tractable behavioral models, but assume agents have unrealistic cognitive
capabilities. This research identifies and compares mechanisms in MIRL methods
which a) handle noise, biases and heuristics in agent decision making and b)
model realistic equilibrium solution concepts. MIRL research is systematically
reviewed to identify solutions for these challenges. The methods and results of
these studies are analyzed and compared based on factors including performance
accuracy, efficiency, and descriptive quality. We found that the primary
methods for handling noise, biases and heuristics in MIRL were extensions of
Maximum Entropy (MaxEnt) IRL to multi-agent settings. We also found that many
successful solution concepts are generalizations of the traditional Nash
Equilibrium (NE). These solutions include the correlated equilibrium, logistic
stochastic best response equilibrium and entropy regularized mean field NE.
Methods which use recursive reasoning or updating also perform well, including
the feedback NE and archive multi-agent adversarial IRL. Success in modeling
specific biases and heuristics in single-agent IRL and promising results using
a Theory of Mind approach in MIRL imply that modeling specific biases and
heuristics may be useful. Flexibility and unbiased inference in the identified
alternative solution concepts suggest that a solution concept which has both
recursive and generalized characteristics may perform well at modeling
realistic social interactions.
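For readers unfamiliar with the MaxEnt family referenced above, the block below restates the standard single-agent MaxEnt IRL trajectory model that the surveyed multi-agent extensions build on. The notation is conventional (R_theta is the parameterized reward of a trajectory, Z is the partition function); it is given only as an illustrative restatement, not as a formulation taken from the reviewed papers.

```latex
% Standard single-agent MaxEnt IRL model (illustrative restatement, not from the
% surveyed papers): demonstrated trajectories are assumed to be exponentially
% more likely the higher their cumulative reward under parameters \theta.
P(\tau \mid \theta) = \frac{\exp\big(R_\theta(\tau)\big)}{Z(\theta)},
\qquad
Z(\theta) = \sum_{\tau'} \exp\big(R_\theta(\tau')\big)
```

The logistic stochastic best response equilibrium mentioned above relaxes the exact best responses assumed by a Nash equilibrium into softmax (quantal) responses. A minimal, self-contained sketch of that idea for one agent in a two-action normal-form game follows; the payoff matrix, opponent strategy, and rationality parameter beta are illustrative assumptions rather than values or code from the paper.

```python
# Minimal sketch of a logistic (softmax) stochastic best response for one agent.
# beta -> infinity recovers an exact best response (as in a Nash equilibrium);
# small beta yields noisier, boundedly rational play. All numbers below are
# illustrative assumptions, not taken from the surveyed papers.
import numpy as np

def logistic_best_response(payoff_matrix: np.ndarray,
                           opponent_strategy: np.ndarray,
                           beta: float = 1.0) -> np.ndarray:
    """Softmax distribution over this agent's actions.

    payoff_matrix[i, j]  -- payoff to this agent for action i against opponent action j
    opponent_strategy[j] -- probability that the opponent plays action j
    beta                 -- rationality (inverse temperature) parameter (assumed)
    """
    expected_payoffs = payoff_matrix @ opponent_strategy  # E[u(a_i)] per action
    logits = beta * expected_payoffs
    logits -= logits.max()                                # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Example: a simple coordination game; with beta = 2 the agent usually, but not
# always, plays the action that best answers the opponent's mixed strategy.
payoffs = np.array([[2.0, 0.0],
                    [0.0, 1.0]])
opponent = np.array([0.7, 0.3])
print(logistic_best_response(payoffs, opponent, beta=2.0))
```

Roughly speaking, a profile in which every agent's strategy is the logistic best response to the others' is a logistic stochastic best response equilibrium; the same softmax relaxation, applied over trajectories rather than actions, is what connects these equilibria to the MaxEnt IRL model sketched above.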
Related papers
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- A Single Online Agent Can Efficiently Learn Mean Field Games [16.00164239349632]
Mean field games (MFGs) are a promising framework for modeling the behavior of large-population systems.
This paper introduces a novel online single-agent model-free learning scheme, which enables a single agent to learn MFNE using online samples.
arXiv Detail & Related papers (2024-05-05T16:38:04Z)
- Variational Inference of Parameters in Opinion Dynamics Models [9.51311391391997]
This work uses variational inference to estimate the parameters of an opinion dynamics ABM.
We transform the inference process into an optimization problem suitable for automatic differentiation.
Our approach estimates both macroscopic parameters (bounded confidence intervals and backfire thresholds) and microscopic parameters ($200$ categorical, agent-level roles) more accurately than simulation-based and MCMC methods.
arXiv Detail & Related papers (2024-03-08T14:45:18Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Learning and Calibrating Heterogeneous Bounded Rational Market Behaviour with Multi-Agent Reinforcement Learning [4.40301653518681]
Agent-based models (ABMs) have shown promise for modelling various real world phenomena incompatible with traditional equilibrium analysis.
Recent developments in multi-agent reinforcement learning (MARL) offer a way to address this issue from a rationality perspective.
We propose a novel technique for representing heterogeneous processing-constrained agents within a MARL framework.
arXiv Detail & Related papers (2024-02-01T17:21:45Z)
- Human Trajectory Forecasting with Explainable Behavioral Uncertainty [63.62824628085961]
Human trajectory forecasting helps to understand and predict human behaviors, enabling applications from social robots to self-driving cars.
Model-free methods offer superior prediction accuracy but lack explainability, while model-based methods provide explainability but cannot predict well.
We show that BNSP-SFM achieves up to a 50% improvement in prediction accuracy, compared with 11 state-of-the-art methods.
arXiv Detail & Related papers (2023-07-04T16:45:21Z)
- Concept Learning for Interpretable Multi-Agent Reinforcement Learning [5.179808182296037]
We introduce a method for incorporating interpretable concepts from a domain expert into models trained through multi-agent reinforcement learning.
This allows an expert to both reason about the resulting concept policy models in terms of these high-level concepts at run-time, as well as intervene and correct mispredictions to improve performance.
We show that this yields improved interpretability and training stability, with benefits to policy performance and sample efficiency in a simulated and real-world cooperative-competitive multi-agent game.
arXiv Detail & Related papers (2023-02-23T18:53:09Z)
- Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- On the model-based stochastic value gradient for continuous reinforcement learning [50.085645237597056]
We show that simple model-based agents can outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward.
Our findings suggest that model-based policy evaluation deserves closer attention.
arXiv Detail & Related papers (2020-08-28T17:58:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed papers and summaries) and is not responsible for any consequences arising from its use.