Meta-Inverse Reinforcement Learning for Mean Field Games via Probabilistic Context Variables
- URL: http://arxiv.org/abs/2509.03845v1
- Date: Thu, 04 Sep 2025 03:13:11 GMT
- Title: Meta-Inverse Reinforcement Learning for Mean Field Games via Probabilistic Context Variables
- Authors: Yang Chen, Xiao Lin, Bo Yan, Libo Zhang, Jiamou Liu, Neset Özkan Tan, Michael Witbrock
- Abstract summary: Inverse reinforcement learning offers a framework to infer reward functions from expert demonstrations. We propose a deep latent variable MFG model and an associated IRL method. Our method can infer rewards from different yet structurally similar tasks without prior knowledge about underlying contexts.
- Score: 27.845927777359723
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Designing suitable reward functions for numerous interacting intelligent agents is challenging in real-world applications. Inverse reinforcement learning (IRL) in mean field games (MFGs) offers a practical framework to infer reward functions from expert demonstrations. While promising, the assumption of agent homogeneity limits the capability of existing methods to handle demonstrations with heterogeneous and unknown objectives, which are common in practice. To this end, we propose a deep latent variable MFG model and an associated IRL method. Critically, our method can infer rewards from different yet structurally similar tasks without prior knowledge about underlying contexts or modifying the MFG model itself. Our experiments, conducted on simulated scenarios and a real-world spatial taxi-ride pricing problem, demonstrate the superiority of our approach over state-of-the-art IRL methods in MFGs.
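To make the latent-context construction concrete, here is a minimal sketch in PyTorch; the module names, layer sizes, and the diagonal-Gaussian posterior are illustrative assumptions, not the authors' architecture. A permutation-invariant encoder infers a probabilistic context variable z from expert transitions, and a single reward network conditions on z alongside the state, action, and mean field, so structurally similar tasks share weights while z carries the task identity.

```python
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Amortized posterior q(z | demonstrations): mean-pools per-transition
    features so the estimate is invariant to demonstration order."""
    def __init__(self, obs_dim, act_dim, z_dim, hidden=64):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU())
        self.mu_head = nn.Linear(hidden, z_dim)
        self.logvar_head = nn.Linear(hidden, z_dim)

    def forward(self, states, actions):            # (N, obs), (N, act)
        h = self.phi(torch.cat([states, actions], dim=-1)).mean(dim=0)
        return self.mu_head(h), self.logvar_head(h)

class ContextConditionedReward(nn.Module):
    """Reward r(s, a, mu, z): one network shared across tasks, with the
    task identity carried entirely by the latent context z."""
    def __init__(self, obs_dim, act_dim, mf_dim, z_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + mf_dim + z_dim, hidden),
            nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, s, a, mean_field, z):
        return self.net(torch.cat([s, a, mean_field, z], dim=-1))

# Usage: sample z with the reparameterization trick, then score transitions.
enc = ContextEncoder(obs_dim=4, act_dim=2, z_dim=8)
rew = ContextConditionedReward(obs_dim=4, act_dim=2, mf_dim=4, z_dim=8)
demo_s, demo_a = torch.randn(32, 4), torch.randn(32, 2)
mu, logvar = enc(demo_s, demo_a)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
r = rew(demo_s, demo_a, torch.randn(32, 4), z.expand(32, -1))
```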
Related papers
- Latent Wasserstein Adversarial Imitation Learning [110.12916356445908]
Imitation Learning (IL) enables agents to mimic expert behavior by learning from demonstrations. We propose Latent Wasserstein Adversarial Imitation Learning (LWAIL), a novel adversarial imitation learning framework. We show that our method outperforms prior Wasserstein-based IL methods and prior adversarial IL methods.
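A rough sketch of the Wasserstein-critic ingredient such methods build on, assuming PyTorch; LWAIL's learned latent space is omitted, and all shapes and hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

critic = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(critic.parameters(), lr=1e-4)

def critic_loss(expert, agent, lam=10.0):
    # Wasserstein objective: raise expert scores, lower agent scores,
    # with a gradient penalty keeping the critic roughly 1-Lipschitz.
    eps = torch.rand(expert.size(0), 1)
    mix = (eps * expert + (1 - eps) * agent).requires_grad_(True)
    grad = torch.autograd.grad(critic(mix).sum(), mix, create_graph=True)[0]
    penalty = ((grad.norm(dim=1) - 1.0) ** 2).mean()
    return critic(agent).mean() - critic(expert).mean() + lam * penalty

expert_batch, agent_batch = torch.randn(64, 8), torch.randn(64, 8)
loss = critic_loss(expert_batch, agent_batch)
opt.zero_grad(); loss.backward(); opt.step()
# The imitator's RL reward is then the critic's score on its own samples.
```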
arXiv Detail & Related papers (2026-03-05T18:01:49Z)
- Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization [52.74762030521324]
We propose a novel algorithm to learn reward functions from observed actions. We provide strong theoretical guarantees for the reliability and sample efficiency of our algorithm.
arXiv Detail & Related papers (2026-01-19T04:12:51Z)
- Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning [62.05713042908654]
We introduce Alignment from Demonstrations (AfD), a novel approach leveraging high-quality demonstration data to overcome these challenges. We formalize AfD within a sequential decision-making framework, highlighting its unique challenge of missing reward signals. Practically, we propose a computationally efficient algorithm that extrapolates over a tailored reward model for AfD.
arXiv Detail & Related papers (2024-05-24T15:13:53Z)
- Reparameterized Policy Learning for Multimodal Trajectory Optimization [61.13228961771765]
We investigate the challenge of parametrizing policies for reinforcement learning in high-dimensional continuous action spaces.
We propose a principled framework that models the continuous RL policy as a generative model of optimal trajectories.
We present a practical model-based RL method, which leverages the multimodal policy parameterization and learned world model.
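A minimal sketch of a latent-variable (reparameterized) policy of this flavor, with made-up shapes; the learned world model and the paper's training objective are omitted.

```python
import torch
import torch.nn as nn

class LatentPolicy(nn.Module):
    """Policy as a generative model: a latent z selects a behavior mode,
    and a decoder maps (s, z) to an action, so the marginal policy
    pi(a|s) can be multimodal even with a unimodal decoder."""
    def __init__(self, obs_dim, act_dim, z_dim=4, hidden=64):
        super().__init__()
        self.z_dim = z_dim
        self.decoder = nn.Sequential(
            nn.Linear(obs_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh())

    def forward(self, s):
        z = torch.randn(s.size(0), self.z_dim)   # sample a behavior mode
        return self.decoder(torch.cat([s, z], dim=-1))

pi = LatentPolicy(obs_dim=3, act_dim=2)
actions = pi(torch.randn(5, 3))                  # 5 possibly distinct modes
```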
arXiv Detail & Related papers (2023-07-20T09:05:46Z)
- Reward Learning using Structural Motifs in Inverse Reinforcement Learning [3.04585143845864]
The Inverse Reinforcement Learning (IRL) problem has seen rapid evolution in the past few years, with important applications in domains like robotics, cognition, and health.
We explore the inefficacy of current IRL methods in learning an agent's reward function from expert trajectories depicting long-horizon, complex sequential tasks.
We propose a novel IRL method, SMIRL, that first learns the (approximate) structure of a task as a finite-state-automaton (FSA), then uses the structural motif to solve the IRL problem.
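For illustration, a toy finite-state automaton of the kind such a method might recover; the states, event symbols, and task below are invented, not SMIRL's learned structure.

```python
class TaskFSA:
    def __init__(self, transitions, start, accept):
        self.transitions = transitions  # {(state, symbol): next_state}
        self.start, self.accept = start, accept

    def run(self, symbols):
        """Trace a trajectory's event symbols through the automaton,
        returning the visited states (the 'structural motif')."""
        state, visited = self.start, [self.start]
        for sym in symbols:
            state = self.transitions.get((state, sym), state)
            visited.append(state)
        return visited, state in self.accept

# Toy task: fetch a key ('k'), then open a door ('d').
fsa = TaskFSA({("s0", "k"): "s1", ("s1", "d"): "s2"}, "s0", {"s2"})
motif, done = fsa.run(["k", "d"])   # ['s0', 's1', 's2'], True
```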
arXiv Detail & Related papers (2022-09-25T18:34:59Z)
- Individual-Level Inverse Reinforcement Learning for Mean Field Games [16.79251229846642]
Mean Field IRL (MFIRL) is the first dedicated IRL framework for MFGs that can handle both cooperative and non-cooperative environments.
We develop a practical algorithm effective for MFGs with unknown dynamics.
arXiv Detail & Related papers (2022-02-13T20:35:01Z)
- Off-Dynamics Inverse Reinforcement Learning from Hetero-Domain [11.075036222901417]
We propose an approach for inverse reinforcement learning from hetero-domain which learns a reward function in the simulator, drawing on the demonstrations from the real world.
The intuition behind the method is that the reward function should not only be oriented to imitate the experts, but should encourage actions adjusted for the dynamics difference between the simulator and the real world.
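One common way to realize that intuition is a classifier-based log-ratio correction in the style of domain-classifier dynamics estimation; the sketch below is purely illustrative and may differ from this paper's concrete estimator.

```python
import numpy as np

def dynamics_gap_bonus(p_real_sas, p_real_sa):
    """Log-ratio reward correction from two domain classifiers:
    p_real_sas = p(real | s, a, s'), p_real_sa = p(real | s, a).
    Both would come from trained classifiers; scalars here for brevity."""
    logit = lambda p: np.log(p) - np.log(1.0 - p)
    # delta_r ~= log p_real(s' | s, a) - log p_sim(s' | s, a)
    return logit(p_real_sas) - logit(p_real_sa)

# A transition more plausible under real-world dynamics earns a bonus.
print(dynamics_gap_bonus(p_real_sas=0.8, p_real_sa=0.5))  # ~1.39 > 0
```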
arXiv Detail & Related papers (2021-10-21T19:23:15Z)
- Cross-domain Imitation from Observations [50.669343548588294]
Imitation learning seeks to circumvent the difficulty in designing proper reward functions for training agents by utilizing expert behavior.
In this paper, we study the problem of how to imitate tasks when there exist discrepancies between the expert and agent MDP.
We present a novel framework to learn correspondences across such domains.
arXiv Detail & Related papers (2021-05-20T21:08:25Z)
- Adversarial Inverse Reinforcement Learning for Mean Field Games [17.392418397388823]
Mean field games (MFGs) provide a mathematically tractable framework for modelling large-scale multi-agent systems.
This paper proposes a novel framework, Mean-Field Adversarial IRL (MF-AIRL), which is capable of tackling uncertainties in demonstrations.
arXiv Detail & Related papers (2021-04-29T21:03:49Z)
- Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments [137.86426963572214]
Inverse Reinforcement Learning can extrapolate reward functions from expert demonstrations.
We show that our approach, DE-AIRL, is demonstration-efficient and still able to extrapolate reward functions which generalize to the fully procedural domain.
arXiv Detail & Related papers (2020-12-04T11:18:02Z)
- Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning [12.76337275628074]
In this work, we propose a variational dynamic model based on the conditional variational inference to model the multimodality and stochasticity.
We derive an upper bound of the negative log-likelihood of the environmental transition and use such an upper bound as the intrinsic reward for exploration.
Our method outperforms several state-of-the-art environment model-based exploration approaches.
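A sketch of such an ELBO-style bonus for a conditional-VAE dynamics model, assuming PyTorch; the shapes are placeholders and the Gaussian likelihood is only implicit in the squared-error term.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative shapes: 3-dim state, 1-dim action, 4-dim latent.
enc = nn.Linear(3 + 1 + 3, 2 * 4)  # q(z | s, a, s') -> mean, logvar
dec = nn.Linear(3 + 1 + 4, 3)      # p(s' | s, a, z) -> reconstructed s'

def intrinsic_reward(s, a, s_next):
    # Negative ELBO = reconstruction error + KL(q || N(0, I)); it upper-
    # bounds -log p(s'|s,a) (up to constants), so transitions the model
    # cannot explain receive a large exploration bonus.
    mu, logvar = enc(torch.cat([s, a, s_next], -1)).chunk(2, -1)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
    recon = F.mse_loss(dec(torch.cat([s, a, z], -1)), s_next,
                       reduction="none").sum(-1)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1)
    return (recon + kl).detach()

r_int = intrinsic_reward(torch.randn(8, 3), torch.randn(8, 1), torch.randn(8, 3))
```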
arXiv Detail & Related papers (2020-10-17T09:54:51Z)
- Reinforcement Learning for Mean Field Games with Strategic Complementarities [10.281006908092932]
We introduce a natural refinement to the equilibrium concept that we call Trembling-Hand-Perfect MFE (T-MFE).
We propose a simple algorithm for computing T-MFE under a known model.
We also introduce a model-free and a model-based approach to learning T-MFE and provide sample complexities of both algorithms.
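As a toy illustration of the trembling-hand idea (not the paper's algorithm), a damped fixed-point iteration for a one-shot game with strategic complementarities; the payoff, tremble model, and damping are invented for this sketch.

```python
import numpy as np

def payoff(action_idx, mean_field):
    # Strategic complementarity: an action pays more the more mass plays it.
    return mean_field[action_idx]

def t_mfe(n_actions=3, epsilon=0.1, iters=200):
    mf = np.ones(n_actions) / n_actions          # population action distribution
    for _ in range(iters):
        values = np.array([payoff(a, mf) for a in range(n_actions)])
        best = np.eye(n_actions)[values.argmax()]
        # Trembling hand: best response mixed with uniform noise.
        policy = (1 - epsilon) * best + epsilon / n_actions
        mf = 0.9 * mf + 0.1 * policy             # damped mean-field update
    return mf

print(t_mfe())  # mass concentrates on one action, up to the tremble
```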
arXiv Detail & Related papers (2020-06-21T00:31:48Z)