DIML: Differentiable Inverse Mechanism Learning from Behaviors of Multi-Agent Learning Trajectories
- URL: http://arxiv.org/abs/2601.17678v1
- Date: Sun, 25 Jan 2026 03:49:25 GMT
- Title: DIML: Differentiable Inverse Mechanism Learning from Behaviors of Multi-Agent Learning Trajectories
- Authors: Zhiyu An, Wan Du
- Abstract summary: We study inverse mechanism learning: recovering an unknown incentive-generating mechanism from observed strategic interaction traces. Unlike inverse game theory and multi-agent inverse reinforcement learning, our target includes unstructured mechanisms. We propose DIML, a likelihood-based framework that differentiates through a model of multi-agent learning dynamics.
- Score: 7.764532811300023
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study inverse mechanism learning: recovering an unknown incentive-generating mechanism from observed strategic interaction traces of self-interested learning agents. Unlike inverse game theory and multi-agent inverse reinforcement learning, which typically infer utility/reward parameters inside a structured mechanism, our target includes unstructured mechanisms -- (possibly neural) mappings from joint actions to per-agent payoffs. Unlike differentiable mechanism design, which optimizes mechanisms forward, we infer mechanisms from behavior in an observational setting. We propose DIML, a likelihood-based framework that differentiates through a model of multi-agent learning dynamics and uses the candidate mechanism to generate the counterfactual payoffs needed to predict observed actions. We establish identifiability of payoff differences under a conditional logit response model and prove statistical consistency of maximum likelihood estimation under standard regularity conditions. We evaluate DIML with simulated interactions of learning agents across unstructured neural mechanisms, congestion tolling, public goods subsidies, and large-scale anonymous games. DIML reliably recovers identifiable incentive differences and supports counterfactual prediction: its performance rivals a tabular enumeration oracle in small environments, and its convergence scales to large, hundred-participant environments. Code to reproduce our experiments is open-sourced.
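The identifiability claim in the abstract (only payoff differences are recoverable under a conditional logit response model) can be illustrated with a minimal sketch. This is not the authors' DIML implementation, which differentiates through a model of learning dynamics; it only shows a static logit likelihood for a single agent. The function names, the `temperature` parameter, and the payoff values are all assumptions made for illustration.

```python
import math

def logit_choice_probs(payoffs, temperature=1.0):
    """Conditional logit (softmax) probabilities over one agent's actions."""
    scaled = [u / temperature for u in payoffs]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(observed_actions, payoffs, temperature=1.0):
    """Log-likelihood of observed action indices under the logit model."""
    probs = logit_choice_probs(payoffs, temperature)
    return sum(math.log(probs[a]) for a in observed_actions)

# Hypothetical counterfactual payoffs for one agent's three actions.
payoffs = [1.0, 2.5, 0.3]
# Same pairwise differences, different absolute level.
shifted = [u + 10.0 for u in payoffs]

# Hypothetical observed action indices.
observed = [1, 1, 0, 2, 1]
ll_original = log_likelihood(observed, payoffs)
ll_shifted = log_likelihood(observed, shifted)

# The two likelihoods agree up to floating-point error, so no amount of
# data can distinguish the two payoff vectors.
print(abs(ll_original - ll_shifted) < 1e-9)
```

Because adding a constant to every action's payoff leaves the choice probabilities unchanged, a likelihood-based estimator can at best recover payoff differences between actions, which is consistent with the scope of the identifiability result stated in the abstract.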
Related papers
- Emergence from Emergence: Financial Market Simulation via Learning with Heterogeneous Preferences [3.722808691920657]
We develop a multi-agent reinforcement learning framework in which agents endowed with heterogeneous risk aversion, time discounting, and information access collectively learn trading strategies. The experiment reveals that (i) learning with heterogeneous preferences drives agents to develop strategies aligned with their individual traits, fostering behavioral differentiation and niche specialization within the market, and (ii) the interactions by the differentiated agents are essential for the emergence of realistic market dynamics.
arXiv Detail & Related papers (2025-11-07T12:54:27Z)
- Social World Model-Augmented Mechanism Design Policy Learning [58.739456918502704]
We introduce SWM-AP (Social World Model-Augmented Mechanism Design Policy Learning), which learns a social world model hierarchically to enhance mechanism design. We show that SWM-AP outperforms established model-based and model-free RL baselines in cumulative rewards and sample efficiency.
arXiv Detail & Related papers (2025-10-22T06:01:21Z)
- Beyond Benchmarks: Understanding Mixture-of-Experts Models through Internal Mechanisms [55.1784306456972]
Mixture-of-Experts (MoE) architectures have emerged as a promising direction, offering efficiency and scalability by activating only a subset of parameters during inference. We use an internal metric to investigate the mechanisms of MoE architecture by explicitly incorporating routing mechanisms and analyzing expert-level behaviors. We uncover several findings: (1) neuron utilization decreases as models evolve, reflecting stronger generalization; (2) training exhibits a dynamic trajectory, where benchmark performance alone provides limited signal; (3) task completion emerges from collaborative contributions of multiple experts, with shared experts driving concentration; and (4) activation patterns at the neuron level provide a fine-grained proxy for data diversity.
arXiv Detail & Related papers (2025-09-28T15:13:38Z)
- Large Language Models for Multi-Facility Location Mechanism Design [16.88708405619343]
Deep learning models have been proposed as alternatives to strategyproof mechanisms for multi-facility location. We introduce a novel approach, named LLMMech, that addresses these limitations by incorporating large language models into an evolutionary framework. Our experimental results, evaluated on various problem settings, demonstrate that the LLM-generated mechanisms generally outperform existing handcrafted baselines and deep learning models.
arXiv Detail & Related papers (2025-03-12T16:49:56Z)
- Learning Neural Strategy-Proof Matching Mechanism from Examples [24.15688619889342]
We propose a new family of matching mechanisms that always satisfy strategy-proofness, are applicable for an arbitrary number of agents, and deal with public contextual information of agents. We conducted experiments to learn a matching mechanism from matching examples while satisfying strategy-proofness. We demonstrated that our method outperformed baselines in predicting matchings and on several metrics for goodness of matching outcomes.
arXiv Detail & Related papers (2024-10-25T08:34:25Z)
- Compete and Compose: Learning Independent Mechanisms for Modular World Models [57.94106862271727]
We present COMET, a modular world model which leverages reusable, independent mechanisms across different environments.
COMET is trained on multiple environments with varying dynamics via a two-step process: competition and composition.
We show that COMET is able to adapt to new environments with varying numbers of objects with improved sample efficiency compared to more conventional finetuning approaches.
arXiv Detail & Related papers (2024-04-23T15:03:37Z)
- Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z)
- Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning [79.4957965474334]
A key goal of unsupervised representation learning is "inverting" a data generating process to recover its latent properties.
This paper asks, "Can we instead identify latent properties by leveraging knowledge of the mechanisms that govern their evolution?"
We provide a complete characterization of the sources of non-identifiability as we vary knowledge about a set of possible mechanisms.
arXiv Detail & Related papers (2021-10-29T14:04:08Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
- Learning Robust Models Using The Principle of Independent Causal Mechanisms [26.79262903241044]
We propose a new gradient-based learning framework whose objective function is derived from the ICM principle.
We show theoretically and experimentally that neural networks trained in this framework focus on relations remaining invariant across environments.
arXiv Detail & Related papers (2020-10-14T15:38:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.