Learning Macroeconomic Policies through Dynamic Stackelberg Mean-Field Games
- URL: http://arxiv.org/abs/2403.12093v4
- Date: Sun, 01 Jun 2025 09:18:46 GMT
- Title: Learning Macroeconomic Policies through Dynamic Stackelberg Mean-Field Games
- Authors: Qirui Mi, Zhiyu Zhao, Chengdong Ma, Siyu Xia, Yan Song, Mengyue Yang, Jun Wang, Haifeng Zhang
- Abstract summary: We formulate a dynamic Stackelberg game: the government (leader) sets policies, and agents (followers) respond by optimizing their behavior over time. As the number of agents increases, explicitly simulating all agent-agent and agent-government interactions becomes computationally infeasible. We propose the Dynamic Stackelberg Mean Field Game framework, which approximates these complex interactions via agent-population and government-population couplings.
- Score: 14.341143540616441
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Macroeconomic outcomes emerge from individuals' decisions, making it essential to model how agents interact with macro policy via consumption, investment, and labor choices. We formulate this as a dynamic Stackelberg game: the government (leader) sets policies, and agents (followers) respond by optimizing their behavior over time. Unlike static models, this dynamic formulation captures temporal dependencies and strategic feedback critical to policy design. However, as the number of agents increases, explicitly simulating all agent-agent and agent-government interactions becomes computationally infeasible. To address this, we propose the Dynamic Stackelberg Mean Field Game (DSMFG) framework, which approximates these complex interactions via agent-population and government-population couplings. This approximation preserves individual-level feedback while ensuring scalability, enabling DSMFG to jointly model three core features of real-world policymaking: dynamic feedback, asymmetry, and large scale. We further introduce Stackelberg Mean Field Reinforcement Learning (SMFRL), a data-driven algorithm that learns the leader's optimal policies while maintaining personalized responses for individual agents. Empirically, we validate our approach in a large-scale simulated economy, where it scales to 1,000 agents (vs. 100 in prior work) and achieves a fourfold increase in GDP over classical economic methods and a nineteenfold improvement over the static 2022 U.S. federal income tax policy.
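To make the leader-follower loop concrete, here is a minimal, self-contained sketch of a Stackelberg mean-field training loop. It is NOT the authors' SMFRL algorithm: the wealth dynamics, the followers' best-response rule, the tax-revenue objective, and every parameter value are illustrative stand-ins; only the structure (leader policy update wrapped around follower responses to a population statistic) mirrors the abstract.

```python
# Schematic Stackelberg mean-field loop (illustrative; not the paper's SMFRL).
import numpy as np

rng = np.random.default_rng(0)
N, T = 1000, 20  # follower agents, episode horizon

def rollout(tax):
    """Simulate one episode; return the leader's toy objective (tax revenue).

    Followers react to the leader's policy and to a population statistic
    (mean wealth), not to every other agent individually -- the mean-field
    coupling that keeps the simulation tractable at N = 1000.
    """
    wealth = rng.normal(1.0, 0.2, N)
    revenue = 0.0
    for _ in range(T):
        mean_wealth = wealth.mean()                        # mean-field statistic
        labor = np.clip(1.0 - tax + 0.1 * (mean_wealth - wealth), 0.0, 1.0)
        wealth += labor * (1.0 - tax)                      # after-tax income
        revenue += tax * labor.sum()                       # leader's receipts
    return revenue

tax, lr, eps = 0.1, 2e-6, 0.01
for _ in range(100):
    # Leader update: finite-difference gradient ascent on its objective,
    # with followers re-solving their responses inside every rollout.
    grad = (rollout(tax + eps) - rollout(tax - eps)) / (2.0 * eps)
    tax = float(np.clip(tax + lr * grad, 0.0, 1.0))
print(f"learned tax rate: {tax:.3f}")  # approaches the toy optimum of 0.5
```

In the paper's setting the leader optimizes richer social objectives (e.g., GDP and welfare) and the followers learn personalized policies; this skeleton only shows how the two levels nest.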
Related papers
- Action Dependency Graphs for Globally Optimal Coordinated Reinforcement Learning [0.0]
Action-dependent individual policies have emerged as a promising paradigm for achieving global optimality in multi-agent reinforcement learning. In this work, we consider a more general class of action-dependent policies, which do not necessarily follow the auto-regressive form. Within the context of MARL problems structured by coordination graphs, we prove that an action-dependent policy with a sparse ADG can achieve global optimality.
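For reference, a graph-structured action-dependent policy is commonly factorized as follows (generic notation; the paper's exact definitions may differ):

```latex
\[
\pi(a \mid s) \;=\; \prod_{i=1}^{n} \pi_i\!\big(a_i \mid s,\, a_{\mathrm{pa}(i)}\big),
\]
```

where pa(i) denotes agent i's parents in the action dependency graph (ADG); the fully auto-regressive form is the special case pa(i) = {1, ..., i-1}, and a sparse ADG keeps only a few parents per agent.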
arXiv Detail & Related papers (2025-06-01T02:58:20Z) - AgentRM: Enhancing Agent Generalization with Reward Modeling [78.52623118224385]
We find that finetuning a reward model to guide the policy model is more robust than directly finetuning the policy model. We propose AgentRM, a generalizable reward model, to guide the policy model for effective test-time search.
arXiv Detail & Related papers (2025-02-25T17:58:02Z) - STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models [8.60556939977361]
We develop a benchmark for evaluating large language models (LLMs) on microeconomic reasoning.
We focus on the logic of supply and demand, each grounded in up to 10 domains, 5 perspectives, and 3 types.
We demonstrate the usefulness of our benchmark via a case study on 27 LLMs, ranging from small open-source models to the current state of the art.
arXiv Detail & Related papers (2025-02-18T18:42:09Z) - A Multi-agent Market Model Can Explain the Impact of AI Traders in Financial Markets -- A New Microfoundations of GARCH model [3.655221783356311]
We propose a multi-agent market model to derive the microfoundations of the GARCH model, incorporating three types of agents: noise traders, fundamental traders, and AI traders.
We validate this model through multi-agent simulations, confirming its ability to reproduce the stylized facts of financial markets.
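Below is a deliberately minimal skeleton of such a three-trader-type market, hedged as an assumed structure for illustration, NOT the paper's calibrated model: aggregated order flow from the three agent types moves the log-price through a linear impact function.

```python
# Toy three-trader-type market (illustrative; not the paper's model).
import numpy as np

rng = np.random.default_rng(1)
T = 5000
p, fundamental = 0.0, 0.0        # log-price and log-fundamental value
returns = np.zeros(T)

for t in range(T):
    fundamental += rng.normal(0.0, 0.005)          # fundamental random walk
    noise = rng.normal(0.0, 1.0)                   # noise trader: random orders
    fund = 5.0 * (fundamental - p)                 # fundamentalist: mean reversion
    trend = 50.0 * (returns[t - 1] if t else 0.0)  # trend/"AI" trader: chases returns
    r = 0.01 * (noise + fund + trend)              # linear price impact
    p += r
    returns[t] = r

# Diagnostic for volatility clustering: lag-1 autocorrelation of |returns|.
# Reproducing the full set of stylized facts requires the paper's interactions.
absr = np.abs(returns - returns.mean())
print(f"lag-1 autocorr of |returns|: {np.corrcoef(absr[:-1], absr[1:])[0, 1]:.3f}")
```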
arXiv Detail & Related papers (2024-09-19T07:14:13Z) - Evaluating Real-World Robot Manipulation Policies in Simulation [91.55267186958892]
Control and visual disparities between real and simulated environments are key challenges for reliable simulated evaluation.
We propose approaches for mitigating these gaps without needing to craft full-fidelity digital twins of real-world environments.
We create SIMPLER, a collection of simulated environments for manipulation policy evaluation on common real robot setups.
arXiv Detail & Related papers (2024-05-09T17:30:16Z) - Simulating the Economic Impact of Rationality through Reinforcement Learning and Agent-Based Modelling [1.7546137756031712]
We leverage multi-agent reinforcement learning (RL) to expand the capabilities of agent-based models (ABMs).
We show that RL agents spontaneously learn three distinct strategies for maximising profits, with the optimal strategy depending on the level of market competition and rationality.
We also find that RL agents with independent policies, and without the ability to communicate with each other, spontaneously learn to segregate into different strategic groups, thus increasing market power and overall profits.
arXiv Detail & Related papers (2024-05-03T15:08:25Z) - Blending Data-Driven Priors in Dynamic Games [9.085463548798366]
We formulate a non-cooperative dynamic game with Kullback-Leibler (KL) regularization, termed KLGame.
We propose an efficient algorithm for computing multi-modal approximate feedback Nash equilibrium strategies of KLGame in real time.
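In generic notation (our reading of the summary, not necessarily the paper's exact objective), a KL-regularized dynamic game has each player i solve:

```latex
\[
\min_{\pi_i}\;
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} c_i(x_t, u_t)\right]
+ \lambda \sum_{t=0}^{T}
D_{\mathrm{KL}}\!\left(\pi_i(\cdot \mid x_t) \,\middle\|\, \tilde{\pi}_i(\cdot \mid x_t)\right),
\]
```

where the reference policy \tilde{\pi}_i encodes the data-driven prior and λ blends the game-theoretic cost against staying close to that prior.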
arXiv Detail & Related papers (2024-02-21T23:22:32Z) - Structured Dynamic Pricing: Optimal Regret in a Global Shrinkage Model [50.06663781566795]
We consider a dynamic model with the consumers' preferences as well as price sensitivity varying over time.
We measure the performance of a dynamic pricing policy via regret, which is the expected revenue loss compared to a clairvoyant that knows the sequence of model parameters in advance.
Our regret analysis results not only demonstrate optimality of the proposed policy but also show that for policy planning it is essential to incorporate available structural information.
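For reference, the regret criterion described here is conventionally written (in generic notation) as:

```latex
\[
\mathrm{Regret}(T) \;=\; \sum_{t=1}^{T}
\mathbb{E}\!\left[ r_t\big(p_t^{*}\big) - r_t\big(p_t\big) \right],
\]
```

where p_t is the policy's posted price and p_t* is the price a clairvoyant who knows the time-varying model parameters in advance would set.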
arXiv Detail & Related papers (2023-03-28T00:23:23Z) - Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning [151.03738099494765]
We study a heterogeneous agent macroeconomic model with an infinite number of households and firms competing in a labor market.
We propose a data-driven reinforcement learning framework that finds the regularized competitive equilibrium of the model.
arXiv Detail & Related papers (2023-02-24T17:16:27Z) - Towards a more efficient computation of individual attribute and policy contribution for post-hoc explanation of cooperative multi-agent systems using Myerson values [0.0]
A quantitative assessment of the global importance of an agent in a team is as valuable as gold for strategists, decision-makers, and sports coaches.
We propose a method to determine a Hierarchical Knowledge Graph of agents' policies and features in a Multi-Agent System.
We test the proposed approach in a proof-of-case environment deploying both hardcoded policies and policies obtained via Deep Reinforcement Learning.
arXiv Detail & Related papers (2022-12-06T15:15:00Z) - Latent State Marginalization as a Low-cost Approach for Improving Exploration [79.12247903178934]
We propose the adoption of latent variable policies within the MaxEnt framework.
We show that latent variable policies naturally emerge under the use of world models with a latent belief state.
We experimentally validate our method on continuous control tasks, showing that effective marginalization can lead to better exploration and more robust training.
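In standard MaxEnt-RL notation (not necessarily the paper's exact formulation), a latent variable policy and the entropy-regularized objective read:

```latex
\[
\pi(a \mid s) = \int \pi(a \mid s, z)\, p(z \mid s)\, dz,
\qquad
J(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t} r(s_t, a_t)
+ \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right];
\]
```

evaluating the entropy term requires marginalizing over the latent z, which is the step whose cost the paper targets.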
arXiv Detail & Related papers (2022-10-03T15:09:12Z) - Weak Supervision in Analysis of News: Application to Economic Policy Uncertainty [0.0]
Our work focuses on studying the potential of textual data, in particular news pieces, for measuring economic policy uncertainty (EPU).
Economic policy uncertainty is defined as the public's inability to predict the outcomes of their decisions under new policies and future economic fundamentals.
Our work proposes a machine learning based solution involving weak supervision to classify news articles with regards to economic policy uncertainty.
arXiv Detail & Related papers (2022-08-10T09:08:29Z) - Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning [72.23843557783533]
We show that deep reinforcement learning can discover stable solutions that are epsilon-Nash equilibria for a meta-game over agent types.
Our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing.
We demonstrate our approach in real-business-cycle models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government who taxes and redistributes.
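The epsilon-Nash condition referenced here is, in its standard generic form:

```latex
\[
J_i\big(\pi_i^{*}, \pi_{-i}^{*}\big) \;\ge\;
\sup_{\pi_i} J_i\big(\pi_i, \pi_{-i}^{*}\big) - \epsilon
\qquad \text{for all agents } i,
\]
```

i.e., no agent can improve its payoff by more than epsilon through a unilateral deviation.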
arXiv Detail & Related papers (2022-01-03T17:00:17Z) - Building a Foundation for Data-Driven, Interpretable, and Robust Policy Design using the AI Economist [67.08543240320756]
We show that the AI Economist framework enables effective, flexible, and interpretable policy design using two-level reinforcement learning and data-driven simulations.
We find that log-linear policies trained using RL significantly improve social welfare, based on both public health and economic outcomes, compared to past outcomes.
arXiv Detail & Related papers (2021-08-06T01:30:41Z) - The AI Economist: Optimal Economic Policy Design via Two-level Deep Reinforcement Learning [126.37520136341094]
We show that machine-learning-based economic simulation is a powerful policy and mechanism design framework.
The AI Economist is a two-level, deep RL framework that trains both agents and a social planner who co-adapt.
In simple one-step economies, the AI Economist recovers the optimal tax policy of economic theory.
arXiv Detail & Related papers (2021-08-05T17:42:35Z) - ERMAS: Becoming Robust to Reward Function Sim-to-Real Gaps in Multi-Agent Simulations [110.72725220033983]
Epsilon-Robust Multi-Agent Simulation (ERMAS) is a framework for learning AI policies that are robust to such multiagent sim-to-real gaps.
In particular, ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15% in complex spatiotemporal simulations.
arXiv Detail & Related papers (2021-06-10T04:32:20Z) - MPC-based Reinforcement Learning for Economic Problems with Application to Battery Storage [0.0]
We focus on policy approximations based on Model Predictive Control (MPC).
We observe that the policy gradient method can struggle to produce meaningful steps in the policy parameters when the policy has a (nearly) bang-bang structure.
We propose a homotopy strategy based on the interior-point method, providing a relaxation of the policy during the learning.
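One standard way to write such an interior-point (log-barrier) relaxation of a bang-bang input (generic notation; the paper's construction may differ) is:

```latex
\[
u_{\tau}(s) = \arg\min_{u \in (u_{\min},\, u_{\max})}\; \ell(s, u)
- \tau \left[ \log(u - u_{\min}) + \log(u_{\max} - u) \right],
\]
```

where annealing the barrier weight τ toward zero along the homotopy recovers the bang-bang solution, while τ > 0 keeps the policy smooth enough for policy-gradient steps to be informative.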
arXiv Detail & Related papers (2021-04-06T10:37:14Z) - Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time [109.06623773924737]
We study the policy gradient method for the linear-quadratic mean-field control and game.
We show that it converges to the optimal solution at a linear rate, which is verified by a synthetic simulation.
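In generic continuous-time notation (not necessarily the paper's exact model), linear-quadratic mean-field control couples the state to its own mean, and the optimizer is linear feedback in both:

```latex
\[
dx_t = \big(A x_t + \bar{A}\, \mathbb{E}[x_t] + B u_t\big)\, dt + \sigma\, dW_t,
\qquad
u_t^{*} = -K x_t - \bar{K}\, \mathbb{E}[x_t],
\]
```

with a cost quadratic in the state, its mean, and the control; policy gradient then optimizes directly over the feedback gains (K, K̄), which is the parameterization whose linear convergence rate such results establish.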
arXiv Detail & Related papers (2020-08-16T06:34:11Z)