Building a Foundation for Data-Driven, Interpretable, and Robust Policy
  Design using the AI Economist
- URL: http://arxiv.org/abs/2108.02904v1
- Date: Fri, 6 Aug 2021 01:30:41 GMT
- Title: Building a Foundation for Data-Driven, Interpretable, and Robust Policy
  Design using the AI Economist
- Authors: Alexander Trott, Sunil Srinivasa, Douwe van der Wal, Sebastien
  Haneuse, Stephan Zheng
- Abstract summary: We show that the AI Economist framework enables effective, flexible, and interpretable policy design using two-level reinforcement learning and data-driven simulations.
We find that log-linear policies trained using RL significantly improve social welfare, based on both public health and economic outcomes, compared to past outcomes.
- Score: 67.08543240320756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimizing economic and public policy is critical to address socioeconomic
issues and trade-offs, e.g., improving equality, productivity, or wellness, and
poses a complex mechanism design problem. A policy designer needs to consider
multiple objectives, policy levers, and behavioral responses from strategic
actors who optimize for their individual objectives. Moreover, real-world
policies should be explainable and robust to simulation-to-reality gaps, e.g.,
due to calibration issues. Existing approaches are often limited to a narrow
set of policy levers or objectives that are hard to measure, do not yield
explicit optimal policies, or do not consider strategic behavior, for example.
Hence, it remains challenging to optimize policy in real-world scenarios. Here
we show that the AI Economist framework enables effective, flexible, and
interpretable policy design using two-level reinforcement learning (RL) and
data-driven simulations. We validate our framework on optimizing the stringency
of US state policies and Federal subsidies during a pandemic, e.g., COVID-19,
using a simulation fitted to real data. We find that log-linear policies
trained using RL significantly improve social welfare, based on both public
health and economic outcomes, compared to past outcomes. Their behavior can be
explained, e.g., well-performing policies respond strongly to changes in
recovery and vaccination rates. They are also robust to calibration errors,
e.g., infection rates that are over- or underestimated. As of yet, real-world
policymaking has not seen adoption of machine learning methods at large,
including RL and AI-driven simulations. Our results show the potential of AI to
guide policy design and improve social welfare amidst the complexity of the
real world.
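
The abstract refers to "log-linear policies" without giving their parameterization here. A minimal sketch, assuming the common reading in which the probability of each discrete action (for example, a stringency level) is proportional to the exponential of a linear function of the observed features. The function name, feature values, and weights below are hypothetical and are not taken from the paper.

```python
import math

def log_linear_policy(features, weights, biases):
    """Return action probabilities pi(a | x) proportional to exp(w_a . x + b_a).

    Assumed form only: one weight vector and one bias per discrete action.
    """
    logits = [
        sum(w * x for w, x in zip(w_a, features)) + b_a
        for w_a, b_a in zip(weights, biases)
    ]
    # Subtract the maximum logit before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical inputs: two features and three candidate actions.
features = [0.2, 0.5]
weights = [[2.0, 1.0], [0.0, 0.0], [-2.0, -1.0]]
biases = [0.0, 0.0, 0.0]
probs = log_linear_policy(features, weights, biases)
```

Because the log-probabilities are linear in the features, each weight can be read directly as the sensitivity of an action to a feature, which is one reason such policies are considered interpretable.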
 
      
Related papers
- EXPO: Stable Reinforcement Learning with Expressive Policies [74.30151915786233]
 We propose a sample-efficient online reinforcement learning algorithm to maximize value with two parameterized policies.
Our approach yields up to 2-3x improvement in sample efficiency on average over prior methods.
 arXiv  Detail & Related papers  (2025-07-10T17:57:46Z)
- Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator [50.191655141020505]
 Reinforcement Learning (RL) has demonstrated impressive capabilities in robotic control but remains challenging due to high sample complexity, safety concerns, and the sim-to-real gap.
We introduce Offline Robotic World Model (RWM-O), a model-based approach that explicitly estimates uncertainty to improve policy learning without reliance on a physics simulator.
 arXiv  Detail & Related papers  (2025-04-23T12:58:15Z)
- Navigating the Social Welfare Frontier: Portfolios for Multi-objective Reinforcement Learning [29.937261596364472]
 We study the concept of an $\alpha$-approximate portfolio in reinforcement learning (RL).
We provide theoretical guarantees on the trade-offs among approximation factor, portfolio size, and computational efficiency.
 Experimental results on synthetic and real-world datasets demonstrate the effectiveness of our approach.
 arXiv  Detail & Related papers  (2025-02-13T19:13:55Z)
- Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone [72.17534881026995]
 We develop an offline and online fine-tuning approach called policy-agnostic RL (PA-RL).
We show the first result that successfully fine-tunes OpenVLA, a 7B generalist robot policy, autonomously with Cal-QL, an online RL fine-tuning algorithm.
 arXiv  Detail & Related papers  (2024-12-09T17:28:03Z)
- Evaluating Real-World Robot Manipulation Policies in Simulation [91.55267186958892]
 Control and visual disparities between real and simulated environments are key challenges for reliable simulated evaluation.
We propose approaches for mitigating these gaps without needing to craft full-fidelity digital twins of real-world environments.
We create SIMPLER, a collection of simulated environments for manipulation policy evaluation on common real robot setups.
 arXiv  Detail & Related papers  (2024-05-09T17:30:16Z)
- Non-linear Welfare-Aware Strategic Learning [10.448052192725168]
 This paper studies algorithmic decision-making in the presence of strategic individual behaviors.
We first generalize the agent best response model in previous works to the non-linear setting.
We show that the three welfare measures can attain the optimum simultaneously only under restrictive conditions.
 arXiv  Detail & Related papers  (2024-05-03T01:50:03Z)
- Learning Macroeconomic Policies through Dynamic Stackelberg Mean-Field Games [14.341143540616441]
 We formulate a dynamic Stackelberg game: the government (leader) sets policies, and agents (followers) respond by optimizing their behavior over time.
As the number of agents increases, explicitly simulating all agent-agent and agent-government interactions becomes computationally infeasible.
We propose the Dynamic Stackelberg Mean Field Game framework, which approximates these complex interactions via agent-population and government-population couplings.
 arXiv  Detail & Related papers  (2024-03-14T13:22:31Z)
- Can Reinforcement Learning support policy makers? A preliminary study
  with Integrated Assessment Models [7.1307809008103735]
 Integrated Assessment Models (IAMs) attempt to link main features of society and economy with the biosphere into one modelling framework.
This paper empirically demonstrates that modern Reinforcement Learning can be used to probe IAMs and explore the space of solutions in a more principled manner.
 arXiv  Detail & Related papers  (2023-12-11T17:04:30Z)
- Marginalized Importance Sampling for Off-Environment Policy Evaluation [13.824507564510503]
 Reinforcement Learning (RL) methods are typically sample-inefficient, making it challenging to train and deploy RL-policies in real world robots.
This paper proposes a new approach to evaluate the real-world performance of agent policies prior to deploying them in the real world.
Our approach incorporates a simulator along with real-world offline data to evaluate the performance of any policy.
 arXiv  Detail & Related papers  (2023-09-04T20:52:04Z)
- Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality [94.89246810243053]
 This paper studies offline policy learning, which aims at utilizing observations collected a priori to learn an optimal individualized decision rule.
Existing policy learning methods rely on a uniform overlap assumption, i.e., the propensities of exploring all actions for all individual characteristics must be lower bounded.
We propose Pessimistic Policy Learning (PPL), a new algorithm that optimizes lower confidence bounds (LCBs) instead of point estimates.
 arXiv  Detail & Related papers  (2022-12-19T22:43:08Z)
- COptiDICE: Offline Constrained Reinforcement Learning via Stationary
  Distribution Correction Estimation [73.17078343706909]
 We consider the offline constrained reinforcement learning (RL) problem, in which the agent aims to compute a policy that maximizes expected return while satisfying given cost constraints, learning only from a pre-collected dataset.
We present an offline constrained RL algorithm that optimizes the policy in the space of the stationary distribution.
Our algorithm, COptiDICE, directly estimates the stationary distribution corrections of the optimal policy with respect to returns, while constraining the cost upper bound, with the goal of yielding a cost-conservative policy for actual constraint satisfaction.
 arXiv  Detail & Related papers  (2022-04-19T15:55:47Z)
- The AI Economist: Optimal Economic Policy Design via Two-level Deep
  Reinforcement Learning [126.37520136341094]
 We show that machine-learning-based economic simulation is a powerful policy and mechanism design framework.
The AI Economist is a two-level, deep RL framework that trains both agents and a social planner who co-adapt.
In simple one-step economies, the AI Economist recovers the optimal tax policy of economic theory.
 arXiv  Detail & Related papers  (2021-08-05T17:42:35Z)
- Reinforcement Learning for Optimization of COVID-19 Mitigation policies [29.4529156655747]
 The year 2020 has seen the COVID-19 virus lead to one of the worst global pandemics in history.
Governments around the world are faced with the challenge of protecting public health, while keeping the economy running to the greatest extent possible.
Epidemiological models provide insight into the spread of these types of diseases and predict the effects of possible intervention policies.
 arXiv  Detail & Related papers  (2020-10-20T18:40:15Z)
- The AI Economist: Improving Equality and Productivity with AI-Driven Tax
  Policies [119.07163415116686]
 We train social planners that discover tax policies that can effectively trade off economic equality and productivity.
We present an economic simulation environment that features competitive pressures and market dynamics.
We show that AI-driven tax policies improve the trade-off between equality and productivity by 16% over baseline policies.
 arXiv  Detail & Related papers  (2020-04-28T06:57:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     