Reinforcement Learning for Monetary Policy Under Macroeconomic Uncertainty: Analyzing Tabular and Function Approximation Methods
- URL: http://arxiv.org/abs/2512.17929v1
- Date: Tue, 09 Dec 2025 19:27:47 GMT
- Title: Reinforcement Learning for Monetary Policy Under Macroeconomic Uncertainty: Analyzing Tabular and Function Approximation Methods
- Authors: Sheryl Chen, Tony Wang, Kyle Feinstein
- Abstract summary: We study how a central bank should dynamically set short-term nominal interest rates to stabilize inflation and unemployment. We construct a linear-Gaussian transition model and implement a discrete-action Markov Decision Process. We compare nine reinforcement learning approaches against Taylor Rule and naive baselines.
- Score: 0.5247013475817127
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study how a central bank should dynamically set short-term nominal interest rates to stabilize inflation and unemployment when macroeconomic relationships are uncertain and time-varying. We model monetary policy as a sequential decision-making problem in which the central bank observes macroeconomic conditions quarterly and chooses interest rate adjustments. Using publicly accessible historical Federal Reserve Economic Data (FRED), we construct a linear-Gaussian transition model and implement a discrete-action Markov Decision Process with a quadratic loss reward function. We compare nine reinforcement learning approaches against Taylor Rule and naive baselines, including tabular Q-learning variants, SARSA, Actor-Critic, Deep Q-Networks, Bayesian Q-learning with uncertainty quantification, and POMDP formulations with partial observability. Surprisingly, standard tabular Q-learning achieved the best performance (-615.13 ± 309.58 mean return), outperforming both enhanced RL methods and traditional policy rules. Our results suggest that while sophisticated RL techniques show promise for monetary policy applications, simpler approaches may be more robust in this domain, highlighting important challenges in applying modern RL to macroeconomic policy.
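The setup described in the abstract can be summarized in a minimal, self-contained sketch: a linear-Gaussian transition model, a quadratic-loss reward, a Taylor-rule baseline, and a tabular Q-learning update. Every coefficient, the state discretization, and the action grid below are illustrative assumptions, not the paper's estimated values.

```python
# Minimal sketch of the setup in the abstract: a linear-Gaussian
# macro transition model, a quadratic-loss reward, a Taylor-rule
# baseline, and tabular Q-learning over a coarse state grid.
# Every numeric constant here is an illustrative assumption,
# not an estimate from the paper.
import numpy as np

rng = np.random.default_rng(0)

PI_STAR, U_STAR = 2.0, 4.0                         # assumed targets (%)
ACTIONS = np.array([-0.5, -0.25, 0.0, 0.25, 0.5])  # assumed rate moves

def step(pi, u, i):
    """Assumed linear-Gaussian dynamics: mean-reverting inflation and
    unemployment that respond (with opposite signs) to the policy rate."""
    pi_next = PI_STAR + 0.9 * (pi - PI_STAR) - 0.1 * (i - PI_STAR) + rng.normal(0, 0.3)
    u_next = U_STAR + 0.9 * (u - U_STAR) + 0.1 * (i - PI_STAR) + rng.normal(0, 0.2)
    return pi_next, u_next

def reward(pi, u):
    """Quadratic loss on deviations from target, as in the abstract."""
    return -((pi - PI_STAR) ** 2 + (u - U_STAR) ** 2)

def taylor_rate(pi, u, r_star=2.0):
    """Classic Taylor (1993) rule, with the unemployment gap standing in
    for the output gap (a simplification for this sketch)."""
    return r_star + pi + 0.5 * (pi - PI_STAR) - 0.5 * (u - U_STAR)

def discretize(pi, u):
    """Coarse state bins for the tabular learner (bin edges are assumed)."""
    return int(np.clip(pi + 2.0, 0, 7)), int(np.clip(u, 0, 11))

# Tabular Q-learning with epsilon-greedy exploration.
Q = np.zeros((8, 12, len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.95, 0.1
for _ in range(2000):
    pi, u, i = 2.0, 4.0, 2.0                      # start at target
    for _ in range(40):                           # 40 quarters per episode
        s = discretize(pi, u)
        a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(np.argmax(Q[s]))
        i = float(np.clip(i + ACTIONS[a], 0.0, 10.0))  # assumed rate bounds
        pi, u = step(pi, u, i)
        s2, r = discretize(pi, u), reward(pi, u)
        Q[s][a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s][a])
```

Rolling out the same environment with i = taylor_rate(pi, u) each quarter gives a Taylor-rule baseline against which the learned policy's mean return can be compared.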
Related papers
- Bayesian Robust Financial Trading with Adversarial Synthetic Market Data [15.993346478707686]
Algorithmic trading relies on machine learning models to make trading decisions.
Despite strong in-sample performance, these models often degrade when confronted with evolving real-world market regimes.
We propose a Bayesian Robust Framework that integrates a macro-conditioned generative model with robust policy learning.
arXiv Detail & Related papers (2026-01-14T13:15:46Z)
- Financial Decision Making using Reinforcement Learning with Dirichlet Priors and Quantum-Inspired Genetic Optimization [0.0]
This study proposes a reinforcement learning framework for dynamic budget allocation.
It is enhanced with Dirichlet priors and quantum mutation-based genetic optimization.
On unseen fiscal data, the model achieves high alignment with actual allocations.
arXiv Detail & Related papers (2025-08-27T15:34:21Z)
- Can We Reliably Predict the Fed's Next Move? A Multi-Modal Approach to U.S. Monetary Policy Forecasting [2.6396287656676733]
This study examines whether predictive accuracy can be enhanced by integrating structured data with unstructured textual signals from Federal Reserve communications.
Our results show that hybrid models consistently outperform unimodal baselines.
For monetary policy forecasting, simpler hybrid models can offer both accuracy and interpretability, delivering actionable insights for researchers and decision-makers.
arXiv Detail & Related papers (2025-06-28T05:54:58Z)
- Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data [16.995406965407003]
We propose a model-free algorithm called Robust $\phi$-regularized fitted Q-iteration (RPQ) for learning an $\epsilon$-optimal robust policy.
We also introduce the hybrid robust $\phi$-regularized reinforcement learning framework to learn an optimal robust policy using both historical data and online sampling.
arXiv Detail & Related papers (2024-05-08T23:52:37Z)
- Data-Driven Merton's Strategies via Policy Randomization [11.774563966512709]
We study Merton's expected utility problem in an incomplete market.
The agent under consideration is a price taker who has access only to the stock and factor value processes.
We propose an auxiliary problem in which the agent can invoke policy randomization according to a specific class of distributions.
arXiv Detail & Related papers (2023-12-19T02:14:13Z)
- Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning [57.83919813698673]
Projected Off-Policy Q-Learning (POP-QL) is a novel actor-critic algorithm that simultaneously reweights off-policy samples and constrains the policy to prevent divergence and reduce value-approximation error.
In our experiments, POP-QL not only shows competitive performance on standard benchmarks, but also outperforms competing methods in tasks where the data-collection policy is significantly sub-optimal.
arXiv Detail & Related papers (2023-11-25T00:30:58Z)
- Offline Reinforcement Learning with On-Policy Q-Function Regularization [57.09073809901382]
We deal with the (potentially catastrophic) extrapolation error induced by the distribution shift between the history dataset and the desired policy.
We propose two algorithms taking advantage of the estimated Q-function through regularizations, and demonstrate they exhibit strong performance on the D4RL benchmarks.
arXiv Detail & Related papers (2023-07-25T21:38:08Z)
- Policy Gradient for Rectangular Robust Markov Decision Processes [62.397882389472564]
We introduce robust policy gradient (RPG), a policy-based method that efficiently solves rectangular robust Markov decision processes (MDPs).
Our resulting RPG can be estimated from data with the same time complexity as its non-robust equivalent.
arXiv Detail & Related papers (2023-01-31T12:40:50Z)
- Offline Reinforcement Learning with Implicit Q-Learning [85.62618088890787]
Current offline reinforcement learning methods need to query the value of unseen actions during training to improve the policy.
We propose an offline RL method that never needs to evaluate actions outside of the dataset.
This method enables the learned policy to improve substantially over the best behavior in the data through generalization.
arXiv Detail & Related papers (2021-10-12T17:05:05Z)
- Building a Foundation for Data-Driven, Interpretable, and Robust Policy Design using the AI Economist [67.08543240320756]
We show that the AI Economist framework enables effective, flexible, and interpretable policy design using two-level reinforcement learning and data-driven simulations.
We find that log-linear policies trained using RL significantly improve social welfare, based on both public health and economic outcomes, compared to past outcomes.
arXiv Detail & Related papers (2021-08-06T01:30:41Z)
- MPC-based Reinforcement Learning for Economic Problems with Application to Battery Storage [0.0]
We focus on policy approximations based on Model Predictive Control (MPC).
We observe that the policy gradient method can struggle to produce meaningful steps in the policy parameters when the policy has a (nearly) bang-bang structure.
We propose a homotopy strategy based on the interior-point method, providing a relaxation of the policy during the learning.
arXiv Detail & Related papers (2021-04-06T10:37:14Z)
- Conservative Q-Learning for Offline Reinforcement Learning [106.05582605650932]
We show that CQL substantially outperforms existing offline RL methods, often learning policies that attain 2-5 times higher final return.
We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees (a minimal sketch of the conservative penalty appears after this list).
arXiv Detail & Related papers (2020-06-08T17:53:42Z)
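For reference, the conservative penalty behind CQL (the last entry above) can be written in a few lines: it minimizes a log-sum-exp of Q over all actions, pushing down overestimated out-of-distribution values, while pushing Q up at actions seen in the offline dataset. The array shapes and the precomputed Bellman targets here are assumptions of this sketch, not the authors' implementation.

```python
# Schematic of the CQL regularizer: a log-sum-exp penalty over all
# actions minus the Q-values at dataset actions, plus the usual
# Bellman error. Shapes and targets are assumed for illustration.
import numpy as np

def cql_loss(q_all, a_data, q_target, alpha=1.0):
    """q_all:    (batch, n_actions) critic outputs Q(s, .)
    a_data:   (batch,) integer actions from the offline dataset
    q_target: (batch,) Bellman targets r + gamma * max_a' Q_target(s', a')
    """
    rows = np.arange(len(a_data))
    q_data = q_all[rows, a_data]
    # numerically stable log-sum-exp over the action dimension
    q_max = q_all.max(axis=1)
    lse = q_max + np.log(np.exp(q_all - q_max[:, None]).sum(axis=1))
    conservative = (lse - q_data).mean()          # CQL penalty term
    bellman = ((q_data - q_target) ** 2).mean()   # standard TD error
    return alpha * conservative + bellman
```

Minimizing the penalty drives the learned Q below the soft maximum over actions, which is the source of the lower-bound guarantee the entry mentions; alpha controls how pessimistic that bound is.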
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.