Learning and Calibrating Heterogeneous Bounded Rational Market Behaviour with Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2402.00787v1
- Date: Thu, 1 Feb 2024 17:21:45 GMT
- Title: Learning and Calibrating Heterogeneous Bounded Rational Market Behaviour with Multi-Agent Reinforcement Learning
- Authors: Benjamin Patrick Evans, Sumitra Ganesh
- Abstract summary: Agent-based models (ABMs) have shown promise for modelling various real-world phenomena incompatible with traditional equilibrium analysis.
Recent developments in multi-agent reinforcement learning (MARL) offer a way to address this issue from a rationality perspective.
We propose a novel technique for representing heterogeneous processing-constrained agents within a MARL framework.
- Score: 4.40301653518681
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Agent-based models (ABMs) have shown promise for modelling various real-world
phenomena incompatible with traditional equilibrium analysis. However, a
critical concern is the manual definition of behavioural rules in ABMs. Recent
developments in multi-agent reinforcement learning (MARL) offer a way to
address this issue from an optimisation perspective, where agents strive to
maximise their utility, eliminating the need for manual rule specification.
This learning-focused approach aligns with established economic and financial
models through the use of rational utility-maximising agents. However, this
representation departs from the fundamental motivation for ABMs: that realistic
dynamics emerging from bounded rationality and agent heterogeneity can be
modelled. To resolve this apparent disparity between the two approaches, we
propose a novel technique for representing heterogeneous processing-constrained
agents within a MARL framework. The proposed approach treats agents as
constrained optimisers with varying degrees of strategic skills, permitting
departure from strict utility maximisation. Behaviour is learnt through
repeated simulations with policy gradients to adjust action likelihoods. To
allow efficient computation, we use parameterised shared policy learning with
distributions of agent skill levels. Shared policy learning avoids the need for
agents to learn individual policies yet still enables a spectrum of bounded
rational behaviours. We validate our model's effectiveness using real-world
data on a range of canonical $n$-agent settings, demonstrating significantly
improved predictive capability.
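To make the abstract's mechanism concrete, below is a minimal, hypothetical sketch of the core idea: a single shared policy conditioned on a sampled agent skill level and trained with a simple policy gradient (REINFORCE). The environment, reward model, skill distribution, and the way skill scales the action logits are illustrative assumptions for this sketch, not the authors' exact formulation.

```python
# Hypothetical sketch: a shared, skill-conditioned policy trained with REINFORCE.
# All numeric choices and the toy reward are assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 3       # e.g. buy / hold / sell in a stylised market
FEATURE_DIM = 4     # 3 observation features plus the skill level
LEARNING_RATE = 0.05


def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def policy(theta, obs, skill):
    """Shared policy: one parameter matrix used by every agent.

    The skill level enters as an extra input feature and also scales the
    logits, so low-skill agents act closer to uniformly at random
    (bounded rationality) while high-skill agents act more greedily.
    """
    x = np.concatenate([obs, [skill]])     # condition on the skill level
    logits = skill * (theta @ x)           # low skill -> flatter action distribution
    return softmax(logits), x


def rollout(theta, skill, horizon=10):
    """Simulate one agent for a few steps; the reward here is a toy placeholder."""
    grads, rewards = [], []
    for _ in range(horizon):
        obs = rng.normal(size=FEATURE_DIM - 1)
        probs, x = policy(theta, obs, skill)
        a = rng.choice(N_ACTIONS, p=probs)
        # gradient of log pi(a | x) w.r.t. theta for this softmax-linear policy
        grad_logits = -probs
        grad_logits[a] += 1.0
        grads.append(np.outer(skill * grad_logits, x))
        # placeholder utility: action 0 is best on average
        rewards.append(1.0 if a == 0 else rng.normal(0.0, 0.5))
    return grads, rewards


theta = np.zeros((N_ACTIONS, FEATURE_DIM))
for step in range(500):
    skill = rng.uniform(0.1, 2.0)          # heterogeneous agents: sample a skill level
    grads, rewards = rollout(theta, skill)
    episode_return = sum(rewards)          # undiscounted return
    for g in grads:                        # REINFORCE update on the shared policy
        theta += LEARNING_RATE * episode_return * g / len(grads)
```

The sketch reflects the design choice the abstract describes: one parameter set serves every agent, while the sampled skill value both enters the observation and flattens or sharpens the action distribution, giving a spectrum of bounded-rational behaviours without per-agent policies.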
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- SAMBO-RL: Shifts-aware Model-based Offline Reinforcement Learning [9.88109749688605]
Model-based Offline Reinforcement Learning trains policies based on offline datasets and model dynamics.
This paper disentangles the problem into two key components: model bias and policy shift.
We introduce Shifts-aware Model-based Offline Reinforcement Learning (SAMBO-RL)
arXiv Detail & Related papers (2024-08-23T04:25:09Z)
- Simulating the Economic Impact of Rationality through Reinforcement Learning and Agent-Based Modelling [1.7546137756031712]
We leverage multi-agent reinforcement learning (RL) to expand the capabilities of agent-based models (ABMs)
We show that RL agents spontaneously learn three distinct strategies for maximising profits, with the optimal strategy depending on the level of market competition and rationality.
We also find that RL agents with independent policies, and without the ability to communicate with each other, spontaneously learn to segregate into different strategic groups, thus increasing market power and overall profits.
arXiv Detail & Related papers (2024-05-03T15:08:25Z)
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision-making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Multi-Agent Inverse Reinforcement Learning: Suboptimal Demonstrations and Alternative Solution Concepts [0.0]
Multi-agent inverse reinforcement learning can be used to learn reward functions from agents in social environments.
To model realistic social dynamics, MIRL methods must account for suboptimal human reasoning and behavior.
arXiv Detail & Related papers (2021-09-02T19:15:29Z)
- ERMAS: Becoming Robust to Reward Function Sim-to-Real Gaps in Multi-Agent Simulations [110.72725220033983]
Epsilon-Robust Multi-Agent Simulation (ERMAS) is a framework for learning AI policies that are robust to such multiagent sim-to-real gaps.
In particular, ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15% in complex spatiotemporal simulations.
arXiv Detail & Related papers (2021-06-10T04:32:20Z)
- Deep Interactive Bayesian Reinforcement Learning via Meta-Learning [63.96201773395921]
The optimal adaptive behaviour under uncertainty over the other agents' strategies can be computed using the Interactive Bayesian Reinforcement Learning framework.
We propose to meta-learn approximate belief inference and Bayes-optimal behaviour for a given prior.
We show empirically that our approach outperforms existing methods that use a model-free approach, sample from the approximate posterior, maintain memory-free models of others, or do not fully utilise the known structure of the environment.
arXiv Detail & Related papers (2021-01-11T13:25:13Z)
- On the model-based stochastic value gradient for continuous reinforcement learning [50.085645237597056]
We show that simple model-based agents can outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward.
Our findings suggest that model-based policy evaluation deserves closer attention.
arXiv Detail & Related papers (2020-08-28T17:58:29Z)
- Policy Evaluation and Seeking for Multi-Agent Reinforcement Learning via Best Response [15.149039407681945]
We adopt strict best response dynamics to model selfish behaviors at a meta-level for multi-agent reinforcement learning.
Our approach is more compatible with single-agent reinforcement learning than alpha-rank which relies on weakly better responses.
arXiv Detail & Related papers (2020-06-17T01:17:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.