Dynamic Noises of Multi-Agent Environments Can Improve Generalization:
Agent-based Models meets Reinforcement Learning
- URL: http://arxiv.org/abs/2204.14076v1
- Date: Sat, 26 Mar 2022 09:56:30 GMT
- Title: Dynamic Noises of Multi-Agent Environments Can Improve Generalization:
Agent-based Models meets Reinforcement Learning
- Authors: Mohamed Akrout, Amal Feriani, Bob McLeod
- Abstract summary: We study the benefits of reinforcement learning environments based on agent-based models (ABM).
We show that their non-deterministic dynamics can improve the generalization of RL agents.
Numerical simulations demonstrate that the intrinsic noise in the ABM-based dynamics of the SIR model not only improves the average reward but also allows the RL agent to generalize over a wider range of epidemic parameters.
- Score: 2.492300648514128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the benefits of reinforcement learning (RL) environments based on
agent-based models (ABM). While ABMs are known to offer microfoundational
simulations at the cost of computational complexity, we empirically show in
this work that their non-deterministic dynamics can improve the generalization
of RL agents. To this end, we examine the control of epidemic SIR
environments based on either differential equations or ABMs. Numerical
simulations demonstrate that the intrinsic noise in the ABM-based dynamics of
the SIR model not only improves the average reward but also allows the RL agent
to generalize over a wider range of epidemic parameters.
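The paper does not include code, but the dichotomy it studies is straightforward to sketch. The snippet below is a minimal illustration, not the authors' implementation: the function names, the chain-binomial formulation of the ABM, and the parameter values (beta, gamma, population size) are our own assumptions. It contrasts a deterministic ODE-style SIR update with a stochastic agent-based one, whose binomial sampling supplies the intrinsic noise the abstract credits for better generalization:

```python
import numpy as np

def sir_ode_step(s, i, r, beta, gamma, dt=1.0):
    """Deterministic SIR update (mean-field ODE, forward Euler)."""
    n = s + i + r
    new_inf = beta * s * i / n * dt
    new_rec = gamma * i * dt
    return s - new_inf, i + new_inf - new_rec, r + new_rec

def sir_abm_step(s, i, r, beta, gamma, rng, dt=1.0):
    """Stochastic agent-based SIR update (chain-binomial sampling).

    Each susceptible is infected with prob 1 - exp(-beta*i/n*dt) and
    each infected recovers with prob 1 - exp(-gamma*dt). The binomial
    draws are the 'intrinsic noise' absent from the ODE dynamics.
    """
    n = s + i + r
    p_inf = 1.0 - np.exp(-beta * i / n * dt)
    p_rec = 1.0 - np.exp(-gamma * dt)
    new_inf = rng.binomial(s, p_inf)
    new_rec = rng.binomial(i, p_rec)
    return s - new_inf, i + new_inf - new_rec, r + new_rec

rng = np.random.default_rng(0)
state_ode = state_abm = (990, 10, 0)
for _ in range(50):
    state_ode = sir_ode_step(*state_ode, beta=0.3, gamma=0.1)
    state_abm = sir_abm_step(*state_abm, beta=0.3, gamma=0.1, rng=rng)
print(state_ode, state_abm)  # ABM trajectory varies run to run; ODE does not
```

Wrapping either step function in a Gym-style environment and training the same RL agent on both is one way to reproduce the paper's comparison.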
Related papers
- Domain-driven Metrics for Reinforcement Learning: A Case Study on Epidemic Control using Agent-based Simulation [0.29360071145551064]
In this study, we develop domain-driven metrics for RL, building on state-of-the-art metrics. Results show the use of domain-driven rewards in conjunction with traditional and state-of-the-art metrics across several simulation scenarios.
arXiv Detail & Related papers (2025-08-07T08:40:19Z)
- Agentic Reinforced Policy Optimization [66.96989268893932]
Large-scale reinforcement learning with verifiable rewards (RLVR) has demonstrated its effectiveness in harnessing the potential of large language models (LLMs) for single-turn reasoning tasks. Current RL algorithms inadequately balance the models' intrinsic long-horizon reasoning capabilities and their proficiency in multi-turn tool interactions. We propose Agentic Reinforced Policy Optimization (ARPO), a novel agentic RL algorithm tailored for training multi-turn LLM-based agents.
arXiv Detail & Related papers (2025-07-26T07:53:11Z) - Learning Individual Behavior in Agent-Based Models with Graph Diffusion Networks [2.749593964424624]
Agent-Based Models (ABMs) are powerful tools for studying emergent properties in complex systems. We propose a novel framework to learn a differentiable surrogate of any ABM by observing its generated data. Our method combines diffusion models to capture behavioral stochasticity and graph neural networks to model agent interactions.
arXiv Detail & Related papers (2025-05-27T16:55:56Z)
- Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining [74.83412846804977]
Reinforcement learning (RL)-based fine-tuning has become a crucial step in post-training language models.
We present a systematic end-to-end study of RL fine-tuning for mathematical reasoning by training models entirely from scratch.
arXiv Detail & Related papers (2025-04-10T17:15:53Z)
- On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models are showing increasingly better performance in reasoning and planning tasks.
We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly.
In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z)
- On the limits of agency in agent-based models [13.130587222524305]
Agent-based modeling offers powerful insights into complex systems, but its practical utility has been limited by computational constraints.
Recent advancements in large language models (LLMs) could enhance ABMs with adaptive agents, but their integration into large-scale simulations remains challenging.
We present LLM archetypes, a technique that balances behavioral complexity with computational efficiency, allowing for nuanced agent behavior in large-scale simulations.
arXiv Detail & Related papers (2024-09-14T04:17:24Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Data-Efficient Task Generalization via Probabilistic Model-based Meta Reinforcement Learning [58.575939354953526]
PACOH-RL is a novel model-based Meta-Reinforcement Learning (Meta-RL) algorithm designed to efficiently adapt control policies to changing dynamics.
Existing Meta-RL methods require abundant meta-learning data, limiting their applicability in settings such as robotics.
Our experiment results demonstrate that PACOH-RL outperforms model-based RL and model-based Meta-RL baselines in adapting to new dynamic conditions.
arXiv Detail & Related papers (2023-11-13T18:51:57Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources (a toy sketch of the differentiable-ABM idea appears after this list).
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally Inattentive Reinforcement Learning [85.86440477005523]
We study more human-like RL agents which incorporate an established model of human irrationality, the Rational Inattention (RI) model.
RIRL models the cost of cognitive information processing using mutual information.
We show that using RIRL yields a rich spectrum of new equilibrium behaviors that differ from those found under rational assumptions.
arXiv Detail & Related papers (2022-01-18T20:54:00Z)
- KNODE-MPC: A Knowledge-based Data-driven Predictive Control Framework for Aerial Robots [5.897728689802829]
We make use of a deep learning tool, knowledge-based neural ordinary differential equations (KNODE), to augment a model obtained from first principles.
The resulting hybrid model encompasses both a nominal first-principle model and a neural network learnt from simulated or real-world experimental data.
To improve closed-loop performance, the hybrid model is integrated into a novel MPC framework, known as KNODE-MPC (a toy sketch of the hybrid-model idea appears after this list).
arXiv Detail & Related papers (2021-09-10T12:09:18Z)
- Non-Markovian Reinforcement Learning using Fractional Dynamics [3.000697999889031]
Reinforcement learning (RL) is a technique to learn the control policy for an agent that interacts with an environment.
In this paper, we propose a model-based RL technique for a system that has non-Markovian dynamics.
Such environments are common in many real-world applications such as in human physiology, biological systems, material science, and population dynamics.
arXiv Detail & Related papers (2021-07-29T07:35:13Z)
- Policy-focused Agent-based Modeling using RL Behavioral Models [0.40498500266986387]
This paper examines the value of reinforcement learning models as adaptive, high-performing, and behaviorally-valid models of agent decision-making in ABMs.
We test the hypothesis that RL agents are effective as utility-maximizing agents in policy ABMs.
Experiments show that RL behavioral models are effective at producing reward-seeking or reward-maximizing behaviors in ABM agents.
arXiv Detail & Related papers (2020-06-09T04:55:07Z)
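The Differentiable Agent-based Epidemiology entry above (GradABM) hinges on making ABM transitions amenable to autodiff. As a toy sketch only, assuming PyTorch and replacing hard binomial draws with their expectations (the real GradABM design is more elaborate), gradients can flow from a simulation-level loss back to the epidemic parameters:

```python
import torch

def differentiable_sir_step(s, i, r, beta, gamma):
    """One relaxed, fully differentiable agent-based SIR step.

    Instead of sampling hard binomial transitions, we propagate the
    expected counts, so torch.autograd can differentiate through the
    rollout. Illustrative only; not the GradABM implementation.
    """
    n = s + i + r
    p_inf = 1.0 - torch.exp(-beta * i / n)
    p_rec = 1.0 - torch.exp(-gamma)
    new_inf, new_rec = s * p_inf, i * p_rec
    return s - new_inf, i + new_inf - new_rec, r + new_rec

beta = torch.tensor(0.3, requires_grad=True)
gamma = torch.tensor(0.1, requires_grad=True)
state = (torch.tensor(990.0), torch.tensor(10.0), torch.tensor(0.0))
for _ in range(30):
    state = differentiable_sir_step(*state, beta, gamma)
loss = (state[1] - 50.0) ** 2     # fit final infected count to a target
loss.backward()
print(beta.grad, gamma.grad)      # gradients w.r.t. epidemic parameters
```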
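Similarly, the KNODE-MPC entry above combines a first-principles model with a learned neural component. Below is a minimal sketch of that hybrid structure, assuming PyTorch and an invented double-integrator nominal model; the module and names are illustrative, not the authors' code:

```python
import torch
import torch.nn as nn

class HybridDynamics(nn.Module):
    """Toy hybrid model in the spirit of KNODE: nominal physics plus a
    learned residual, trained on simulated or real trajectory data."""
    def __init__(self, state_dim=4, hidden=32):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )

    def nominal(self, x):
        # Placeholder first-principles model (double integrator):
        # position derivative is velocity, velocity derivative is zero.
        pos, vel = x[..., :2], x[..., 2:]
        return torch.cat([vel, torch.zeros_like(vel)], dim=-1)

    def forward(self, x):
        # dx/dt = f_nominal(x) + f_learned(x)
        return self.nominal(x) + self.residual(x)

model = HybridDynamics()
x = torch.randn(8, 4)              # batch of states
x_next = x + 0.01 * model(x)       # one Euler integration step
```

An MPC loop would then plan against `model` in place of the nominal dynamics alone.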
This list is automatically generated from the titles and abstracts of the papers on this site.