Episodic Future Thinking Mechanism for Multi-agent Reinforcement Learning
- URL: http://arxiv.org/abs/2410.17373v1
- Date: Tue, 22 Oct 2024 19:12:42 GMT
- Title: Episodic Future Thinking Mechanism for Multi-agent Reinforcement Learning
- Authors: Dongsu Lee, Minhae Kwon
- Abstract summary: We introduce an episodic future thinking (EFT) mechanism for a reinforcement learning (RL) agent.
We first develop a multi-character policy that captures diverse characters with an ensemble of heterogeneous policies.
Once the character is inferred, the agent predicts the upcoming actions of target agents and simulates the potential future scenario.
- Score: 2.992602379681373
- Abstract: Understanding cognitive processes in multi-agent interactions is a primary goal in cognitive science. It can guide the direction of artificial intelligence (AI) research toward social decision-making in multi-agent systems, which includes uncertainty from character heterogeneity. In this paper, we introduce an episodic future thinking (EFT) mechanism for a reinforcement learning (RL) agent, inspired by cognitive processes observed in animals. To enable future thinking functionality, we first develop a multi-character policy that captures diverse characters with an ensemble of heterogeneous policies. Here, the character of an agent is defined as a different weight combination on reward components, representing distinct behavioral preferences. The future thinking agent collects observation-action trajectories of the target agents and uses the pre-trained multi-character policy to infer their characters. Once the character is inferred, the agent predicts the upcoming actions of target agents and simulates the potential future scenario. This capability allows the agent to adaptively select the optimal action, considering the predicted future scenario in multi-agent interactions. To evaluate the proposed mechanism, we consider the multi-agent autonomous driving scenario with diverse driving traits and multiple particle environments. Simulation results demonstrate that the EFT mechanism with accurate character inference leads to a higher reward than existing multi-agent solutions. We also confirm that the effect of reward improvement remains valid across societies with different levels of character diversity.
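To make the pipeline concrete, below is a minimal Python sketch of the inference-then-rollout loop the abstract describes: infer a target's character from its observation-action trajectory via the policy ensemble, predict its next action, simulate candidate futures, and act. All names and interfaces (infer_character, policy.action_prob, world_model.simulate, and so on) are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Hypothetical sketch of the EFT loop described in the abstract.
# A "character" is a weight vector over reward components; the
# multi-character policy is an ensemble with one policy per character.

def infer_character(trajectory, ensemble):
    """Pick the character whose policy best explains the observed
    observation-action trajectory (maximum likelihood over the ensemble)."""
    log_liks = []
    for policy in ensemble:
        log_liks.append(sum(np.log(policy.action_prob(obs, act) + 1e-8)
                            for obs, act in trajectory))
    return int(np.argmax(log_liks))

def eft_action(agent, targets, ensemble, world_model, candidate_actions):
    """Episodic future thinking: infer each target's character, predict
    its upcoming action, simulate the joint future with an (assumed)
    world model, and select the best candidate action."""
    predicted = {}
    for t in targets:
        c = infer_character(t.recent_trajectory, ensemble)
        predicted[t.id] = ensemble[c].act(t.current_obs)
    best_action, best_return = None, -np.inf
    for a in candidate_actions:
        ret = world_model.simulate(agent.state, a, predicted)  # imagined rollout
        if ret > best_return:
            best_action, best_return = a, ret
    return best_action
```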
Related papers
- Agent AI: Surveying the Horizons of Multimodal Interaction [83.18367129924997]
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data.
We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z)
- DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning [84.22561239481901]
We propose a new approach that enables agents to learn whether their behaviors should be consistent with those of other agents.
We evaluate DCIR in multiple environments including Multi-agent Particle, Google Research Football and StarCraft II Micromanagement.
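As a rough illustration of a consistency-style intrinsic reward, the toy function below scores (dis)agreement between two agents' action distributions; the KL-based form and the want_consistent switch are assumptions, not DCIR's actual formulation.

```python
import numpy as np

def consistency_bonus(my_action_probs, other_action_probs, want_consistent):
    """Hypothetical intrinsic bonus: measure behavioral agreement with
    another agent as negative KL divergence between action distributions,
    flipping the sign when disagreement is desired. This echoes the idea
    that agents learn WHETHER to be consistent, not the paper's exact math."""
    kl = np.sum(my_action_probs * np.log(
        (my_action_probs + 1e-8) / (other_action_probs + 1e-8)))
    return -kl if want_consistent else kl
```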
arXiv Detail & Related papers (2023-12-10T06:03:57Z)
- Theory of Mind as Intrinsic Motivation for Multi-Agent Reinforcement Learning [5.314466196448188]
We present a method of grounding semantically meaningful, human-interpretable beliefs within policies modeled by deep networks.
We propose that the ability of each agent to predict the beliefs of the other agents can be used as an intrinsic reward signal for multi-agent reinforcement learning.
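A toy rendering of belief prediction as an intrinsic reward signal; the negative-squared-error form and the argument names are assumptions rather than the paper's definition.

```python
import numpy as np

def tom_intrinsic_reward(predicted_belief, actual_belief):
    """Hypothetical intrinsic reward: higher when agent i's prediction of
    agent j's belief vector matches j's actual belief (negative mean
    squared error here; the paper's exact reward form may differ)."""
    return -float(np.mean((predicted_belief - actual_belief) ** 2))
```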
arXiv Detail & Related papers (2023-07-03T17:07:18Z)
- CAMMARL: Conformal Action Modeling in Multi Agent Reinforcement Learning [5.865719902445064]
We propose a novel multi-agent reinforcement learning algorithm CAMMARL.
It involves modeling the actions of other agents in different situations in the form of confident sets.
We show that CAMMARL elevates the capabilities of an autonomous agent in MARL by modeling conformal prediction sets.
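For intuition, the sketch below builds a standard split-conformal prediction set over another agent's discrete actions; the scoring rule and all names are generic assumptions, not necessarily CAMMARL's exact construction.

```python
import numpy as np

def conformal_action_set(action_probs, calibration_scores, alpha=0.1):
    """Generic split-conformal set construction (an assumption, not
    necessarily CAMMARL's recipe): keep every action whose nonconformity
    score 1 - p(action) falls below the (1 - alpha) quantile of scores
    from a held-out calibration set, yielding a set that contains the
    other agent's true action with probability ~1 - alpha."""
    q = np.quantile(calibration_scores, 1.0 - alpha)
    return [a for a, p in enumerate(action_probs) if 1.0 - p <= q]
```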
arXiv Detail & Related papers (2023-06-19T19:03:53Z)
- Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL [107.58821842920393]
We quantify the agent's behavior difference and build its relationship with the policy performance via Role Diversity.
We find that the error bound in MARL can be decomposed into three parts that have a strong relation to the role diversity.
The decomposed factors can significantly impact policy optimization in three popular directions.
arXiv Detail & Related papers (2022-06-01T04:58:52Z)
- Diversifying Agent's Behaviors in Interactive Decision Models [11.125175635860169]
Modelling other agents' behaviors plays an important role in decision models for interactions among multiple agents.
In this article, we investigate diversifying the behaviors of other agents in the subject agent's decision model prior to their interactions.
arXiv Detail & Related papers (2022-03-06T23:05:00Z)
- Assessing Human Interaction in Virtual Reality With Continually Learning Prediction Agents Based on Reinforcement Learning Algorithms: A Pilot Study [6.076137037890219]
We investigate how the interaction between a human and a continually learning prediction agent evolves as the agent gains competency.
We develop a virtual reality environment and a time-based prediction task wherein learned predictions from a reinforcement learning (RL) algorithm augment human predictions.
Our findings suggest that human trust of the system may be influenced by early interactions with the agent, and that trust in turn affects strategic behaviour.
arXiv Detail & Related papers (2021-12-14T22:46:44Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
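The marginal/copula separation can be illustrated with a Gaussian-copula sampler: a correlation matrix stands in for the learned dependence structure among agents, and each inverse CDF for one agent's learned marginal. Everything below (the function name, the assumption that the inverse CDFs are vectorized) is an illustrative sketch, not the paper's model.

```python
import numpy as np
from scipy.stats import norm

def sample_joint_actions(marginal_cdfs_inv, corr, n=1):
    """Hypothetical Gaussian-copula sampler: 'corr' captures coordination
    among agents, while each inverse CDF in 'marginal_cdfs_inv' captures
    one agent's local behavior, mirroring the marginal/copula separation
    described above."""
    z = np.random.multivariate_normal(np.zeros(len(corr)), corr, size=n)
    u = norm.cdf(z)  # copula sample in [0, 1]^d
    return np.stack([inv(u[:, i])
                     for i, inv in enumerate(marginal_cdfs_inv)], axis=1)
```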
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
- ERMAS: Becoming Robust to Reward Function Sim-to-Real Gaps in Multi-Agent Simulations [110.72725220033983]
Epsilon-Robust Multi-Agent Simulation (ERMAS) is a framework for learning AI policies that are robust to such multi-agent sim-to-real gaps.
In particular, ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15% in complex spatiotemporal simulations.
arXiv Detail & Related papers (2021-06-10T04:32:20Z)
- Learning Latent Representations to Influence Multi-Agent Interaction [65.44092264843538]
We propose a reinforcement learning-based framework for learning latent representations of an agent's policy.
We show that our approach outperforms the alternatives and learns to influence the other agent.
arXiv Detail & Related papers (2020-11-12T19:04:26Z)