Contrastive learning-based agent modeling for deep reinforcement
learning
- URL: http://arxiv.org/abs/2401.00132v2
- Date: Thu, 18 Jan 2024 10:06:29 GMT
- Title: Contrastive learning-based agent modeling for deep reinforcement
learning
- Authors: Wenhao Ma, Yu-Cheng Chang, Jie Yang, Yu-Kai Wang, Chin-Teng Lin
- Abstract summary: Agent modeling is essential when designing adaptive policies for intelligent machine agents in multiagent systems.
We devised a Contrastive Learning-based Agent Modeling (CLAM) method that relies only on the local observations from the ego agent during training and execution.
CLAM is capable of generating consistent high-quality policy representations in real-time right from the beginning of each episode.
- Score: 31.293496061727932
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-agent systems often require agents to collaborate with or compete
against other agents with diverse goals, behaviors, or strategies. Agent
modeling is essential when designing adaptive policies for intelligent machine
agents in multiagent systems, as this is the means by which the ego agent
understands other agents' behavior and extracts their meaningful policy
representations. These representations can be used to enhance the ego agent's
adaptive policy which is trained by reinforcement learning. However, existing
agent modeling approaches typically assume the availability of local
observations from other agents (modeled agents) during training or a long
observation trajectory for policy adaptation. To remove these restrictive
assumptions and improve agent modeling performance, we devised a Contrastive
Learning-based Agent Modeling (CLAM) method that relies only on the local
observations from the ego agent during training and execution. With these
observations, CLAM is capable of generating consistent high-quality policy
representations in real-time right from the beginning of each episode. We
evaluated the efficacy of our approach in both cooperative and competitive
multi-agent environments. Our experiments demonstrate that our approach
achieves state-of-the-art performance on both cooperative and competitive tasks,
highlighting the potential of contrastive learning-based agent modeling for
enhancing reinforcement learning.
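The abstract does not state which contrastive objective or encoder CLAM uses, so the following is only a minimal sketch of one common choice, an InfoNCE-style loss, under stated assumptions. It assumes that two windows of the ego agent's local observations, collected against the same modeled-agent policy, have already been encoded into vectors and form a positive pair, while the other rows in the batch act as negatives. The function name `info_nce_loss`, the `temperature` parameter, and the random stand-in embeddings are hypothetical and are not taken from the paper.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss over a batch of paired embeddings (illustrative only).

    anchors[i] and positives[i] are assumed to be embeddings of two
    ego-observation windows gathered against the same modeled-agent policy.
    Every other row of `positives` is treated as a negative for anchors[i].
    """
    # L2-normalize so the dot product is a cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # (B, B) similarity matrix, scaled by the temperature.
    logits = (a @ p.T) / temperature

    # Subtract the row-wise max for numerical stability before the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The matching pair for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in embeddings; a real system would produce these with a learned encoder.
    anchors = rng.normal(size=(8, 16))
    positives = anchors + 0.01 * rng.normal(size=(8, 16))
    print("aligned pairs:   ", info_nce_loss(anchors, positives))
    print("mismatched pairs:", info_nce_loss(anchors, np.roll(positives, 1, axis=0)))
```

Minimizing a loss of this kind pulls together embeddings of observations gathered against the same policy and pushes apart those gathered against different policies, which is the general mechanism by which a contrastive objective can yield policy representations from ego observations alone.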
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Fact-based Agent modeling for Multi-Agent Reinforcement Learning [6.431977627644292]
A Fact-based Agent modeling (FAM) method is proposed, in which a fact-based belief inference (FBI) network models other agents in a partially observable environment based only on its local information.
We evaluate FAM on various Multiagent Particle Environment (MPE) and compare the results with several state-of-the-art MARL algorithms.
arXiv Detail & Related papers (2023-10-18T19:43:38Z) - ProAgent: Building Proactive Cooperative Agents with Large Language
Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z) - Multi-agent Actor-Critic with Time Dynamical Opponent Model [16.820873906787906]
In multi-agent reinforcement learning, multiple agents learn simultaneously while interacting with a common environment and each other.
We propose a novel Time Dynamical Opponent Model (TDOM) to encode the knowledge that the opponent policies tend to improve over time.
We show empirically that TDOM achieves superior opponent behavior prediction during test time.
arXiv Detail & Related papers (2022-04-12T07:16:15Z) - Explaining Reinforcement Learning Policies through Counterfactual
Trajectories [147.7246109100945]
A human developer must validate that an RL agent will perform well at test-time.
Our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution.
In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.
arXiv Detail & Related papers (2022-01-29T00:52:37Z) - Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z) - Learning Latent Representations to Influence Multi-Agent Interaction [65.44092264843538]
We propose a reinforcement learning-based framework for learning latent representations of an agent's policy.
We show that our approach outperforms the alternatives and learns to influence the other agent.
arXiv Detail & Related papers (2020-11-12T19:04:26Z) - Agent Modelling under Partial Observability for Deep Reinforcement
Learning [12.903487594031276]
Existing methods for agent modelling assume knowledge of the local observations and chosen actions of the modelled agents during execution.
We learn to extract representations about the modelled agents conditioned only on the local observations of the controlled agent.
The representations are used to augment the controlled agent's decision policy which is trained via deep reinforcement learning.
arXiv Detail & Related papers (2020-06-16T18:43:42Z) - Variational Autoencoders for Opponent Modeling in Multi-Agent Systems [9.405879323049659]
Multi-agent systems exhibit complex behaviors that emanate from the interactions of multiple agents in a shared environment.
In this work, we are interested in controlling one agent in a multi-agent system and successfully learning to interact with the other agents that have fixed policies.
Modeling the behavior of other agents (opponents) is essential in understanding the interactions of the agents in the system.
arXiv Detail & Related papers (2020-01-29T13:38:59Z)
- Multi-Agent Interactions Modeling with Correlated Policies [53.38338964628494]
In this paper, we cast the multi-agent interactions modeling problem into a multi-agent imitation learning framework.
We develop a Decentralized Adversarial Imitation Learning algorithm with Correlated policies (CoDAIL).
Various experiments demonstrate that CoDAIL can better regenerate complex interactions close to the demonstrators.
arXiv Detail & Related papers (2020-01-04T17:31:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.