Targeted Data Acquisition for Evolving Negotiation Agents
- URL: http://arxiv.org/abs/2106.07728v2
- Date: Wed, 16 Jun 2021 17:49:13 GMT
- Title: Targeted Data Acquisition for Evolving Negotiation Agents
- Authors: Minae Kwon, Siddharth Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh
- Abstract summary: Successful negotiators must learn how to balance optimizing for self-interest and cooperation.
Current artificial negotiation agents often heavily depend on the quality of the static datasets they were trained on.
We introduce a targeted data acquisition framework where we guide the exploration of a reinforcement learning agent.
- Score: 6.953246373478702
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Successful negotiators must learn how to balance optimizing for self-interest
and cooperation. Yet current artificial negotiation agents often heavily depend
on the quality of the static datasets they were trained on, limiting their
capacity to fashion an adaptive response balancing self-interest and
cooperation. For this reason, we find that these agents can achieve either high
utility or cooperation, but not both. To address this, we introduce a targeted
data acquisition framework where we guide the exploration of a reinforcement
learning agent using annotations from an expert oracle. The guided exploration
incentivizes the learning agent to go beyond its static dataset and develop new
negotiation strategies. We show that this enables our agents to obtain
higher-reward and more Pareto-optimal solutions when negotiating with both
simulated and human partners compared to standard supervised learning and
reinforcement learning methods. This trend additionally holds when comparing
agents using our targeted data acquisition framework to variants of agents
trained with a mix of supervised learning and reinforcement learning, or to
agents using tailored reward functions that explicitly optimize for utility and
Pareto-optimality.
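In outline, the framework is an active-learning loop wrapped around policy learning: the agent negotiates, the dialogues its static dataset covers worst are sent to the expert oracle for annotation, and the policy is updated on the growing dataset. The sketch below is a minimal illustration of such a loop, not the authors' implementation; the novelty heuristic, the annotation budget, and all names (`rollout`, `oracle`, `update_policy`) are assumptions for exposition.

```python
from typing import Callable, List, Tuple

# Toy representation: a dialogue is a sequence of (state, action) pairs.
Dialogue = List[Tuple[str, str]]

def novelty(dialogue: Dialogue, dataset: List[Dialogue]) -> float:
    """Heuristic acquisition score: the fraction of states in this dialogue
    that the current dataset has never seen. (The real selection criterion
    is a design choice; this is only a stand-in.)"""
    seen = {s for d in dataset for s, _ in d}
    return sum(s not in seen for s, _ in dialogue) / max(len(dialogue), 1)

def targeted_acquisition_loop(
    rollout: Callable[[], Dialogue],         # current policy negotiating (e.g., self-play)
    oracle: Callable[[Dialogue], Dialogue],  # expert annotates / corrects a dialogue
    update_policy: Callable[[List[Dialogue]], None],
    dataset: List[Dialogue],
    rounds: int = 10,
    per_round: int = 32,
    budget: int = 4,
) -> None:
    for _ in range(rounds):
        candidates = [rollout() for _ in range(per_round)]
        # Guided exploration: spend the limited annotation budget on the
        # dialogues the static dataset covers worst.
        candidates.sort(key=lambda d: novelty(d, dataset), reverse=True)
        dataset.extend(oracle(d) for d in candidates[:budget])
        update_policy(dataset)  # e.g., a supervised step on annotations plus an RL step
```

The important design choice is the acquisition criterion: concentrating annotation effort where the current dataset offers the least guidance is what pushes the agent beyond its static data.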
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which uses step-wise rewards to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
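As a rough illustration of what a step-wise reward changes, the sketch below contrasts credit assignment from a single terminal outcome with credit assignment from per-step rewards; the function names and discounting are illustrative, not code from the paper.

```python
from typing import List

def returns_from_terminal(n_steps: int, outcome: float, gamma: float = 0.99) -> List[float]:
    """Outcome-only reward: every step's return is just the discounted final
    score, so early actions receive a weak, high-variance learning signal."""
    return [outcome * gamma ** (n_steps - 1 - t) for t in range(n_steps)]

def returns_from_stepwise(step_rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Step-wise reward: each action is scored as it happens (for instance, by
    comparing it with an expert's action at that step), giving dense credit."""
    returns, g = [0.0] * len(step_rewards), 0.0
    for t in reversed(range(len(step_rewards))):
        g = step_rewards[t] + gamma * g
        returns[t] = g
    return returns
```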
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Communication Learning in Multi-Agent Systems from Graph Modeling Perspective [62.13508281188895]
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph.
We introduce a temporal gating mechanism for each agent, enabling dynamic decisions on whether to receive shared information at a given time.
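Such a gate can be pictured as a learned scalar in [0, 1] that scales incoming messages at each timestep. The PyTorch module below is an illustrative guess at the mechanism, not the paper's architecture; all dimensions and names are assumptions.

```python
import torch
import torch.nn as nn

class TemporalGate(nn.Module):
    """From the agent's current hidden state, predict a 0-1 gate that decides
    how much of the aggregated neighbour message to absorb this timestep."""
    def __init__(self, hidden_dim: int, msg_dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())
        self.proj = nn.Linear(msg_dim, hidden_dim)

    def forward(self, h: torch.Tensor, msg: torch.Tensor) -> torch.Tensor:
        g = self.gate(h)               # (batch, 1), in [0, 1]
        return h + g * self.proj(msg)  # gated message integration
```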
arXiv Detail & Related papers (2024-11-01T05:56:51Z)
- Decentralized and Lifelong-Adaptive Multi-Agent Collaborative Learning [57.652899266553035]
Decentralized and lifelong-adaptive multi-agent collaborative learning aims to enhance collaboration among multiple agents without a central server.
We propose DeLAMA, a decentralized multi-agent lifelong collaborative learning algorithm with dynamic collaboration graphs.
arXiv Detail & Related papers (2024-03-11T09:21:11Z)
- Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents [49.85633804913796]
We present an exploration-based trajectory optimization approach, referred to as ETO.
This learning method is designed to enhance the performance of open LLM agents.
Our experiments on three complex tasks demonstrate that ETO consistently surpasses baseline performance by a large margin.
arXiv Detail & Related papers (2024-03-04T21:50:29Z)
- Contrastive learning-based agent modeling for deep reinforcement learning [31.293496061727932]
Agent modeling is essential when designing adaptive policies for intelligent machine agents in multi-agent systems.
We devised a Contrastive Learning-based Agent Modeling (CLAM) method that relies only on the local observations from the ego agent during training and execution.
CLAM can generate consistent, high-quality policy representations in real time, right from the beginning of each episode.
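One plausible reading of this setup is an InfoNCE objective over embeddings of the ego agent's local observation-action segments, where two segments from the same episode (and hence the same partner policy) form a positive pair. The loss below sketches that idea; it is an assumption for exposition, not CLAM's published objective.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Batch of segment embeddings: row i of `positive` comes from the same
    episode as row i of `anchor`; every other row serves as a negative."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature                   # (B, B) similarities
    labels = torch.arange(a.size(0), device=a.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)
```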
arXiv Detail & Related papers (2023-12-30T03:44:12Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easy to integrate into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- Expert-Free Online Transfer Learning in Multi-Agent Reinforcement Learning [2.984934409689467]
Expert-Free Online Transfer Learning (EF-OnTL) is an algorithm that enables expert-free, real-time, dynamic transfer learning in multi-agent systems.
EF-OnTL achieves overall performance comparable to advice-based baselines.
arXiv Detail & Related papers (2023-03-02T11:21:03Z)
- Behaviour-conditioned policies for cooperative reinforcement learning tasks [41.74498230885008]
In various real-world tasks, an agent needs to cooperate with unknown partner agent types.
Deep reinforcement learning models can be trained to deliver the required functionality but are known to suffer from sample inefficiency and slow learning.
We propose a method that synthetically produces populations of agents with different behavioural patterns, together with ground-truth data describing their behaviour.
We also propose an agent architecture that can efficiently use the generated data and acquire meta-learning capability.
arXiv Detail & Related papers (2021-10-04T09:16:41Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
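PEBBLE's two distinctive ingredients, per its title, are a reward model fit to pairwise human feedback and the relabeling of stored experience whenever that model changes. A minimal sketch, assuming a Bradley-Terry preference model and a dict-based replay buffer (both illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def preference_loss(r_hat: nn.Module, seg_a: torch.Tensor, seg_b: torch.Tensor,
                    pref_a: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry objective: the probability that the human prefers segment A
    is the softmax of the segments' summed predicted rewards. `seg_*` have shape
    (batch, steps, obs+act dims); `pref_a` is 1.0 where A was preferred."""
    ra = r_hat(seg_a).sum(dim=1).squeeze(-1)   # (batch,)
    rb = r_hat(seg_b).sum(dim=1).squeeze(-1)
    logits = torch.stack([ra, rb], dim=1)
    return F.cross_entropy(logits, (1.0 - pref_a).long())  # index 0 = A preferred

def relabel_replay(buffer: list, r_hat: nn.Module) -> None:
    """After each reward-model update, rewrite stored rewards so that
    off-policy learning stays consistent with the latest reward estimate."""
    with torch.no_grad():
        for transition in buffer:
            transition["reward"] = r_hat(transition["obs_act"]).item()
```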
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Agent-Centric Representations for Multi-Agent Reinforcement Learning [12.577354830985012]
We investigate whether object-centric representations are also beneficial in the fully cooperative multi-agent reinforcement learning setting.
Specifically, we study two ways of incorporating an agent-centric inductive bias into our RL algorithm.
We evaluate these approaches on the Google Research Football environment as well as DeepMind Lab 2D.
arXiv Detail & Related papers (2021-04-19T15:43:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.