Sharing the Cost of Success: A Game for Evaluating and Learning Collaborative Multi-Agent Instruction Giving and Following Policies
- URL: http://arxiv.org/abs/2403.17497v1
- Date: Tue, 26 Mar 2024 08:58:28 GMT
- Title: Sharing the Cost of Success: A Game for Evaluating and Learning Collaborative Multi-Agent Instruction Giving and Following Policies
- Authors: Philipp Sadler, Sherzod Hakimov, David Schlangen
- Abstract summary: We propose a challenging interactive reference game that requires two players to coordinate on vision and language observations.
We show that a standard Proximal Policy Optimization (PPO) setup achieves a high success rate when bootstrapped with partner behaviors.
We find that a pairing of neural partners indeed reduces the measured joint effort when playing together repeatedly.
- Score: 19.82683688911297
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In collaborative goal-oriented settings, the participants are not only interested in achieving a successful outcome, but also implicitly negotiate the effort they put into the interaction (by adapting to each other). In this work, we propose a challenging interactive reference game that requires two players to coordinate on vision and language observations. The learning signal in this game is a score (given after playing) that takes into account the achieved goal and the players' assumed efforts during the interaction. We show that a standard Proximal Policy Optimization (PPO) setup achieves a high success rate when bootstrapped with heuristic partner behaviors that implement insights from the analysis of human-human interactions. We find that a pairing of neural partners indeed reduces the measured joint effort when playing together repeatedly. However, we observe that in comparison to a reasonable heuristic pairing there is still room for improvement -- which invites further research in the direction of cost-sharing in collaborative interactions.
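The abstract refers to a "standard Proximal Policy Optimization (PPO) setup". As a reminder of what that involves, the following is a minimal sketch of the clipped surrogate objective at the core of PPO. This is a generic illustration, not the paper's actual training code; in the game described here, the score combining task success and joint effort would be folded into the advantage estimates passed to this function.

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective from PPO.

    Takes per-sample log-probabilities under the new and old policies
    plus advantage estimates, and returns the loss to *minimize*
    (the negative of the clipped objective).
    """
    ratio = np.exp(logp_new - logp_old)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Taking the elementwise minimum makes the update pessimistic:
    # the policy gains nothing from moving the ratio outside the clip range.
    return -np.mean(np.minimum(unclipped, clipped))
```

With identical old and new policies the ratio is 1 and the loss reduces to the negative mean advantage; as the ratio drifts outside `[1 - clip_eps, 1 + clip_eps]`, the clipped term caps further improvement of the objective.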
Related papers
- A Dialogue Game for Eliciting Balanced Collaboration [64.61707514432533]
We present a two-player 2D object placement game in which the players must negotiate the goal state themselves.
We show empirically that human players exhibit a variety of role distributions, and that balanced collaboration improves task performance.
arXiv Detail & Related papers (2024-06-12T13:35:10Z)
- GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment [72.96949760114575]
We propose a novel cooperative communication framework, Goal-Oriented Mental Alignment (GOMA)
GOMA formulates verbal communication as a planning problem that minimizes the misalignment between parts of agents' mental states that are relevant to the goals.
We evaluate our approach against strong baselines in two challenging environments, Overcooked (a multiplayer game) and VirtualHome (a household simulator).
arXiv Detail & Related papers (2024-03-17T03:52:52Z)
- Aligning Individual and Collective Objectives in Multi-Agent Cooperation [18.082268221987956]
Mixed-motive cooperation is one of the most prominent challenges in multi-agent learning.
We introduce a novel optimization method named Altruistic Gradient Adjustment (AgA) that employs gradient adjustments to progressively align individual and collective objectives.
We evaluate the effectiveness of our algorithm AgA through benchmark environments for testing mixed-motive collaboration with small-scale agents.
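The AgA update itself is not spelled out in this summary. As a generic illustration of what "gradient adjustment to align individual and collective objectives" can mean, the sketch below removes the component of an individual objective's gradient that conflicts with the collective objective's gradient (in the spirit of projection methods such as PCGrad). The function name and projection rule are illustrative assumptions, not the AgA algorithm.

```python
import numpy as np

def align_gradient(g_ind, g_col):
    """Adjust an individual objective's gradient toward a collective one.

    If the two gradients conflict (negative inner product), the component
    of g_ind pointing against g_col is projected out; otherwise g_ind is
    returned unchanged. Illustrative only, not the published AgA update.
    """
    dot = float(np.dot(g_ind, g_col))
    if dot < 0.0:
        return g_ind - dot / float(np.dot(g_col, g_col)) * g_col
    return g_ind
```

After adjustment, a conflicting individual gradient becomes orthogonal to the collective gradient, so following it no longer degrades the collective objective.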
arXiv Detail & Related papers (2024-02-19T08:18:53Z)
- The Machine Psychology of Cooperation: Can GPT models operationalise prompts for altruism, cooperation, competitiveness and selfishness in economic games? [0.0]
We investigated the capability of the GPT-3.5 large language model (LLM) to operationalize natural language descriptions of cooperative, competitive, altruistic, and self-interested behavior.
We used a prompt to describe the task environment using a similar protocol to that used in experimental psychology studies with human subjects.
Our results provide evidence that LLMs can, to some extent, translate natural language descriptions of different cooperative stances into corresponding descriptions of appropriate task behaviour.
arXiv Detail & Related papers (2023-05-13T17:23:16Z)
- Incorporating Rivalry in Reinforcement Learning for a Competitive Game [65.2200847818153]
This work proposes a novel reinforcement learning mechanism based on the social impact of rivalry behavior.
Our proposed model aggregates objective and social perception mechanisms to derive a rivalry score that is used to modulate the learning of artificial agents.
arXiv Detail & Related papers (2022-08-22T14:06:06Z)
- Collusion Detection in Team-Based Multiplayer Games [57.153233321515984]
We propose a system that detects colluding behaviors in team-based multiplayer games.
The proposed method analyzes the players' social relationships paired with their in-game behavioral patterns.
We then automate the detection using Isolation Forest, an unsupervised learning technique specialized in highlighting outliers.
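Isolation Forest is available off the shelf in scikit-learn; the sketch below shows the detection pattern on hypothetical per-player features (the feature definitions here are illustrative assumptions, not the paper's actual social and behavioral features).

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Hypothetical per-player features, e.g. [share of games played with the
# same partner, win rate against that partner]. A real system would use
# richer social-relationship and in-game behavioral features.
normal = rng.normal(loc=[0.1, 0.5], scale=0.05, size=(200, 2))
colluders = np.array([[0.9, 0.95], [0.85, 0.9]])  # unusually tight, lopsided pairs
X = np.vstack([normal, colluders])

clf = IsolationForest(contamination=0.01, random_state=0).fit(X)
labels = clf.predict(X)  # -1 flags outliers, i.e. candidate colluders
```

Because Isolation Forest is unsupervised, no labeled collusion examples are needed; the `contamination` parameter sets the expected fraction of outliers, and flagged players would then be reviewed rather than sanctioned automatically.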
arXiv Detail & Related papers (2022-03-10T02:37:39Z)
- Incorporating Rivalry in Reinforcement Learning for a Competitive Game [65.2200847818153]
This study focuses on providing a novel learning mechanism based on a rivalry social impact.
Based on the concept of competitive rivalry, our analysis aims to investigate if we can change the assessment of these agents from a human perspective.
arXiv Detail & Related papers (2020-11-02T21:54:18Z)
- On Emergent Communication in Competitive Multi-Agent Teams [116.95067289206919]
We investigate whether competition for performance from an external, similar agent team could act as a social influence.
Our results show that an external competitive influence leads to improved accuracy and generalization, as well as faster emergence of communicative languages.
arXiv Detail & Related papers (2020-03-04T01:14:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.