The Frost Hollow Experiments: Pavlovian Signalling as a Path to
Coordination and Communication Between Agents
- URL: http://arxiv.org/abs/2203.09498v1
- Date: Thu, 17 Mar 2022 17:49:45 GMT
- Title: The Frost Hollow Experiments: Pavlovian Signalling as a Path to
Coordination and Communication Between Agents
- Authors: Patrick M. Pilarski, Andrew Butcher, Elnaz Davoodi, Michael Bradley
Johanson, Dylan J. A. Brenneis, Adam S. R. Parker, Leslie Acker, Matthew M.
Botvinick, Joseph Modayil, Adam White
- Abstract summary: This paper contributes a multi-faceted study into what we term Pavlovian signalling.
We establish Pavlovian signalling as a natural bridge between fixed signalling paradigms and fully adaptive communication learning.
Our results point to an actionable, constructivist path towards continual communication learning between reinforcement learning agents.
- Score: 7.980685978549764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learned communication between agents is a powerful tool when approaching
decision-making problems that are hard to overcome by any single agent in
isolation. However, continual coordination and communication learning between
machine agents or human-machine partnerships remains a challenging open
problem. As a stepping stone toward solving the continual communication
learning problem, in this paper we contribute a multi-faceted study into what
we term Pavlovian signalling -- a process by which learned, temporally extended
predictions made by one agent inform decision-making by another agent with
different perceptual access to their shared environment. We seek to establish
how different temporal processes and representational choices impact Pavlovian
signalling between learning agents. To do so, we introduce a partially
observable decision-making domain we call the Frost Hollow. In this domain a
prediction learning agent and a reinforcement learning agent are coupled into a
two-part decision-making system that seeks to acquire sparse reward while
avoiding time-conditional hazards. We evaluate two domain variations: 1)
machine prediction and control learning in a linear walk, and 2) a prediction
learning machine interacting with a human participant in a virtual reality
environment. Our results showcase the speed of learning for Pavlovian
signalling, the impact that different temporal representations do (and do not)
have on agent-agent coordination, and how temporal aliasing impacts agent-agent
and human-agent interactions differently. As a main contribution, we establish
Pavlovian signalling as a natural bridge between fixed signalling paradigms and
fully adaptive communication learning. Our results therefore point to an
actionable, constructivist path towards continual communication learning
between reinforcement learning agents, with potential impact in a range of
real-world settings.
Related papers
- Multi-agent cooperation through learning-aware policy gradients [53.63948041506278]
Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning.
We present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning.
We derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.
arXiv Detail & Related papers (2024-10-24T10:48:42Z) - Interactive Autonomous Navigation with Internal State Inference and
Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z) - Neural Amortized Inference for Nested Multi-agent Reasoning [54.39127942041582]
We propose a novel approach to bridge the gap between human-like inference capabilities and computational limitations.
We evaluate our method in two challenging multi-agent interaction domains.
arXiv Detail & Related papers (2023-08-21T22:40:36Z) - Pavlovian Signalling with General Value Functions in Agent-Agent
Temporal Decision Making [6.704848594973921]
We study Pavlovian signalling -- a process by which learned, temporally extended predictions made by one agent inform decision-making by another agent.
As a main contribution, we establish Pavlovian signalling as a natural bridge between fixed signalling paradigms and fully adaptive communication learning between two agents.
arXiv Detail & Related papers (2022-01-11T00:14:04Z) - Assessing Human Interaction in Virtual Reality With Continually Learning
Prediction Agents Based on Reinforcement Learning Algorithms: A Pilot Study [6.076137037890219]
We investigate how the interaction between a human and a continually learning prediction agent develops as the agent develops competency.
We develop a virtual reality environment and a time-based prediction task wherein learned predictions from a reinforcement learning (RL) algorithm augment human predictions.
Our findings suggest that human trust of the system may be influenced by early interactions with the agent, and that trust in turn affects strategic behaviour.
arXiv Detail & Related papers (2021-12-14T22:46:44Z) - Interpretation of Emergent Communication in Heterogeneous Collaborative
Embodied Agents [83.52684405389445]
We introduce the collaborative multi-object navigation task CoMON.
In this task, an oracle agent has detailed environment information in the form of a map.
It communicates with a navigator agent that perceives the environment visually and is tasked to find a sequence of goals.
We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.
arXiv Detail & Related papers (2021-10-12T06:56:11Z) - Learning Proxemic Behavior Using Reinforcement Learning with Cognitive
Agents [1.0635883951034306]
Proxemics is a branch of non-verbal communication concerned with studying the spatial behavior of people and animals.
We study how agents behave in environments based on proxemic behavior.
arXiv Detail & Related papers (2021-08-08T20:45:34Z) - Learning to Communicate and Correct Pose Errors [75.03747122616605]
We study the setting proposed in V2VNet, where nearby self-driving vehicles jointly perform object detection and motion forecasting in a cooperative manner.
We propose a novel neural reasoning framework that learns to communicate, to estimate potential errors, and to reach a consensus about those errors.
arXiv Detail & Related papers (2020-11-10T18:19:40Z) - End-to-End Learning and Intervention in Games [60.41921763076017]
We provide a unified framework for learning and intervention in games.
We propose two approaches, respectively based on explicit and implicit differentiation.
The analytical results are validated using several real-world problems.
arXiv Detail & Related papers (2020-10-26T18:39:32Z) - The Emergence of Adversarial Communication in Multi-Agent Reinforcement
Learning [6.18778092044887]
Many real-world problems require the coordination of multiple autonomous agents.
Recent work has shown the promise of Graph Neural Networks (GNNs) to learn explicit communication strategies that enable complex multi-agent coordination.
We show how a single self-interested agent is capable of learning highly manipulative communication strategies that allows it to significantly outperform a cooperative team of agents.
arXiv Detail & Related papers (2020-08-06T12:48:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.