Related papers: The Frost Hollow Experiments: Pavlovian Signalling as a Path to Coordination and Communication Between Agents

URL: http://arxiv.org/abs/2203.09498v1
Date: Thu, 17 Mar 2022 17:49:45 GMT
Title: The Frost Hollow Experiments: Pavlovian Signalling as a Path to Coordination and Communication Between Agents
Authors: Patrick M. Pilarski, Andrew Butcher, Elnaz Davoodi, Michael Bradley Johanson, Dylan J. A. Brenneis, Adam S. R. Parker, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White
Abstract summary: This paper contributes a multi-faceted study into what we term Pavlovian signalling. We establish Pavlovian signalling as a natural bridge between fixed signalling paradigms and fully adaptive communication learning. Our results point to an actionable, constructivist path towards continual communication learning between reinforcement learning agents.
Score: 7.980685978549764
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learned communication between agents is a powerful tool when approaching decision-making problems that are hard to overcome by any single agent in isolation. However, continual coordination and communication learning between machine agents or human-machine partnerships remains a challenging open problem. As a stepping stone toward solving the continual communication learning problem, in this paper we contribute a multi-faceted study into what we term Pavlovian signalling -- a process by which learned, temporally extended predictions made by one agent inform decision-making by another agent with different perceptual access to their shared environment. We seek to establish how different temporal processes and representational choices impact Pavlovian signalling between learning agents. To do so, we introduce a partially observable decision-making domain we call the Frost Hollow. In this domain a prediction learning agent and a reinforcement learning agent are coupled into a two-part decision-making system that seeks to acquire sparse reward while avoiding time-conditional hazards. We evaluate two domain variations: 1) machine prediction and control learning in a linear walk, and 2) a prediction learning machine interacting with a human participant in a virtual reality environment. Our results showcase the speed of learning for Pavlovian signalling, the impact that different temporal representations do (and do not) have on agent-agent coordination, and how temporal aliasing impacts agent-agent and human-agent interactions differently. As a main contribution, we establish Pavlovian signalling as a natural bridge between fixed signalling paradigms and fully adaptive communication learning. Our results therefore point to an actionable, constructivist path towards continual communication learning between reinforcement learning agents, with potential impact in a range of real-world settings.
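The signalling process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cyclic clock, the tabular TD(0) learner, the threshold, and every name and parameter value below are assumptions chosen for illustration. One agent learns a temporally extended prediction of an upcoming hazard, the prediction is mapped onto a binary token, and a second agent that cannot see the clock acts on the token alone.

```python
# Hypothetical illustration of Pavlovian signalling. All names, parameter
# values, and the cyclic-clock environment are assumptions, not the paper's setup.

def learn_hazard_prediction(period=10, hazard_step=9, gamma=0.8,
                            step_size=0.1, sweeps=2000):
    """Tabular TD(0) estimate of the discounted sum of a future hazard cumulant."""
    values = [0.0] * period
    for _ in range(sweeps):
        for t in range(period):
            nxt = (t + 1) % period
            # The cumulant is 1 on the transition into the hazard step, else 0.
            cumulant = 1.0 if nxt == hazard_step else 0.0
            td_error = cumulant + gamma * values[nxt] - values[t]
            values[t] += step_size * td_error
    return values


def to_token(prediction, threshold=0.5):
    """Map a real-valued prediction onto a binary signal for the second agent."""
    return 1 if prediction > threshold else 0


def receiver_action(token):
    """Fixed reflex standing in for the control learner: retreat on a raised token."""
    return "retreat" if token == 1 else "advance"


values = learn_hazard_prediction()
tokens = [to_token(v) for v in values]
actions = [receiver_action(tok) for tok in tokens]
```

In this sketch the token rises several steps before the hazard step because the learned prediction grows as the hazard approaches. A larger `gamma` or a lower `threshold` would raise the token earlier in the cycle.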

Related papers

Multi-agent cooperation through learning-aware policy gradients [53.63948041506278]
Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning. We present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning. We derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.
arXiv Detail & Related papers (2024-10-24T10:48:42Z)
Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework. These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents. Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
Neural Amortized Inference for Nested Multi-agent Reasoning [54.39127942041582]
We propose a novel approach to bridge the gap between human-like inference capabilities and computational limitations. We evaluate our method in two challenging multi-agent interaction domains.
arXiv Detail & Related papers (2023-08-21T22:40:36Z)
Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making [6.704848594973921]
We study Pavlovian signalling -- a process by which learned, temporally extended predictions made by one agent inform decision-making by another agent. As a main contribution, we establish Pavlovian signalling as a natural bridge between fixed signalling paradigms and fully adaptive communication learning between two agents.
arXiv Detail & Related papers (2022-01-11T00:14:04Z)
Assessing Human Interaction in Virtual Reality With Continually Learning Prediction Agents Based on Reinforcement Learning Algorithms: A Pilot Study [6.076137037890219]
We investigate how the interaction between a human and a continually learning prediction agent develops as the agent develops competency. We develop a virtual reality environment and a time-based prediction task wherein learned predictions from a reinforcement learning (RL) algorithm augment human predictions. Our findings suggest that human trust of the system may be influenced by early interactions with the agent, and that trust in turn affects strategic behaviour.
arXiv Detail & Related papers (2021-12-14T22:46:44Z)
Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents [83.52684405389445]
We introduce the collaborative multi-object navigation task CoMON. In this task, an oracle agent has detailed environment information in the form of a map. It communicates with a navigator agent that perceives the environment visually and is tasked to find a sequence of goals. We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.
arXiv Detail & Related papers (2021-10-12T06:56:11Z)
Learning Proxemic Behavior Using Reinforcement Learning with Cognitive Agents [1.0635883951034306]
Proxemics is a branch of non-verbal communication concerned with studying the spatial behavior of people and animals. We study how agents behave in environments based on proxemic behavior.
arXiv Detail & Related papers (2021-08-08T20:45:34Z)
Learning to Communicate and Correct Pose Errors [75.03747122616605]
We study the setting proposed in V2VNet, where nearby self-driving vehicles jointly perform object detection and motion forecasting in a cooperative manner. We propose a novel neural reasoning framework that learns to communicate, to estimate potential errors, and to reach a consensus about those errors.
arXiv Detail & Related papers (2020-11-10T18:19:40Z)
End-to-End Learning and Intervention in Games [60.41921763076017]
We provide a unified framework for learning and intervention in games. We propose two approaches, respectively based on explicit and implicit differentiation. The analytical results are validated using several real-world problems.
arXiv Detail & Related papers (2020-10-26T18:39:32Z)
The Emergence of Adversarial Communication in Multi-Agent Reinforcement Learning [6.18778092044887]
Many real-world problems require the coordination of multiple autonomous agents. Recent work has shown the promise of Graph Neural Networks (GNNs) to learn explicit communication strategies that enable complex multi-agent coordination. We show how a single self-interested agent is capable of learning highly manipulative communication strategies that allow it to significantly outperform a cooperative team of agents.
arXiv Detail & Related papers (2020-08-06T12:48:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.