Multi-agent active perception with prediction rewards
- URL: http://arxiv.org/abs/2010.11835v1
- Date: Thu, 22 Oct 2020 16:10:15 GMT
- Title: Multi-agent active perception with prediction rewards
- Authors: Mikko Lauri and Frans A. Oliehoek
- Abstract summary: Multi-agent active perception is a task where a team of agents cooperatively gathers observations to compute a joint estimate of a hidden variable.
We model multi-agent active perception as a decentralized partially observable Markov decision process (Dec-POMDP) with a convex centralized prediction reward.
Our results allow application of any Dec-POMDP solution algorithm to multi-agent active perception problems, and enable planning to reduce uncertainty without explicit computation of joint estimates.
- Score: 18.780904566592852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent active perception is a task where a team of agents cooperatively
gathers observations to compute a joint estimate of a hidden variable. The task
is decentralized and the joint estimate can only be computed after the task
ends by fusing observations of all agents. The objective is to maximize the
accuracy of the estimate. The accuracy is quantified by a centralized
prediction reward determined by a centralized decision-maker who perceives the
observations gathered by all agents after the task ends. In this paper, we
model multi-agent active perception as a decentralized partially observable
Markov decision process (Dec-POMDP) with a convex centralized prediction
reward. We prove that by introducing individual prediction actions for each
agent, the problem is converted into a standard Dec-POMDP with a decentralized
prediction reward. The loss due to decentralization is bounded, and we give a
sufficient condition for when it is zero. Our results allow application of any
Dec-POMDP solution algorithm to multi-agent active perception problems, and
enable planning to reduce uncertainty without explicit computation of joint
estimates. We demonstrate the empirical usefulness of our results by applying a
standard Dec-POMDP algorithm to multi-agent active perception problems, showing
increased scalability in the planning horizon.
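To make the conversion concrete, here is a minimal Python sketch of the key idea: a convex prediction reward represented as a maximum over linear functions (alpha vectors) versus a decentralized counterpart in which each agent commits to an individual prediction action. The alpha vectors, the belief, and the averaging combination of individual picks are illustrative stand-ins, not the paper's exact construction.

```python
import numpy as np

# Convex centralized prediction reward rho(b) = max_g b @ alpha_g,
# represented by a finite set of alpha vectors (one per candidate prediction).
ALPHAS = np.array([
    [1.0, 0.0],  # payoff vector for predicting hidden state 0
    [0.0, 1.0],  # payoff vector for predicting hidden state 1
])

def centralized_reward(b: np.ndarray) -> float:
    """Centralized decision-maker fuses all observations into a joint
    belief b and then picks the best prediction."""
    return float(np.max(ALPHAS @ b))

def decentralized_reward(b: np.ndarray, picks: list) -> float:
    """Each agent picks an index into ALPHAS from local information only;
    here the joint reward averages the chosen linear pieces (a toy
    stand-in combination, not the paper's mapping)."""
    return float(np.mean([ALPHAS[g] @ b for g in picks]))

b = np.array([0.7, 0.3])                 # fused joint belief over the hidden state
print(centralized_reward(b))             # 0.7: reward of the best prediction
print(decentralized_reward(b, [0, 0]))   # 0.7: zero loss when all agents pick the maximizer
print(decentralized_reward(b, [0, 1]))   # 0.5: bounded loss under disagreement
```

The gap between the two quantities is the decentralization loss; it vanishes exactly when every agent's individual prediction selects a maximizing linear piece.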
Related papers
- Wisdom of Crowds Through Myopic Self-Confidence Adaptation [1.1060425537315088]
The wisdom of crowds is an umbrella term for phenomena suggesting that the collective judgment or decision of a large group can be more accurate than the individual judgments or decisions of the group members. In this paper, we consider a group of agents who initially possess uncorrelated and unbiased measurements of a common state of the world. Assume these agents iteratively update their estimates according to a simple non-Bayesian learning rule, commonly known in sociology as the French-DeGroot dynamics or iterative opinion pooling.
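The French-DeGroot update is simple to state; a minimal sketch follows, with an illustrative row-stochastic trust matrix.

```python
import numpy as np

# French-DeGroot / iterative opinion pooling: each agent repeatedly
# replaces its estimate with a weighted average of its neighbors'
# estimates. The trust matrix W is row-stochastic; values are illustrative.
W = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.4, 0.5],
])

x = np.array([1.0, 4.0, 7.0])  # initial unbiased, uncorrelated measurements

for _ in range(50):
    x = W @ x                  # x(t+1) = W x(t)

print(x)  # estimates converge to a common weighted average of the initial ones
```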
arXiv Detail & Related papers (2025-06-22T22:55:17Z)
- Scalable Decentralized Algorithms for Online Personalized Mean Estimation [12.002609934938224]
This study focuses on a simplified version of the overarching problem, where each agent collects samples from a real-valued distribution over time to estimate its mean.
We introduce two collaborative mean estimation algorithms: one draws inspiration from belief propagation, while the other employs a consensus-based approach.
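A hypothetical sketch of the consensus-based variant, assuming agents whose distributions share a common mean and an illustrative ring topology; deciding whom to mix with is the hard part such papers address, and this is not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Agents draw one sample per round, maintain local running means, and mix
# them with their ring neighbors via a row-stochastic consensus step.
n_agents, n_rounds = 4, 500
estimates = np.zeros(n_agents)

W = np.zeros((n_agents, n_agents))      # illustrative mixing weights
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] += 0.25
    W[i, (i + 1) % n_agents] += 0.25

for t in range(1, n_rounds + 1):
    samples = rng.normal(1.0, 1.0, size=n_agents)  # common true mean 1.0
    estimates += (samples - estimates) / t          # local running means
    estimates = W @ estimates                       # one consensus step

print(estimates)  # all four estimates concentrate around the true mean 1.0
```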
arXiv Detail & Related papers (2024-02-20T08:30:46Z)
- Decentralised Q-Learning for Multi-Agent Markov Decision Processes with a Satisfiability Criterion [0.0]
We propose a reinforcement learning algorithm to solve a multi-agent Markov decision process (MMDP). The goal is to lower the time-average cost of each agent below a pre-specified agent-specific bound.
arXiv Detail & Related papers (2023-11-21T13:56:44Z)
- Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior.
In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- PAC: Assisted Value Factorisation with Counterfactual Predictions in Multi-Agent Reinforcement Learning [43.862956745961654]
Multi-agent reinforcement learning (MARL) has witnessed significant progress with the development of value function factorization methods.
In this paper, we show that in partially observable MARL problems, an agent's ordering over its own actions could impose concurrent constraints.
We propose PAC, a new framework leveraging information generated from Counterfactual Predictions of optimal joint action selection.
arXiv Detail & Related papers (2022-06-22T23:34:30Z)
- Emergence of Theory of Mind Collaboration in Multiagent Systems [65.97255691640561]
We propose an adaptive training algorithm to develop effective collaboration between agents with theory of mind (ToM). We evaluate our algorithm on two games, where it surpasses all previous decentralized-execution algorithms that do not model ToM.
arXiv Detail & Related papers (2021-09-30T23:28:00Z)
- Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
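A minimal sketch of that setting under simple assumptions: each party exposes only its model's prediction for the test point, never its data or parameters, and an aggregator combines the predictions. Confidence-weighted averaging is an illustrative choice here, and the models and weights are toy stand-ins.

```python
import numpy as np

def model_a(x):  # pre-trained on party A's private data (stand-in)
    return 0.8 * x + 0.1

def model_b(x):  # pre-trained on party B's private data (stand-in)
    return 0.7 * x + 0.4

def model_c(x):  # pre-trained on party C's private data (stand-in)
    return 0.9 * x - 0.2

def collective_predict(x, models, weights):
    """Aggregate per-party predictions; no raw data or parameters move."""
    preds = np.array([m(x) for m in models])
    w = np.asarray(weights, dtype=float)
    return float(preds @ (w / w.sum()))

x_test = 2.0
print(collective_predict(x_test, [model_a, model_b, model_c], [1.0, 0.5, 1.5]))
```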
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
- Lightweight Distributed Gaussian Process Regression for Online Machine Learning [2.0305676256390934]
A group of agents aims to collaboratively learn a common static latent function from streaming data.
We propose a lightweight distributed Gaussian process regression (GPR) algorithm that is cognizant of agents' limited capabilities in communication, computation and memory.
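For context, a minimal single-agent GPR building block is sketched below (RBF kernel, exact posterior mean); the paper's lightweight distributed variant is not reproduced here, and the data and hyperparameters are toy values.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.sin(x_train)
noise = 1e-2

# Exact GP regression: solve (K + noise*I) alpha = y once, then the
# posterior mean at any test input is k(x*, X) @ alpha.
K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
alpha = np.linalg.solve(K, y_train)

x_test = np.array([1.5])
mean = rbf(x_test, x_train) @ alpha
print(mean)  # close to sin(1.5) ~ 0.997
```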
arXiv Detail & Related papers (2021-05-11T01:13:22Z)
- Exploiting Submodular Value Functions For Scaling Up Active Perception [60.81276437097671]
In active perception tasks, an agent aims to select sensory actions that reduce its uncertainty about one or more hidden variables.
Partially observable Markov decision processes (POMDPs) provide a natural model for such problems.
As the number of sensors available to the agent grows, the computational cost of POMDP planning grows exponentially.
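Submodularity is what makes greedy selection viable at scale: if the value of a sensor set is submodular, picking sensors one at a time by marginal gain carries the classic (1 - 1/e) approximation guarantee instead of enumerating all subsets. The coverage-style objective below is an illustrative stand-in for an uncertainty-reduction value function, and the sensor names are made up.

```python
# Greedy maximization of a submodular value function: pick k sensors one
# at a time, each time adding the sensor with the largest marginal gain.
COVERAGE = {
    "cam_1": {0, 1, 2},
    "cam_2": {2, 3},
    "radar": {3, 4, 5},
    "lidar": {0, 5},
}

def value(selected):
    """Submodular coverage value: hypotheses the selected sensors discriminate."""
    return len(set().union(*(COVERAGE[s] for s in selected))) if selected else 0

def greedy_select(k):
    selected = set()
    for _ in range(k):
        best = max(COVERAGE.keys() - selected,
                   key=lambda s: value(selected | {s}) - value(selected))
        selected.add(best)
    return selected

print(greedy_select(2))  # {'cam_1', 'radar'}: covers all six hypotheses
```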
arXiv Detail & Related papers (2020-09-21T09:11:36Z)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
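The relationship is easy to see for a log-loss prediction reward, where the expected reward equals negative entropy minus a KL gap (Gibbs' inequality), so prediction rewards lower-bound negative entropy and touch it when the prediction matches the belief. The beliefs below are toy values.

```python
import numpy as np

# For reward r(s, q) = log q(s) under belief b:
#   E[r] = sum_s b(s) log q(s) = -H(b) - KL(b || q) <= -H(b),
# with equality exactly when the prediction q equals the belief b.

def neg_entropy(b):
    b = np.asarray(b, dtype=float)
    return float(np.sum(b * np.log(b)))  # assumes strictly positive b

def expected_log_prediction_reward(b, q):
    b, q = np.asarray(b, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(b * np.log(q)))

b = np.array([0.7, 0.2, 0.1])
print(neg_entropy(b))                              # ~ -0.802
print(expected_log_prediction_reward(b, b))        # equal: q = b
print(expected_log_prediction_reward(b, [1/3]*3))  # ~ -1.099: strictly smaller
```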
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.