Private and Byzantine-Proof Cooperative Decision-Making
        - URL: http://arxiv.org/abs/2205.14174v1
- Date: Fri, 27 May 2022 18:03:54 GMT
- Title: Private and Byzantine-Proof Cooperative Decision-Making
- Authors: Abhimanyu Dubey and Alex Pentland
- Abstract summary: The cooperative bandit problem is a multi-agent decision problem involving a group of agents that interact simultaneously with a multi-armed bandit.
In this paper, we investigate the bandit problem under two settings - (a) when the agents wish to make their communication private with respect to the action sequence, and (b) when the agents can be byzantine.
We provide upper-confidence bound algorithms that obtain optimal regret while being (a) differentially-private and (b) private.
Our decentralized algorithms require no information about the network of connectivity between agents, making them scalable to large dynamic systems.
- Score: 15.609414012418043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   The cooperative bandit problem is a multi-agent decision problem involving a
group of agents that interact simultaneously with a multi-armed bandit, while
communicating over a network with delays. The central idea in this problem is
to design algorithms that can efficiently leverage communication to obtain
improvements over acting in isolation. In this paper, we investigate the
stochastic bandit problem under two settings - (a) when the agents wish to make
their communication private with respect to the action sequence, and (b) when
the agents can be byzantine, i.e., they provide (stochastically) incorrect
information. For both these problem settings, we provide upper-confidence bound
algorithms that obtain optimal regret while being (a) differentially-private
and (b) tolerant to byzantine agents. Our decentralized algorithms require no
information about the network of connectivity between agents, making them
scalable to large dynamic systems. We test our algorithms on a competitive
benchmark of random graphs and demonstrate their superior performance with
respect to existing robust algorithms. We hope that our work serves as an
important step towards creating distributed decision-making systems that
maintain privacy.
 
      
        Related papers
        - Multi-Agent Best Arm Identification in Stochastic Linear Bandits [0.7673339435080443]
 We study the problem of collaborative best-arm identification in linear bandits under a fixed-budget scenario.<n>We propose two algorithms, MaLinBAI-Star and MaLinBAI-Gen for star networks and networks with arbitrary structure.
 arXiv  Detail & Related papers  (2024-11-20T20:09:44Z)
- Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits   with Strategic Agents [52.75161794035767]
 We introduce a class of bandit algorithms that meet the two objectives of performance incentivization and robustness simultaneously.
We show that settings where the principal has no information about the arms' performance characteristics can be handled by combining ideas from second price auctions with our algorithms.
 arXiv  Detail & Related papers  (2023-12-13T06:54:49Z)
- Byzantine-Resilient Decentralized Multi-Armed Bandits [25.499420566469098]
 We develop an algorithm that fuses an information mixing step among agents with a truncation of inconsistent and extreme values.
This framework can be used to model attackers in computer networks, instigators of offensive content into recommender systems, or manipulators of financial markets.
 arXiv  Detail & Related papers  (2023-10-11T09:09:50Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games
  to Partial Monitoring [105.13668993076801]
 A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
 arXiv  Detail & Related papers  (2023-05-01T06:46:22Z)
- On-Demand Communication for Asynchronous Multi-Agent Bandits [43.3383282058292]
 ODC is an on-demand communication protocol that tailors the communication of each pair of agents based on their empirical pull times.
ODC is efficient when the pull times of agents are highly heterogeneous, and its communication complexity depends on the empirical pull times of agents.
 arXiv  Detail & Related papers  (2023-02-15T03:32:33Z)
- Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
 We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server.
$alpha$-fraction of agents are adversarial and can report arbitrary fake information.
We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
 arXiv  Detail & Related papers  (2022-06-01T00:44:53Z)
- Secure Distributed/Federated Learning: Prediction-Privacy Trade-Off for
  Multi-Agent System [4.190359509901197]
 In the big data era, performing inference within the distributed and federated learning (DL and FL) frameworks, the central server needs to process a large amount of data.
Considering the decentralized computing topology, privacy has become a first-class concern.
We study the textitprivacy-aware server to multi-agent assignment problem subject to information processing constraints associated with each agent.
 arXiv  Detail & Related papers  (2022-04-24T19:19:20Z)
- Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
 This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
 arXiv  Detail & Related papers  (2021-12-03T19:23:48Z)
- Emergence of Theory of Mind Collaboration in Multiagent Systems [65.97255691640561]
 We propose an adaptive training algorithm to develop effective collaboration between agents with ToM.
We evaluate our algorithms with two games, where our algorithm surpasses all previous decentralized execution algorithms without modeling ToM.
 arXiv  Detail & Related papers  (2021-09-30T23:28:00Z)
- Learning to Communicate and Correct Pose Errors [75.03747122616605]
 We study the setting proposed in V2VNet, where nearby self-driving vehicles jointly perform object detection and motion forecasting in a cooperative manner.
We propose a novel neural reasoning framework that learns to communicate, to estimate potential errors, and to reach a consensus about those errors.
 arXiv  Detail & Related papers  (2020-11-10T18:19:40Z)
- End-to-End Learning and Intervention in Games [60.41921763076017]
 We provide a unified framework for learning and intervention in games.
We propose two approaches, respectively based on explicit and implicit differentiation.
The analytical results are validated using several real-world problems.
 arXiv  Detail & Related papers  (2020-10-26T18:39:32Z)
- Cooperative Multi-Agent Bandits with Heavy Tails [15.609414012418043]
 We study the heavy-tailed bandit problem in the cooperative multi-agent setting, where a group of agents interact with a common bandit problem.
Existing algorithms for the bandit in this setting utilize confidence intervals arising from an averaging-based communication protocol.
We propose textscMP-UCB, a decentralized multi-agent algorithm for the cooperative bandit that incorporates robust estimation with a message-passing protocol.
 arXiv  Detail & Related papers  (2020-08-14T08:34:32Z)
- Kernel Methods for Cooperative Multi-Agent Contextual Bandits [15.609414012418043]
 Cooperative multi-agent decision making involves a group of agents cooperatively solving learning problems while communicating over a network with delays.
We consider the kernelised contextual bandit problem, where the reward obtained by an agent is an arbitrary linear function of the contexts' images in the related kernel reproducing Hilbert space (RKHS)
We propose textscCoop- KernelUCB, an algorithm that provides near-optimal bounds on the per-agent regret.
 arXiv  Detail & Related papers  (2020-08-14T07:37:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     
           This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.