ICQ: A Quantization Scheme for Best-Arm Identification Over
Bit-Constrained Channels
- URL: http://arxiv.org/abs/2305.00528v1
- Date: Sun, 30 Apr 2023 17:00:03 GMT
- Title: ICQ: A Quantization Scheme for Best-Arm Identification Over
Bit-Constrained Channels
- Authors: Fathima Zarin Faizal, Adway Girish, Manjesh Kumar Hanawal, Nikhil
Karamchandani
- Abstract summary: We study the problem of best-arm identification in a distributed variant of the multi-armed bandit setting.
We propose a novel quantization scheme called Inflating Confidence for Quantization (ICQ) that can be applied to existing confidence-bound based learning algorithms.
- Score: 9.173160301214805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of best-arm identification in a distributed variant of
the multi-armed bandit setting, with a central learner and multiple agents.
Each agent is associated with an arm of the bandit, generating stochastic
rewards following an unknown distribution. Further, each agent can communicate
the observed rewards with the learner over a bit-constrained channel. We
propose a novel quantization scheme called Inflating Confidence for
Quantization (ICQ) that can be applied to existing confidence-bound based
learning algorithms such as Successive Elimination. We analyze the performance
of ICQ applied to Successive Elimination and show that the overall algorithm,
named ICQ-SE, has the order-optimal sample complexity as that of the
(unquantized) SE algorithm. Moreover, it requires only an exponentially sparse
frequency of communication between the learner and the agents, thus requiring
considerably fewer bits than existing quantization schemes to successfully
identify the best arm. We validate the performance improvement offered by ICQ
with other quantization methods through numerical experiments.
Related papers
- Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Decentralised Q-Learning for Multi-Agent Markov Decision Processes with
a Satisfiability Criterion [0.0]
We propose a reinforcement learning algorithm to solve a multi-agent Markov decision process (MMDP)
The goal is to lower the time average cost of each agent to below a pre-specified agent-specific bound.
arXiv Detail & Related papers (2023-11-21T13:56:44Z) - High-rate discretely-modulated continuous-variable quantum key
distribution using quantum machine learning [4.236937886028215]
We propose a high-rate scheme for discretely-modulated continuous-variable quantum key distribution (DM CVQKD) using quantum machine learning technologies.
A low-complexity quantum k-nearest neighbor (QkNN) is designed for predicting the lossy discretely-modulated coherent states (DMCSs) at Bob's side.
Numerical simulation shows that the secret key rate of our proposed scheme is explicitly superior to the existing DM CVQKD protocols.
arXiv Detail & Related papers (2023-08-07T04:00:13Z) - Q-SHED: Distributed Optimization at the Edge via Hessian Eigenvectors
Quantization [5.404315085380945]
Newton-type (NT) methods have been advocated as enablers of robust convergence rates in DO problems.
We propose Q-SHED, an original NT algorithm for DO featuring a novel bit-allocation scheme based on incremental Hessian eigenvectors quantization.
We show that Q-SHED can reduce by up to 60% the number of communication rounds required for convergence.
arXiv Detail & Related papers (2023-05-18T10:15:03Z) - On the Complexity of Multi-Agent Decision Making: From Learning in Games
to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z) - Quantization-aware Interval Bound Propagation for Training Certifiably
Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs)
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z) - Task-Oriented Sensing, Computation, and Communication Integration for
Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-intelligent edge artificial-latency (AI) system, which jointly exploits the AI model split inference and integrated sensing and communication (ISAC)
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z) - An Efficient Algorithm for Cooperative Semi-Bandits [0.0]
We introduce Coop-FTPL, a cooperative version of the well-known Follow The Perturbed Leader algorithm.
We show that the expected regret of our algorithm after T time steps is of order QT log(k)(k$alpha$ 1 /Q + m), where Q is the total activation probability mass.
arXiv Detail & Related papers (2020-10-05T07:08:26Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep
Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
arXiv Detail & Related papers (2020-07-09T17:08:44Z) - Task-Based Information Compression for Multi-Agent Communication
Problems with Channel Rate Constraints [28.727611928919725]
We introduce the state-aggregation for information compression algorithm (SAIC) to solve the formulated TBIC problem.
It is shown that SAIC is able to achieve near-optimal performance in terms of the achieved sum of discounted rewards.
arXiv Detail & Related papers (2020-05-28T18:29:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.