Federated Neural Bandit
- URL: http://arxiv.org/abs/2205.14309v1
- Date: Sat, 28 May 2022 02:58:37 GMT
- Title: Federated Neural Bandit
- Authors: Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan, Bryan Kian
Hsiang Low, Patrick Jaillet
- Abstract summary: In this paper, we introduce the federated neural-upper confidence bound (FN-UCB) algorithm for neural contextual bandits.
To better exploit the federated setting, we adopt a weighted combination of two UCBs: $\text{UCB}^{a}$ allows every agent to additionally use the observations from the other agents to accelerate exploration, while $\text{UCB}^{b}$ uses an NN with aggregated parameters for reward prediction.
We prove sub-linear upper bounds on both the cumulative regret and the number of communication rounds of FN-UCB, and use empirical experiments to demonstrate its competitive performance.
- Score: 46.64825970508973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent works on neural contextual bandits have achieved compelling
performance thanks to their ability to leverage the strong representation
power of neural networks (NNs) for reward prediction. Many applications of
contextual bandits involve multiple agents who collaborate without sharing
raw observations, giving rise to the setting of federated contextual bandits.
Existing works on federated contextual bandits rely on linear or kernelized
bandits, which may fall short when modeling complicated real-world reward
functions. In this regard, we introduce the federated neural-upper confidence
bound (FN-UCB) algorithm. To better exploit the federated setting, we adopt a
weighted combination of two UCBs: $\text{UCB}^{a}$ allows every agent to
additionally use the observations from the other agents to accelerate
exploration (without sharing raw observations); $\text{UCB}^{b}$ uses an NN
with aggregated parameters for reward prediction in a similar way as federated
averaging for supervised learning. Notably, the weight between the two UCBs
required by our theoretical analysis is amenable to an interesting
interpretation, which emphasizes $\text{UCB}^{a}$ initially for accelerated
exploration and relies more on $\text{UCB}^{b}$ later after enough observations
have been collected to train the NNs for accurate reward prediction (i.e.,
reliable exploitation). We prove sub-linear upper bounds on both the cumulative
regret and the number of communication rounds of FN-UCB, and use empirical
experiments to demonstrate its competitive performance.
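To make the two-UCB structure concrete, here is a minimal single-round sketch. It is an illustration only, not the paper's exact algorithm: the linear surrogate for $\text{UCB}^{a}$, the placeholder exploration bonus in $\text{UCB}^{b}$, and the weighting schedule `t0 / (t0 + t)` are all assumptions made for this example.

```python
# Illustrative sketch of an FN-UCB-style arm choice (NOT the paper's
# exact algorithm). Assumptions: UCB^a is approximated by a linear UCB
# built from sufficient statistics aggregated across agents (no raw
# observations shared); UCB^b scores arms with a federated-averaged
# network plus a placeholder bonus; the weight w = t0 / (t0 + t)
# shifts emphasis from UCB^a (early exploration) to UCB^b (later
# exploitation), mirroring the interpretation in the abstract.
import numpy as np

def ucb_a(x, A_agg, b_agg, alpha=1.0):
    """Linear UCB from aggregated statistics A = I + sum x x^T, b = sum r x."""
    A_inv = np.linalg.inv(A_agg)
    theta = A_inv @ b_agg
    return x @ theta + alpha * np.sqrt(x @ A_inv @ x)

def ucb_b(x, nn_forward, bonus=0.1):
    """Reward prediction from an NN with federated-averaged parameters,
    plus a constant exploration bonus (a stand-in assumption)."""
    return nn_forward(x) + bonus

def fn_ucb_select(contexts, A_agg, b_agg, nn_forward, t, t0=100.0):
    """Pick the arm maximizing the weighted combination of the two UCBs."""
    w = t0 / (t0 + t)  # illustrative schedule, not the paper's weight
    scores = [w * ucb_a(x, A_agg, b_agg) + (1.0 - w) * ucb_b(x, nn_forward)
              for x in contexts]
    return int(np.argmax(scores))

# Toy usage: 5 arms, 4-dim contexts, a random stand-in for FedAvg weights.
rng = np.random.default_rng(0)
d, n_arms = 4, 5
contexts = rng.normal(size=(n_arms, d))
A_agg = np.eye(d) + contexts.T @ contexts      # aggregated across agents
b_agg = contexts.T @ rng.normal(size=n_arms)
w_avg = rng.normal(size=d)
print(fn_ucb_select(contexts, A_agg, b_agg, lambda x: x @ w_avg, t=10))
```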
Related papers
- Neural Combinatorial Clustered Bandits for Recommendation Systems [12.800116749927266]
We use deep neural networks to estimate unknown reward functions.
Unlike prior neural bandit works, NeUClust uses a neural network to estimate the super arm reward and select the super arm.
NeUClust achieves better regret and reward than other contextual matrix and neural bandit algorithms.
arXiv Detail & Related papers (2024-10-18T16:37:28Z)
- Neural Exploitation and Exploration of Contextual Bandits [51.25537742455235]
We study utilizing neural networks for the exploitation and exploration of contextual multi-armed bandits.
EE-Net is a novel neural-based exploitation and exploration strategy.
We show that EE-Net outperforms related linear and neural contextual bandit baselines on real-world datasets.
arXiv Detail & Related papers (2023-05-05T18:34:49Z)
- Explicit Tradeoffs between Adversarial and Natural Distributional Robustness [48.44639585732391]
In practice, models need to enjoy both adversarial and natural distributional robustness to ensure reliability.
In this work, we show that in fact, explicit tradeoffs exist between adversarial and natural distributional robustness.
arXiv Detail & Related papers (2022-09-15T19:58:01Z)
- Deep Upper Confidence Bound Algorithm for Contextual Bandit Ranking of Information Selection [0.0]
Contextual multi-armed bandits (CMAB) have been widely used for learning to filter and prioritize information according to a user's interest.
In this work, we analyze top-K ranking under the CMAB framework where the top-K arms are chosen iteratively to maximize a reward.
We introduce a novel algorithm called the Deep Upper Confidence Bound (UCB) algorithm.
arXiv Detail & Related papers (2021-10-08T13:32:14Z)
- EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits [52.98326168071513]
"EE-Net" is a neural-based bandit approach with a novel exploration strategy.
We show that EE-Net achieves $\mathcal{O}(\sqrt{T\log T})$ regret, which is tighter than existing state-of-the-art neural bandit algorithms.
arXiv Detail & Related papers (2021-10-07T04:12:36Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- Neural Contextual Bandits with Deep Representation and Shallow Exploration [105.8099566651448]
We propose a novel learning algorithm that transforms the raw feature vector using the last hidden layer of a deep ReLU neural network.
Compared with existing neural contextual bandit algorithms, our approach is computationally much more efficient since it only needs to explore in the last layer of the deep neural network (a minimal sketch follows this entry).
arXiv Detail & Related papers (2020-12-03T09:17:55Z)
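The "deep representation + shallow exploration" idea in the entry above lends itself to a short sketch. The following is a toy illustration under simplifying assumptions, not the paper's exact algorithm: a frozen random ReLU layer stands in for the trained representation, and names like `phi` and `select` are invented for the example. LinUCB statistics are maintained only over the last-layer features.

```python
# Deep representation + shallow exploration (illustrative sketch): a
# fixed random ReLU network supplies last-layer features phi(x), and a
# LinUCB rule explores only in that linear last layer.
import numpy as np

rng = np.random.default_rng(1)
d, h = 8, 16                      # context dim, hidden width
W1 = rng.normal(size=(h, d))      # frozen hidden layer (illustrative;
                                  # in practice it would be trained)

def phi(x):
    """Deep representation: last hidden layer of a ReLU network."""
    return np.maximum(W1 @ x, 0.0)

# Shallow exploration: LinUCB sufficient statistics over phi-features only.
A = np.eye(h)                     # A = I + sum phi phi^T
b = np.zeros(h)                   # b = sum r * phi

def select(contexts, alpha=1.0):
    """Pick the arm maximizing the linear UCB score on phi-features."""
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b
    feats = [phi(x) for x in contexts]
    scores = [f @ theta + alpha * np.sqrt(f @ A_inv @ f) for f in feats]
    best = int(np.argmax(scores))
    return best, feats[best]

def update(f, r):
    """Rank-one update of the last-layer statistics after observing r."""
    global A, b
    A += np.outer(f, f)
    b += r * f

# Toy round: 4 arms, reward from a hidden linear function of phi.
theta_true = rng.normal(size=h)
xs = rng.normal(size=(4, d))
arm, f = select(xs)
update(f, f @ theta_true + 0.1 * rng.normal())
```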
- The Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms [10.662105162882526]
We investigate a $k$-armed bandit problem in the many-armed regime.
Our findings suggest a new form of "free exploration" beneficial to greedy algorithms in the many-armed context (see the toy sketch after this list).
arXiv Detail & Related papers (2020-02-24T08:59:34Z)
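As a toy illustration of the "free exploration" claim in the last entry, the sketch below runs a purely greedy rule over a small random subsample of a many-armed Bernoulli bandit. The subsample size, the optimistic initialization of unplayed arms, and the uniform arm means are assumptions made for the example, not the paper's setup.

```python
# Toy illustration of "free exploration" in the many-armed regime (not
# the paper's exact setup). With many arms, even a purely greedy rule
# over a small random subsample keeps abandoning arms after failures
# until it settles on a near-optimal one.
import numpy as np

def greedy_many_armed(k=10_000, m=100, horizon=5_000, seed=0):
    """Greedy play over a random subsample of m out of k Bernoulli arms."""
    rng = np.random.default_rng(seed)
    means = rng.uniform(size=k)                 # unknown arm means
    sub = rng.choice(k, size=m, replace=False)  # small random subsample
    succ = np.zeros(m)                          # successes per arm
    pulls = np.zeros(m)                         # pulls per arm
    total = 0.0
    for _ in range(horizon):
        # Unplayed arms get an optimistic empirical mean of 1.0, so
        # greedy hops off an arm as soon as it fails -- this is the
        # "free exploration" the many-armed regime provides.
        emp = np.where(pulls > 0, succ / np.maximum(pulls, 1), 1.0)
        i = int(np.argmax(emp))                 # purely greedy choice
        r = rng.binomial(1, means[sub[i]])
        succ[i] += r
        pulls[i] += 1
        total += r
    return horizon * means.max() - total        # regret vs. the best arm

print(greedy_many_armed())  # regret stays small relative to the horizon
```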
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.