Federated Online Sparse Decision Making
- URL: http://arxiv.org/abs/2202.13448v1
- Date: Sun, 27 Feb 2022 20:34:41 GMT
- Title: Federated Online Sparse Decision Making
- Authors: Chi-Hua Wang, Wenjie Li, Guang Cheng, and Guang Lin
- Abstract summary: \texttt{Fedego Lasso} relies on a novel multi-client teamwork-selfish bandit policy design.
Experiments demonstrate the effectiveness of the proposed algorithms on both synthetic and real-world datasets.
- Score: 24.856596181768364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel federated linear contextual bandits model, where
individual clients face different K-armed stochastic bandits with
high-dimensional decision context and coupled through common global parameters.
By leveraging the sparsity structure of the linear reward, a collaborative
algorithm named \texttt{Fedego Lasso} is proposed to cope with the
heterogeneity across clients without exchanging local decision context vectors
or raw reward data. \texttt{Fedego Lasso} relies on a novel multi-client
teamwork-selfish bandit policy design, and achieves near-optimal regrets for
shared parameter cases with logarithmic communication costs. In addition, a new
conceptual tool called federated-egocentric policies is introduced to delineate
exploration-exploitation trade-off. Experiments demonstrate the effectiveness
of the proposed algorithms on both synthetic and real-world datasets.
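The exploit step of a Lasso-based linear bandit can be sketched concretely. The snippet below is a generic single-agent illustration, not the paper's \texttt{Fedego Lasso}: it estimates a sparse reward parameter with an ISTA-style proximal-gradient Lasso solver (an assumption made here for self-containedness) and then plays the greedy arm under the estimated linear reward. All variable names and the synthetic data are illustrative.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Lasso estimate via ISTA proximal-gradient descent (illustrative solver)."""
    n, d = X.shape
    beta = np.zeros(d)
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / n)  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        # soft-thresholding: the proximal operator of the l1 penalty
        beta = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)
    return beta

def greedy_arm(contexts, beta_hat):
    """Exploit: choose the arm whose context has the highest estimated reward."""
    return int(np.argmax(contexts @ beta_hat))

# Synthetic sparse linear-reward environment.
rng = np.random.default_rng(0)
d, n = 20, 200
beta_true = np.zeros(d)
beta_true[0], beta_true[3] = 2.0, -1.5       # only 2 of 20 coordinates matter
X = rng.standard_normal((n, d))               # past decision contexts
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_hat = lasso_ista(X, y, lam=0.05)         # sparse reward estimate
contexts = rng.standard_normal((4, d))        # K = 4 arms, one context each
chosen = greedy_arm(contexts, beta_hat)
```

The Lasso shrinks the irrelevant coordinates to (near) zero, which is what makes the high-dimensional regime tractable with few samples.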
Related papers
- Online Clustering of Dueling Bandits [59.09590979404303]
We introduce the first "clustering of dueling bandit algorithms" to enable collaborative decision-making based on preference feedback.
We propose two novel algorithms: (1) Clustering of Linear Dueling Bandits (COLDB) which models the user reward functions as linear functions of the context vectors, and (2) Clustering of Neural Dueling Bandits (CONDB) which uses a neural network to model complex, non-linear user reward functions.
arXiv Detail & Related papers (2025-02-04T07:55:41Z) - Stochastic Linear Bandits with Latent Heterogeneity [8.981251210938787]
We propose a novel latent heterogeneous bandit framework that explicitly models this unobserved heterogeneity in customer responses.
Our methodology introduces an innovative algorithm that simultaneously learns latent group memberships and group-specific reward functions.
arXiv Detail & Related papers (2025-02-01T13:02:21Z) - Personalized Federated X-armed Bandit [44.7483638955294]
We study the personalized federated $\mathcal{X}$-armed bandit problem, where the heterogeneous local objectives of the clients are optimized simultaneously in the federated learning paradigm.
We propose the \texttt{PF-PNE} algorithm with a unique double elimination strategy, which safely eliminates the non-optimal regions while encouraging federated collaboration through biased but effective evaluations of the local objectives.
arXiv Detail & Related papers (2023-10-25T03:11:32Z) - Federated Learning for Heterogeneous Bandits with Unobserved Contexts [0.0]
We study the problem of federated multi-armed contextual bandits with unknown contexts.
We propose an elimination-based algorithm and prove the regret bound for linearly parametrized reward functions.
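The elimination-based approach mentioned above can be illustrated with a standard single-agent successive-elimination routine. This is a textbook sketch under assumed Bernoulli rewards, not the paper's federated, unobserved-context algorithm: each round pulls every surviving arm once and drops any arm whose upper confidence bound falls below the best lower confidence bound.

```python
import numpy as np

def successive_elimination(means, horizon=4000, delta=0.05, seed=0):
    """Generic successive elimination on Bernoulli arms (single-agent sketch)."""
    rng = np.random.default_rng(seed)
    K = len(means)
    counts = np.zeros(K)
    sums = np.zeros(K)
    active = list(range(K))
    t = 0
    while len(active) > 1 and t < horizon:
        for a in active:                      # pull every surviving arm once per round
            sums[a] += rng.binomial(1, means[a])
            counts[a] += 1
            t += 1
        mu = sums[active] / counts[active]
        # Hoeffding-style confidence radius, union-bounded over arms and rounds
        rad = np.sqrt(np.log(2 * K * horizon / delta) / (2 * counts[active]))
        best_lcb = np.max(mu - rad)
        # eliminate arms whose UCB is below the best arm's LCB
        active = [a for a, m, r in zip(active, mu, rad) if m + r >= best_lcb]
    return active

surviving = successive_elimination([0.2, 0.5, 0.8, 0.3])
```

Arms with a large gap to the best mean are eliminated after few pulls, so sampling concentrates on near-optimal arms.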
arXiv Detail & Related papers (2023-03-29T22:06:24Z) - Personalizing Federated Learning with Over-the-Air Computations [84.8089761800994]
Federated edge learning is a promising technology to deploy intelligence at the edge of wireless networks in a privacy-preserving manner.
Under such a setting, multiple clients collaboratively train a global generic model under the coordination of an edge server.
This paper presents a distributed training paradigm that employs analog over-the-air computation to address the communication bottleneck.
arXiv Detail & Related papers (2023-02-24T08:41:19Z) - Contextual bandits with concave rewards, and an application to fair ranking [108.48223948875685]
We present the first algorithm with provably vanishing regret for Contextual Bandits with Concave Rewards (CBCR)
We derive a novel reduction from the CBCR regret to the regret of a scalar-reward problem.
Motivated by fairness in recommendation, we describe a special case of CBCR with rankings and fairness-aware objectives.
arXiv Detail & Related papers (2022-10-18T16:11:55Z) - Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
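The shared-representation-plus-personal-head idea can be sketched with a small alternating least-squares simulation. This is a hypothetical illustration of the general technique, not the paper's procedure: a real federated run would keep data on-device and exchange only representation updates, whereas this sketch solves the shared step centrally for brevity. All names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, n_clients, n = 10, 2, 3, 100

# Synthetic heterogeneous data: one shared low-rank representation B_true,
# plus a client-specific head w_i for each client.
B_true = np.linalg.qr(rng.standard_normal((d, k)))[0]
clients = []
for _ in range(n_clients):
    w = rng.standard_normal(k)
    X = rng.standard_normal((n, d))
    y = X @ B_true @ w + 0.01 * rng.standard_normal(n)
    clients.append((X, y))

# Alternating minimization: (1) each client fits its personal head given the
# shared representation B; (2) B is refit jointly over all clients' data
# (centralized here only to keep the sketch short).
B = rng.standard_normal((d, k))
for _ in range(10):
    heads = [np.linalg.lstsq(X @ B, y, rcond=None)[0] for X, y in clients]
    # vec identity: X @ B @ w == kron(w.T, X) @ vec(B), with column-stacked vec
    A = np.vstack([np.kron(w[None, :], X) for (X, _), w in zip(clients, heads)])
    targets = np.concatenate([y for _, y in clients])
    B = np.linalg.lstsq(A, targets, rcond=None)[0].reshape(d, k, order="F")

# Final personalized heads against the learned shared representation.
heads = [np.linalg.lstsq(X @ B, y, rcond=None)[0] for X, y in clients]
mse = np.mean([np.mean((y - X @ B @ w) ** 2)
               for (X, y), w in zip(clients, heads)])
```

Each client's personalized predictor is the shared representation composed with its own low-dimensional head, so personalization costs only `k` parameters per client.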
arXiv Detail & Related papers (2022-06-05T01:14:46Z) - On the Convergence of Clustered Federated Learning [57.934295064030636]
In a federated learning system, the clients, e.g. mobile devices and organization participants, usually have different personal preferences or behavior patterns.
This paper proposes a novel weighted client-based clustered FL algorithm that leverages both client groups and individual clients in a unified optimization framework.
arXiv Detail & Related papers (2022-02-13T02:39:19Z) - Federated Linear Contextual Bandits [17.438169791449834]
Fed-PE is proposed to cope with the heterogeneity across clients without exchanging local feature vectors or raw data.
Experiments demonstrate the effectiveness of the proposed algorithms on both synthetic and real-world datasets.
arXiv Detail & Related papers (2021-10-27T05:18:58Z) - When and Whom to Collaborate with in a Changing Environment: A Collaborative Dynamic Bandit Solution [36.76450390135742]
Collaborative bandit algorithms utilize collaborative filtering techniques to improve sample efficiency in online interactive recommendation.
All existing collaborative bandit learning solutions impose a stationary assumption about the environment.
We develop a collaborative dynamic bandit solution to handle changing environment for recommendation.
arXiv Detail & Related papers (2021-04-14T22:15:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.