Reward Teaching for Federated Multi-armed Bandits
- URL: http://arxiv.org/abs/2305.02441v2
- Date: Mon, 20 Nov 2023 15:27:37 GMT
- Title: Reward Teaching for Federated Multi-armed Bandits
- Authors: Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang
- Abstract summary: This work focuses on clients who always maximize their individual cumulative rewards, and introduces a novel idea of ``reward teaching''.
A phased approach, called Teaching-After-Learning (TAL), is first designed to encourage and discourage clients' explorations separately.
Rigorous analyses demonstrate that when facing clients with UCB1, TWL outperforms TAL in terms of the dependencies on sub-optimality gaps.
- Score: 18.341280891539746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most of the existing federated multi-armed bandits (FMAB) designs are based
on the presumption that clients will implement the specified design to
collaborate with the server. In reality, however, it may not be possible to
modify the clients' existing protocols. To address this challenge, this work
focuses on clients who always maximize their individual cumulative rewards, and
introduces a novel idea of ``reward teaching'', where the server guides the
clients towards global optimality through implicit local reward adjustments.
Under this framework, the server faces two tightly coupled tasks of bandit
learning and target teaching, whose combination is non-trivial and challenging.
A phased approach, called Teaching-After-Learning (TAL), is first designed to
encourage and discourage clients' explorations separately. General performance
analyses of TAL are established when the clients' strategies satisfy certain
mild requirements. With novel technical approaches developed to analyze the
warm-start behaviors of bandit algorithms, particularized guarantees of TAL
with clients running UCB or epsilon-greedy strategies are then obtained. These
results demonstrate that TAL achieves logarithmic regrets while only incurring
logarithmic adjustment costs, which is order-optimal w.r.t. a natural lower
bound. As a further extension, the Teaching-While-Learning (TWL) algorithm is
developed with the idea of successive arm elimination to break the non-adaptive
phase separation in TAL. Rigorous analyses demonstrate that when facing clients
with UCB1, TWL outperforms TAL in terms of the dependencies on sub-optimality
gaps thanks to its adaptive design. Experimental results demonstrate the
effectiveness and generality of the proposed algorithms.
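To make the reward-teaching interface concrete, the following is a minimal sketch (not code from the paper) of a server steering a single self-interested UCB1 client toward a target arm by adjusting the rewards the client observes. The `UCB1Client` class, the `teach_round` helper, the fixed-penalty adjustment rule, and the assumption that the target arm is already known are all illustrative simplifications; the paper's TAL and TWL designs instead learn the target online and use phased or elimination-based adjustments.

```python
import math
import random

class UCB1Client:
    """A standard UCB1 client that maximizes its own cumulative reward."""
    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms
        self.t = 0

    def select_arm(self):
        self.t += 1
        # Play each arm once before applying the UCB1 index.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        ucb = [m + math.sqrt(2 * math.log(self.t) / c)
               for m, c in zip(self.means, self.counts)]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def teach_round(client, true_means, target_arm, penalty=1.0):
    """One interaction round: the server discourages exploration of
    non-target arms by subtracting a fixed penalty from the observed
    reward (a hypothetical adjustment rule used only for illustration).
    The adjustment cost is the magnitude of the modification."""
    arm = client.select_arm()
    reward = random.gauss(true_means[arm], 0.1)   # environment feedback
    adjustment = -penalty if arm != target_arm else 0.0
    client.update(arm, reward + adjustment)       # client sees adjusted reward
    return arm, abs(adjustment)


if __name__ == "__main__":
    random.seed(0)
    true_means = [0.4, 0.6, 0.5]   # arm 1 is assumed globally optimal here
    client = UCB1Client(len(true_means))
    total_cost = 0.0
    for _ in range(2000):
        _, cost = teach_round(client, true_means, target_arm=1)
        total_cost += cost
    print("pulls per arm:", client.counts,
          "total adjustment cost:", round(total_cost, 2))
```

In this toy setting the client keeps running plain UCB1; only its observed rewards are modified, which mirrors the implicit-adjustment viewpoint of the framework, while the accumulated adjustment magnitude plays the role of the teaching cost that the paper bounds.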
Related papers
- COPO: Consistency-Aware Policy Optimization [17.328515578426227]
Reinforcement learning has significantly enhanced the reasoning capabilities of Large Language Models (LLMs) in complex problem-solving tasks.
Recently, the introduction of DeepSeek R1 has inspired a surge of interest in leveraging rule-based rewards as a low-cost alternative for computing advantage functions and guiding policy optimization.
We propose a consistency-aware policy optimization framework that introduces a structured global reward based on outcome consistency.
arXiv Detail & Related papers (2025-08-06T07:05:18Z) - Adaptive collaboration for online personalized distributed learning with heterogeneous clients [22.507916490976044]
We study the problem of online personalized learning with $N$ statistically heterogeneous clients collaborating to accelerate local training.
An important challenge in this setting is to select relevant collaborators to reduce variance while mitigating the introduced bias.
arXiv Detail & Related papers (2025-07-09T13:44:27Z) - Exact and Linear Convergence for Federated Learning under Arbitrary Client Participation is Attainable [9.870718388000645]
This work tackles the fundamental challenges in Federated Learning (FL).
It is well-established that popular FedAvg-style algorithms struggle with exact convergence.
We present FOCUS, Federated Optimization with Exact Convergence via Push-pull Strategy, a provably convergent algorithm.
arXiv Detail & Related papers (2025-03-25T23:54:23Z) - Client-Centric Federated Adaptive Optimization [78.30827455292827]
Federated Learning (FL) is a distributed learning paradigm where clients collaboratively train a model while keeping their own data private.
We propose Client-Centric Federated Adaptive Optimization, a class of novel federated optimization approaches.
arXiv Detail & Related papers (2025-01-17T04:00:50Z) - Submodular Maximization Approaches for Equitable Client Selection in Federated Learning [4.167345675621377]
In a conventional Federated Learning framework, client selection for training typically involves the random sampling of a subset of clients in each iteration.
This paper introduces two novel methods, namely SUBTRUNC and UNIONFL, designed to address the limitations of random client selection.
arXiv Detail & Related papers (2024-08-24T22:40:31Z) - Emulating Full Client Participation: A Long-Term Client Selection Strategy for Federated Learning [48.94952630292219]
We propose a novel client selection strategy designed to emulate the performance achieved with full client participation.
In a single round, we select clients by minimizing the gradient-space estimation error between the client subset and the full client set.
In multi-round selection, we introduce a novel individual fairness constraint, which ensures that clients with similar data distributions have similar frequencies of being selected.
arXiv Detail & Related papers (2024-05-22T12:27:24Z) - FedCAda: Adaptive Client-Side Optimization for Accelerated and Stable Federated Learning [57.38427653043984]
Federated learning (FL) has emerged as a prominent approach for collaborative training of machine learning models across distributed clients.
We introduce FedCAda, an innovative federated client adaptive algorithm designed to tackle this challenge.
We demonstrate that FedCAda outperforms the state-of-the-art methods in terms of adaptability, convergence, stability, and overall performance.
arXiv Detail & Related papers (2024-05-20T06:12:33Z) - Adversarial Batch Inverse Reinforcement Learning: Learn to Reward from
Imperfect Demonstration for Interactive Recommendation [23.048841953423846]
We focus on the problem of learning to reward, which is fundamental to reinforcement learning.
Previous approaches introduce additional procedures for learning to reward, thereby increasing the complexity of optimization.
We propose a novel batch inverse reinforcement learning paradigm that achieves the desired properties.
arXiv Detail & Related papers (2023-10-30T13:43:20Z) - Personalized Federated Learning via Amortized Bayesian Meta-Learning [21.126405589760367]
We introduce a new perspective on personalized federated learning through Amortized Bayesian Meta-Learning.
Specifically, we propose a novel algorithm called FedABML, which employs hierarchical variational inference across clients.
Our theoretical analysis provides an upper bound on the average generalization error and guarantees the generalization performance on unseen data.
arXiv Detail & Related papers (2023-07-05T11:58:58Z) - Provably Personalized and Robust Federated Learning [47.50663360022456]
We propose simple algorithms which identify clusters of similar clients and train a personalized model per cluster.
The convergence rates of our algorithms asymptotically match those obtained if we knew the true underlying clustering of the clients, and are provably robust in the Byzantine setting.
arXiv Detail & Related papers (2023-06-14T09:37:39Z) - FilFL: Client Filtering for Optimized Client Participation in Federated Learning [71.46173076298957]
Federated learning enables clients to collaboratively train a model without exchanging local data.
Clients participating in the training process significantly impact the convergence rate, learning efficiency, and model generalization.
We propose a novel approach, client filtering, to improve model generalization and optimize client participation and training.
arXiv Detail & Related papers (2023-02-13T18:55:31Z) - Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated
Learning via Class-Imbalance Reduction [76.26710990597498]
We show that the class-imbalance of the grouped data from randomly selected clients can lead to significant performance degradation.
Based on our key observation, we design an efficient client sampling mechanism, i.e., Federated Class-balanced Sampling (Fed-CBS).
In particular, we propose a measure of class-imbalance and then employ homomorphic encryption to derive this measure in a privacy-preserving way.
arXiv Detail & Related papers (2022-09-30T05:42:56Z) - Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z) - Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm of this kind.
Our algorithm is provably optimal even with compressed communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z)