A Randomized Zeroth-Order Hierarchical Framework for Heterogeneous Federated Learning
- URL: http://arxiv.org/abs/2504.01839v1
- Date: Wed, 02 Apr 2025 15:44:59 GMT
- Title: A Randomized Zeroth-Order Hierarchical Framework for Heterogeneous Federated Learning
- Authors: Yuyang Qiu, Kibaek Kim, Farzad Yousefian
- Abstract summary: Heterogeneity in federated learning (FL) is a critical and challenging aspect that significantly impacts model performance and convergence. We propose a novel framework by formulating heterogeneous FL as a hierarchical optimization problem. We implement our method on image classification tasks and compare with other methods under different heterogeneous settings.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Heterogeneity in federated learning (FL) is a critical and challenging aspect that significantly impacts model performance and convergence. In this paper, we propose a novel framework by formulating heterogeneous FL as a hierarchical optimization problem. This new framework captures both the local and global training processes through a bilevel formulation and is capable of the following: (i) addressing client heterogeneity through a personalized learning framework; (ii) capturing the pre-training process on the server's side; (iii) updating the global model through nonstandard aggregation; (iv) allowing for nonidentical local steps; and (v) capturing clients' local constraints. We design and analyze an implicit zeroth-order FL method (ZO-HFL), provided with nonasymptotic convergence guarantees for both the server-agent and the individual client-agents, and asymptotic guarantees for both the server-agent and the client-agents in an almost sure sense. Notably, our method does not rely on standard assumptions in heterogeneous FL, such as the bounded gradient dissimilarity condition. We implement our method on image classification tasks and compare it with other methods under different heterogeneous settings.
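The zeroth-order, hierarchical idea above can be illustrated with a minimal sketch: each client refines a personalized copy of the global model using only function-value (derivative-free) queries of its local loss via a two-point randomized gradient estimator, possibly with nonidentical numbers of local steps, and the server aggregates the returned models. All function names, the plain averaging step, and the toy quadratic losses are illustrative assumptions, not the paper's ZO-HFL algorithm or its nonstandard aggregation.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, num_dirs=10, rng=None):
    """Two-point randomized zeroth-order gradient estimator:
    averages (f(x + mu*u) - f(x)) / mu * u over random directions u."""
    if rng is None:
        rng = np.random.default_rng(0)
    g = np.zeros_like(x)
    fx = f(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / num_dirs

def client_local_zo_steps(f_local, x_global, steps, lr, rng):
    """A client refines its own copy of the global model using only
    zeroth-order (derivative-free) queries of its local loss."""
    x = x_global.copy()
    for _ in range(steps):
        x -= lr * zo_gradient(f_local, x, rng=rng)
    return x

def server_round(x_global, client_losses, local_steps, lr, rng):
    """One communication round: clients may take nonidentical numbers of
    local steps; the server averages the returned models (simple mean
    used here as a placeholder for the paper's nonstandard aggregation)."""
    updates = [
        client_local_zo_steps(f, x_global, steps, lr, rng)
        for f, steps in zip(client_losses, local_steps)
    ]
    return np.mean(updates, axis=0)

# Toy usage: three clients with shifted quadratic losses (heterogeneous data).
rng = np.random.default_rng(0)
targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, 2.0])]
losses = [lambda x, t=t: float(np.sum((x - t) ** 2)) for t in targets]
x = np.zeros(2)
for _ in range(50):
    x = server_round(x, losses, local_steps=[1, 3, 5], lr=0.1, rng=rng)
print("global model after training:", x)
```

In a sketch like this, the smoothing radius mu and the number of sampled directions trade off bias against variance in the zeroth-order gradient estimate.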
Related papers
- Client-Centric Federated Adaptive Optimization [78.30827455292827]
Federated Learning (FL) is a distributed learning paradigm where clients collaboratively train a model while keeping their own data private. We propose Client-Centric Federated Adaptive Optimization, a class of novel federated optimization approaches.
arXiv Detail & Related papers (2025-01-17T04:00:50Z) - Towards Hyper-parameter-free Federated Learning [1.3682156035049038]
We introduce algorithms for automated scaling of global model updates.
In the first algorithm, we establish that a descent-ensuring step-size regime at the clients guarantees descent for the server objective.
The second algorithm shows that the average of the objective values of sampled clients is a practical and effective substitute for the server objective value required to compute the scaling factor.
arXiv Detail & Related papers (2024-08-30T09:35:36Z) - Pursuing Overall Welfare in Federated Learning through Sequential Decision Making [10.377683220196873]
In traditional federated learning, a single global model cannot perform equally well for all clients.
Our work reveals that existing fairness-aware aggregation strategies can be unified into an online convex optimization framework.
AAggFF achieves a better degree of client-level fairness than existing methods in both practical settings.
arXiv Detail & Related papers (2024-05-31T14:15:44Z) - Decentralized Sporadic Federated Learning: A Unified Algorithmic Framework with Convergence Guarantees [18.24213566328972]
Decentralized federated learning (DFL) captures FL settings where both (i) model updates and (ii) model aggregations are carried out by the clients without a central server. We propose $\texttt{DSpodFL}$, a DFL methodology built on a generalized notion of $\textit{sporadicity}$ in both the local gradient and aggregation processes. $\texttt{DSpodFL}$ consistently achieves improved training speeds compared with baselines under various system settings.
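As a rough, hedged illustration of the sporadicity idea (not the $\texttt{DSpodFL}$ method itself): each agent independently decides per iteration whether to take a local gradient step and whether to participate in neighbor averaging. The mixing matrix, probabilities, and quadratic toy losses below are assumptions made for the sketch.

```python
import numpy as np

def sporadic_decentralized_round(models, grads, mix_weights,
                                 p_grad=0.5, p_agg=0.5, lr=0.05, rng=None):
    """One serverless iteration where each agent sporadically (i) takes a
    local gradient step with probability p_grad and (ii) participates in
    neighbor averaging with probability p_agg. Illustrative only."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(models)
    # Sporadic local gradient steps.
    stepped = [
        m - lr * grads[i](m) if rng.random() < p_grad else m
        for i, m in enumerate(models)
    ]
    # Sporadic aggregation: agents that skip this round keep their own model.
    active = rng.random(n) < p_agg
    mixed = []
    for i in range(n):
        if active[i]:
            mixed.append(sum(mix_weights[i, j] * stepped[j] for j in range(n)))
        else:
            mixed.append(stepped[i])
    return mixed

# Tiny usage: three agents whose quadratic losses pull toward different targets.
rng = np.random.default_rng(1)
grads = [lambda x, t=t: 2 * (x - t)
         for t in (np.array([0.0]), np.array([1.0]), np.array([2.0]))]
W = np.full((3, 3), 1 / 3)          # uniform (doubly stochastic) mixing matrix
models = [np.zeros(1) for _ in range(3)]
for _ in range(200):
    models = sporadic_decentralized_round(models, grads, W, rng=rng)
print([m.round(2) for m in models])
```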
arXiv Detail & Related papers (2024-02-05T19:02:19Z) - Rethinking Client Drift in Federated Learning: A Logit Perspective [125.35844582366441]
Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection.
We find that the difference in logits between the local and global models increases as the model is continuously updated.
We propose a new algorithm, FedCSD, which performs class-prototype similarity distillation in a federated framework to align the local and global models.
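A generic way to "align the local and global models" is a temperature-scaled distillation term that pulls local predictions toward the frozen global model's predictions; the sketch below shows only that generic KL-based term, not FedCSD's class-prototype similarity distillation.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(local_logits, global_logits, T=2.0):
    """Temperature-scaled KL divergence pulling the local model's predictions
    toward the (frozen) global model's predictions. A generic distillation
    term, not FedCSD's class-prototype-similarity variant."""
    p_global = softmax(global_logits, T)
    p_local = softmax(local_logits, T)
    kl = np.sum(p_global * (np.log(p_global) - np.log(p_local)), axis=1)
    return float(np.mean(kl) * T * T)

# Toy check: identical logits give (near-)zero loss, diverging logits do not.
a = np.array([[2.0, 0.5, -1.0]])
b = np.array([[0.1, 1.5, 0.3]])
print(distillation_loss(a, a), distillation_loss(a, b))
```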
arXiv Detail & Related papers (2023-08-20T04:41:01Z) - Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z) - Provably Personalized and Robust Federated Learning [47.50663360022456]
We propose simple algorithms which identify clusters of similar clients and train a personalized model per cluster.
The convergence rates of our algorithms asymptotically match those obtained if we knew the true underlying clustering of the clients, and they are provably robust in the Byzantine setting.
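A toy sketch of the cluster-then-personalize idea (an IFCA-style alternation, not the authors' specific algorithm, and without its Byzantine-robust aggregation): each client picks the cluster model with the lowest loss on its data, and each cluster model is then updated using only its assigned clients. The least-squares losses and synthetic data are assumptions.

```python
import numpy as np

def cluster_fl_round(cluster_models, client_data, lr=0.1):
    """One round of a toy clustered-FL alternation:
    1) each client picks the cluster model with the lowest loss on its data,
    2) each cluster model is updated by averaging gradients from its clients.
    Losses are least-squares on (X, y); robust aggregation is omitted."""
    k = len(cluster_models)
    assignments, grads = [], [[] for _ in range(k)]
    for X, y in client_data:
        losses = [np.mean((X @ w - y) ** 2) for w in cluster_models]
        c = int(np.argmin(losses))
        assignments.append(c)
        grads[c].append(2 * X.T @ (X @ cluster_models[c] - y) / len(y))
    new_models = []
    for c, w in enumerate(cluster_models):
        if grads[c]:
            w = w - lr * np.mean(grads[c], axis=0)
        new_models.append(w)
    return new_models, assignments

# Toy usage: two groups of clients generated from two different linear models.
rng = np.random.default_rng(0)
true_w = [np.array([1.0, -1.0]), np.array([-2.0, 0.5])]
clients = []
for i in range(6):
    X = rng.standard_normal((20, 2))
    y = X @ true_w[i % 2] + 0.01 * rng.standard_normal(20)
    clients.append((X, y))
models = [rng.standard_normal(2) for _ in range(2)]
for _ in range(100):
    models, assign = cluster_fl_round(models, clients)
print(assign, [m.round(2) for m in models])
```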
arXiv Detail & Related papers (2023-06-14T09:37:39Z) - FedHB: Hierarchical Bayesian Federated Learning [11.936836827864095]
We propose a novel hierarchical Bayesian approach to Federated Learning (FL).
Our model reasonably describes the generative process of clients' local data via hierarchical Bayesian modeling.
We show that our block-coordinate FL algorithm converges to an optimum of the objective at the rate of $O(1/\sqrt{t})$.
arXiv Detail & Related papers (2023-05-08T18:21:41Z) - Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
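As a rough, non-federated illustration of how a fitted Gaussian mixture supports novel-sample detection (the federated EM procedure of FedGMM is not reproduced here), one can flag query points whose likelihood under the mixture falls below a small quantile of the training likelihoods; the scikit-learn usage and threshold choice below are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit a mixture to "seen" data and flag low-likelihood points as novel.
rng = np.random.default_rng(0)
seen = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 1, (200, 2))])
gmm = GaussianMixture(n_components=2, random_state=0).fit(seen)

threshold = np.quantile(gmm.score_samples(seen), 0.01)   # 1% lowest likelihood
queries = np.array([[0.0, 0.0], [5.0, 5.0], [20.0, -20.0]])
is_novel = gmm.score_samples(queries) < threshold
print(is_novel)   # the far-away point should be flagged as novel
```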
arXiv Detail & Related papers (2023-05-01T20:04:46Z) - Disentangled Federated Learning for Tackling Attributes Skew via
Invariant Aggregation and Diversity Transferring [104.19414150171472]
Attribute skew prevents current federated learning (FL) frameworks from maintaining consistent optimization directions among the clients.
We propose disentangled federated learning (DFL) to disentangle the domain-specific and cross-invariant attributes into two complementary branches.
Experiments verify that DFL facilitates FL with higher performance, better interpretability, and faster convergence rate, compared with SOTA FL methods.
arXiv Detail & Related papers (2022-06-14T13:12:12Z) - On the Convergence of Heterogeneous Federated Learning with Arbitrary
Adaptive Online Model Pruning [15.300983585090794]
We present a unifying framework for heterogeneous FL algorithms with arbitrary adaptive online model pruning.
In particular, we prove that under certain sufficient conditions, these algorithms converge to a stationary point of standard FL for general smooth cost functions.
We illuminate two key factors impacting convergence: pruning-induced noise and minimum coverage index.
arXiv Detail & Related papers (2022-01-27T20:43:38Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)