Where to Begin? On the Impact of Pre-Training and Initialization in
Federated Learning
- URL: http://arxiv.org/abs/2210.08090v1
- Date: Fri, 14 Oct 2022 20:25:35 GMT
- Title: Where to Begin? On the Impact of Pre-Training and Initialization in
Federated Learning
- Authors: John Nguyen, Jianyu Wang, Kshitiz Malik, Maziar Sanjabi, Michael
Rabbat
- Abstract summary: We study the impact of starting from a pre-trained model in federated learning.
Starting from a pre-trained model reduces the training time required to reach a target error rate.
- Score: 18.138078314019737
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An oft-cited challenge of federated learning is the presence of
heterogeneity. \emph{Data heterogeneity} refers to the fact that data from
different clients may follow very different distributions. \emph{System
heterogeneity} refers to the fact that client devices have different system
capabilities. A considerable number of federated optimization methods address
this challenge. In the literature, empirical evaluations usually start
federated training from random initialization. However, in many practical
applications of federated learning, the server has access to proxy data for the
training task that can be used to pre-train a model before starting federated
training. We empirically study the impact of starting from a pre-trained model
in federated learning using four standard federated learning benchmark
datasets. Unsurprisingly, starting from a pre-trained model reduces the
training time required to reach a target error rate and enables the training of
more accurate models (up to 40\%) than is possible when starting from random
initialization. Surprisingly, we also find that starting federated learning
from a pre-trained initialization reduces the effect of both data and system
heterogeneity. We recommend that future work proposing and evaluating federated
optimization methods evaluate the performance when starting from random and
pre-trained initializations. We also believe this study raises several
questions for further work on understanding the role of heterogeneity in
federated optimization.
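As an illustration of the setup studied here, below is a minimal FedAvg-style sketch in which the only difference between two runs is whether the server starts federated training from a randomly initialized model or from a model pre-trained on proxy data. The helpers `build_model`, `pretrain_on_proxy_data`, and `client_loaders` are hypothetical placeholders, not part of the paper's code.

```python
# Minimal FedAvg sketch contrasting random vs. pre-trained server initialization.
# `build_model`, `pretrain_on_proxy_data`, and `client_loaders` are hypothetical
# placeholders for a benchmark's model, server-side proxy data, and client partitions.
import copy
import random

import torch


def local_update(global_model, loader, lr=0.01, epochs=1):
    """Run a few epochs of SGD on one client's data and return the resulting weights."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()


def fedavg(global_model, client_loaders, rounds=100, clients_per_round=10):
    """Standard FedAvg: sample clients, train locally, average their weights on the server."""
    for _ in range(rounds):
        sampled = random.sample(range(len(client_loaders)), clients_per_round)
        states = [local_update(global_model, client_loaders[i]) for i in sampled]
        avg = {
            k: torch.stack([s[k].float() for s in states]).mean(0).to(states[0][k].dtype)
            for k in states[0]
        }
        global_model.load_state_dict(avg)
    return global_model


# The comparison studied in the paper: identical federated training, different starting point.
# random_start = fedavg(build_model(), client_loaders)
# pretrained_start = fedavg(pretrain_on_proxy_data(build_model()), client_loaders)
```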
Related papers
- Ranking-based Client Selection with Imitation Learning for Efficient Federated Learning [20.412469498888292]
Federated Learning (FL) enables multiple devices to collaboratively train a shared model.
The selection of participating devices in each training round critically affects both the model performance and training efficiency.
We introduce a novel device selection solution called FedRank, which is an end-to-end, ranking-based approach.
arXiv Detail & Related papers (2024-05-07T08:44:29Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling this non-IID data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computationally heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- Guiding The Last Layer in Federated Learning with Pre-Trained Models [18.382057374270143]
Federated Learning (FL) is an emerging paradigm that allows a model to be trained across a number of participants without sharing data.
We show that fitting a classification head using the Nearest Class Means (NCM) can be done exactly and orders of magnitude more efficiently than existing proposals (an illustrative sketch follows this list).
arXiv Detail & Related papers (2023-06-06T18:02:02Z)
- When Do Curricula Work in Federated Learning? [56.88941905240137]
We find that curriculum learning largely alleviates non-IIDness.
The more disparate the data distributions across clients, the more the clients benefit from curriculum learning.
We propose a novel client selection technique that benefits from the real-world disparity in the clients.
arXiv Detail & Related papers (2022-12-24T11:02:35Z)
- Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning [18.138078314019737]
We study the impact of starting from a pre-trained model in federated learning.
Starting from a pre-trained model reduces the training time required to reach a target error rate.
arXiv Detail & Related papers (2022-06-30T16:18:21Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- FedAUX: Leveraging Unlabeled Auxiliary Data in Federated Learning [14.10627556244287]
Federated Distillation (FD) is a popular novel algorithmic paradigm for Federated Learning.
We propose FedAUX, which drastically improves performance by deriving maximum utility from the unlabeled auxiliary data.
Experiments on large-scale convolutional neural networks and transformer models demonstrate that the training performance of FedAUX exceeds SOTA FL baseline methods.
arXiv Detail & Related papers (2021-02-04T09:53:53Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To keep training on this enlarged dataset tractable, we further propose a dataset distillation strategy that compresses the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
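For the NCM head mentioned in "Guiding The Last Layer in Federated Learning with Pre-Trained Models" above, the following is an illustrative sketch (not the authors' implementation), assuming a frozen pre-trained feature extractor whose per-client outputs `features` and `labels` are already available.

```python
# Illustrative NCM (Nearest Class Means) classifier head, assuming a frozen pre-trained
# feature extractor; `features` (N x D) and `labels` (N) are placeholders for one
# client's extracted features and targets.
import torch
import torch.nn.functional as F


def fit_ncm_head(features: torch.Tensor, labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Return one L2-normalized mean feature per class, shape [num_classes, feature_dim]."""
    dim = features.shape[1]
    means = torch.zeros(num_classes, dim)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():  # classes absent on this client keep a zero prototype
            means[c] = features[mask].mean(dim=0)
    return F.normalize(means, dim=1)


def ncm_predict(features: torch.Tensor, class_means: torch.Tensor) -> torch.Tensor:
    """Classify by cosine similarity to the class means; no gradient-based training needed."""
    return (F.normalize(features, dim=1) @ class_means.T).argmax(dim=1)
```

Because a class mean decomposes into a per-class feature sum and a count, a server could aggregate these statistics exactly across clients without any gradient-based training, which is consistent with the summary's claim that the head can be fit exactly and cheaply; the paper's full procedure is not reproduced here.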