FedAvg with Fine Tuning: Local Updates Lead to Representation Learning
- URL: http://arxiv.org/abs/2205.13692v1
- Date: Fri, 27 May 2022 00:55:24 GMT
- Title: FedAvg with Fine Tuning: Local Updates Lead to Representation Learning
- Authors: Liam Collins, Hamed Hassani, Aryan Mokhtari, Sanjay Shakkottai
- Abstract summary: The Federated Averaging (FedAvg) algorithm alternates between a few local gradient updates at client nodes and a model averaging update at the server.
We show that the generalizability of FedAvg's output stems from its power to learn the data representation common to the clients' tasks.
We also provide empirical evidence demonstrating FedAvg's representation learning ability in federated image classification with heterogeneous data.
- Score: 54.65133770989836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Federated Averaging (FedAvg) algorithm, which consists of alternating
between a few local stochastic gradient updates at client nodes, followed by a
model averaging update at the server, is perhaps the most commonly used method
in Federated Learning. Notwithstanding its simplicity, several empirical
studies have illustrated that the output model of FedAvg, after a few
fine-tuning steps, leads to a model that generalizes well to new unseen tasks.
This surprising performance of such a simple method, however, is not fully
understood from a theoretical point of view. In this paper, we formally
investigate this phenomenon in the multi-task linear representation setting. We
show that the reason behind the generalizability of FedAvg's output is its
power in learning the common data representation among the clients' tasks, by
leveraging the diversity among client data distributions via local updates. We
formally establish the iteration complexity required by the clients to prove
such a result in the setting where the underlying shared representation is a
linear map. To the best of our knowledge, this is the first such result for any
setting. We also provide empirical evidence demonstrating FedAvg's
representation learning ability in federated image classification with
heterogeneous data.
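To make the alternation concrete, the following is a minimal sketch of FedAvg in the multi-task linear representation setting the paper studies: each client's labels are generated through a shared ground-truth representation B_star and a client-specific head, clients run a few local gradient steps, and the server averages the full model. Dimensions, step counts, and the learning rate are illustrative assumptions, not the paper's experimental configuration.
```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n_clients, n_samples = 20, 3, 10, 50

# Ground truth: a shared representation B_star (d x k) and per-client heads w_i,
# so each client's labels are y = <B_star w_i, x> (noiseless for simplicity).
B_star = np.linalg.qr(rng.normal(size=(d, k)))[0]
true_heads = rng.normal(size=(n_clients, k))
data = []
for i in range(n_clients):
    X = rng.normal(size=(n_samples, d))
    data.append((X, X @ B_star @ true_heads[i]))

def local_sgd(B, w, X, y, steps=5, lr=0.01):
    """A few local gradient steps on the squared loss (1/2n)||X B w - y||^2."""
    n = len(y)
    for _ in range(steps):
        r = X @ B @ w - y                             # residuals
        B, w = B - lr * (X.T @ np.outer(r, w)) / n, w - lr * (B.T @ X.T @ r) / n
    return B, w

B, w = rng.normal(size=(d, k)), rng.normal(size=k)    # global model at the server
for _ in range(300):                                  # FedAvg rounds
    updates = [local_sgd(B, w, X, y) for X, y in data]
    B = np.mean([Bi for Bi, _ in updates], axis=0)    # model averaging at the server
    w = np.mean([wi for _, wi in updates], axis=0)

# The paper's claim: the column space of the learned B aligns with that of
# B_star, so fine-tuning only the head w on a new task generalizes well.
Q, _ = np.linalg.qr(B)
print("subspace distance:", np.linalg.norm((np.eye(d) - Q @ Q.T) @ B_star))
```
Personalization to a new task then amounts to a few fine-tuning steps on the low-dimensional head w alone, with the averaged representation B frozen.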
Related papers
- FedImpro: Measuring and Improving Client Update in Federated Learning [77.68805026788836]
Federated Learning (FL) models often experience client drift caused by heterogeneous data.
We present an alternative perspective on client drift and aim to mitigate it by generating improved local models.
arXiv Detail & Related papers (2024-02-10T18:14:57Z)
- Prototype Helps Federated Learning: Towards Faster Convergence [38.517903009319994]
Federated learning (FL) is a distributed machine learning technique in which multiple clients cooperate to train a shared model without exchanging their raw data.
In this paper, a prototype-based federated learning framework is proposed, which can achieve better inference performance with only a few changes to the last global iteration of the typical federated learning process.
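The summary does not spell out the mechanism; the usual reading of a "prototype" in federated learning is the per-class mean of feature embeddings, shared in place of (or alongside) model weights. A hypothetical sketch under that assumption, with all function names made up for illustration:
```python
import numpy as np

def client_prototypes(features, labels, n_classes):
    """Per-class mean embeddings ('prototypes') from a client's local data."""
    return {c: features[labels == c].mean(axis=0)
            for c in range(n_classes) if (labels == c).any()}

def aggregate_prototypes(client_protos, n_classes):
    """Server: average each class prototype over the clients that observed it."""
    agg = {}
    for c in range(n_classes):
        vecs = [p[c] for p in client_protos if c in p]
        if vecs:
            agg[c] = np.mean(vecs, axis=0)
    return agg

def predict(feature, global_protos):
    """Nearest-prototype inference against the aggregated global prototypes."""
    return min(global_protos, key=lambda c: np.linalg.norm(feature - global_protos[c]))
```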
arXiv Detail & Related papers (2023-03-22T04:06:29Z)
- On the effectiveness of partial variance reduction in federated learning with heterogeneous data [27.527995694042506]
We show that the diversity of the final classification layers across clients impedes the performance of the FedAvg algorithm.
Motivated by this, we propose to correct the model drift by applying variance reduction only to the final layers.
We demonstrate that this significantly outperforms existing benchmarks at a similar or lower communication cost.
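A minimal sketch of the idea as stated: apply a control-variate correction (in the style of SCAFFOLD, which the summary does not name; this pairing is an assumption) only to the final-layer parameter block, leaving the backbone update uncorrected. Parameter names are illustrative.
```python
def corrected_local_step(params, grads, c_global, c_local, lr=0.1,
                         final_keys=("classifier.weight", "classifier.bias")):
    """One local update in which variance reduction is applied only to the
    final classification layer: its gradient is shifted by the difference
    between the server and client control variates; every other parameter
    block takes a plain gradient step."""
    new_params = {}
    for name, g in grads.items():
        if name in final_keys:
            g = g - c_local[name] + c_global[name]   # drift correction
        new_params[name] = params[name] - lr * g
    return new_params
```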
arXiv Detail & Related papers (2022-12-05T11:56:35Z)
- Federated Learning with Intermediate Representation Regularization [14.01585596739954]
Federated learning (FL) enables remote clients to collaboratively train a model without exposing their private data.
Previous studies keep local training aligned with the global model by regularizing the distance between the representations the two learn.
We introduce FedIntR, which provides a more fine-grained regularization by integrating the representations of intermediate layers into the local training process.
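A hedged sketch of such a regularizer in PyTorch; the summary does not specify the distance, so MSE is used here as a placeholder, and the assumption that each model's forward pass also returns its intermediate representations is mine:
```python
import torch
import torch.nn.functional as F

def fedintr_style_loss(local_model, global_model, x, y, lam=0.1):
    """Task loss plus a penalty pulling every intermediate representation of
    the local model toward the frozen global model's representation of the
    same batch. Assumes forward() returns (logits, list_of_layer_outputs)."""
    logits, local_reps = local_model(x)
    with torch.no_grad():
        _, global_reps = global_model(x)
    reg = sum(F.mse_loss(h_loc, h_glob)
              for h_loc, h_glob in zip(local_reps, global_reps))
    return F.cross_entropy(logits, y) + lam * reg
```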
arXiv Detail & Related papers (2022-10-28T01:43:55Z)
- An Expectation-Maximization Perspective on Federated Learning [75.67515842938299]
Federated learning describes the distributed training of models across multiple clients while keeping the data private on-device.
In this work, we view the server-orchestrated federated learning process as a hierarchical latent variable model where the server provides the parameters of a prior distribution over the client-specific model parameters.
We show that with simple Gaussian priors and a hard version of the well-known Expectation-Maximization (EM) algorithm, learning in such a model corresponds to FedAvg, the most popular algorithm for the federated learning setting.
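The correspondence admits a one-line check: give each client's parameters w_i an isotropic Gaussian prior centered at the server parameter theta; the hard E-step returns point estimates w_i^t from local training, and the M-step for theta is then a least-squares problem whose minimizer is exactly the FedAvg average:
```latex
\theta^{t+1}
  = \arg\min_{\theta} \sum_{i=1}^{M} \bigl\lVert w_i^{t} - \theta \bigr\rVert_2^{2}
  = \frac{1}{M} \sum_{i=1}^{M} w_i^{t}
```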
arXiv Detail & Related papers (2021-11-19T12:58:59Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
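A sketch of the calibration step as described: approximate the feature distribution with one Gaussian per class, draw virtual representations from it, and retune the classifier on those samples instead of raw data. In the actual method the per-class statistics would come from the clients; computing them in one place here is a simplification.
```python
import numpy as np

def fit_class_gaussians(features, labels, n_classes):
    """One Gaussian (mean, covariance) per class, approximating the mixture
    over feature space from which virtual representations are drawn."""
    return {c: (features[labels == c].mean(axis=0),
                np.cov(features[labels == c], rowvar=False))
            for c in range(n_classes)}

def sample_virtual_representations(class_stats, n_per_class, rng):
    """Virtual features sampled per class; the classifier (final layer) is
    then recalibrated on these instead of on any raw client data."""
    X, y = [], []
    for c, (mu, cov) in class_stats.items():
        X.append(rng.multivariate_normal(mu, cov, size=n_per_class))
        y.append(np.full(n_per_class, c))
    return np.vstack(X), np.concatenate(y)
```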
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local updates with respect to the low-dimensional local parameters for every update of the representation.
This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
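This is the FedRep algorithm, by the same authors as the main entry (note the matching relevance score). A minimal sketch of one round of its alternating scheme in the linear setting used above: many gradient steps on each local head with the representation frozen, then an update to the representation, with only the representation averaged at the server. Step counts and the learning rate are illustrative.
```python
import numpy as np

def fedrep_round(B, heads, data, head_steps=10, rep_steps=1, lr=0.01):
    """One round: each client i fits its own head w_i with B frozen (many
    cheap low-dimensional updates), then takes a few steps on B; the server
    averages only the representations, and heads never leave the clients."""
    new_Bs = []
    for i, (X, y) in enumerate(data):
        w, n = heads[i], len(y)
        for _ in range(head_steps):                   # local head updates
            w = w - lr * (B.T @ X.T @ (X @ B @ w - y)) / n
        Bi = B
        for _ in range(rep_steps):                    # representation update
            Bi = Bi - lr * (X.T @ np.outer(X @ Bi @ w - y, w)) / n
        heads[i] = w
        new_Bs.append(Bi)
    return np.mean(new_Bs, axis=0), heads
```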
arXiv Detail & Related papers (2021-02-14T05:36:25Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework faithfully preserves the relations between samples.
By embedding samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)