FedGTST: Boosting Global Transferability of Federated Models via Statistics Tuning
- URL: http://arxiv.org/abs/2410.13045v1
- Date: Wed, 16 Oct 2024 21:13:52 GMT
- Title: FedGTST: Boosting Global Transferability of Federated Models via Statistics Tuning
- Authors: Evelyn Ma, Chao Pan, Rasoul Etesami, Han Zhao, Olgica Milenkovic,
- Abstract summary: Federated Learning (FL) addresses issues by facilitating collaborations among clients, expanding the dataset indirectly, distributing computational costs, and preserving privacy.
We propose two enhancements to FL. First, we introduce a client-server exchange protocol that leverages cross-client Jacobian norms to boost transferability.
Second, we increase the average Jacobian norm across clients at the server, using this as a local regularizer to reduce cross-client Jacobian variance.
- Score: 26.093271475139417
- License:
- Abstract: The performance of Transfer Learning (TL) heavily relies on effective pretraining, which demands large datasets and substantial computational resources. As a result, executing TL is often challenging for individual model developers. Federated Learning (FL) addresses these issues by facilitating collaborations among clients, expanding the dataset indirectly, distributing computational costs, and preserving privacy. However, key challenges remain unresolved. First, existing FL methods tend to optimize transferability only within local domains, neglecting the global learning domain. Second, most approaches rely on indirect transferability metrics, which do not accurately reflect the final target loss or true degree of transferability. To address these gaps, we propose two enhancements to FL. First, we introduce a client-server exchange protocol that leverages cross-client Jacobian (gradient) norms to boost transferability. Second, we increase the average Jacobian norm across clients at the server, using this as a local regularizer to reduce cross-client Jacobian variance. Our transferable federated algorithm, termed FedGTST (Federated Global Transferability via Statistics Tuning), demonstrates that increasing the average Jacobian and reducing its variance allows for tighter control of the target loss. This leads to an upper bound on the target loss in terms of the source loss and source-target domain discrepancy. Extensive experiments on datasets such as MNIST to MNIST-M and CIFAR10 to SVHN show that FedGTST outperforms relevant baselines, including FedSR. On the second dataset pair, FedGTST improves accuracy by 9.8% over FedSR and 7.6% over FedIIR when LeNet is used as the backbone.
Related papers
- Achieving Linear Speedup in Asynchronous Federated Learning with
Heterogeneous Clients [30.135431295658343]
Federated learning (FL) aims to learn a common global model without exchanging or transferring the data that are stored locally at different clients.
In this paper, we propose an efficient federated learning (AFL) framework called DeFedAvg.
DeFedAvg is the first AFL algorithm that achieves the desirable linear speedup property, which indicates its high scalability.
arXiv Detail & Related papers (2024-02-17T05:22:46Z) - Risk-Aware Accelerated Wireless Federated Learning with Heterogeneous
Clients [21.104752782245257]
Wireless Federated Learning (FL) is an emerging distributed machine learning paradigm.
This paper proposes a novel risk-aware accelerated FL framework that accounts for the clients heterogeneity in the amount of possessed data.
The proposed scheme is benchmarked against a conservative scheme (i.e., only allowing trustworthy devices) and an aggressive scheme (i.e. oblivious to the trust metric)
arXiv Detail & Related papers (2024-01-17T15:15:52Z) - Efficient Cross-Domain Federated Learning by MixStyle Approximation [0.3277163122167433]
We introduce a privacy-preserving, resource-efficient Federated Learning concept for client adaptation in hardware-constrained environments.
Our approach includes server model pre-training on source data and subsequent fine-tuning on target data via low-end clients.
Preliminary results indicate that our method reduces computational and transmission costs while maintaining competitive performance on downstream tasks.
arXiv Detail & Related papers (2023-12-12T08:33:34Z) - Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z) - Momentum Benefits Non-IID Federated Learning Simply and Provably [22.800862422479913]
Federated learning is a powerful paradigm for large-scale machine learning.
FedAvg and SCAFFOLD are two prominent algorithms to address these challenges.
This paper explores the utilization of momentum to enhance the performance of FedAvg and SCAFFOLD.
arXiv Detail & Related papers (2023-06-28T18:52:27Z) - PS-FedGAN: An Efficient Federated Learning Framework Based on Partially
Shared Generative Adversarial Networks For Data Privacy [56.347786940414935]
Federated Learning (FL) has emerged as an effective learning paradigm for distributed computation.
This work proposes a novel FL framework that requires only partial GAN model sharing.
Named as PS-FedGAN, this new framework enhances the GAN releasing and training mechanism to address heterogeneous data distributions.
arXiv Detail & Related papers (2023-05-19T05:39:40Z) - FedSkip: Combatting Statistical Heterogeneity with Federated Skip
Aggregation [95.85026305874824]
We introduce a data-driven approach called FedSkip to improve the client optima by periodically skipping federated averaging and scattering local models to the cross devices.
We conduct extensive experiments on a range of datasets to demonstrate that FedSkip achieves much higher accuracy, better aggregation efficiency and competing communication efficiency.
arXiv Detail & Related papers (2022-12-14T13:57:01Z) - FedFM: Anchor-based Feature Matching for Data Heterogeneity in Federated
Learning [91.74206675452888]
We propose a novel method FedFM, which guides each client's features to match shared category-wise anchors.
To achieve higher efficiency and flexibility, we propose a FedFM variant, called FedFM-Lite, where clients communicate with server with fewer synchronization times and communication bandwidth costs.
arXiv Detail & Related papers (2022-10-14T08:11:34Z) - FedFOR: Stateless Heterogeneous Federated Learning with First-Order
Regularization [24.32029125031383]
Federated Learning (FL) seeks to distribute model training across local clients without collecting data in a centralized data-center.
We propose a first-order approximation of the global data distribution into local objectives, which intuitively penalizes updates in the opposite direction of the global update.
Our approach does not impose unrealistic limits on the client size, enabling learning from a large number of clients as is typical in most FL applications.
arXiv Detail & Related papers (2022-09-21T17:57:20Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg not only significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - Towards Accurate Knowledge Transfer via Target-awareness Representation
Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED)
TRED disentangles the relevant knowledge with respect to the target task from the original source model and used as a regularizer during fine-tuning the target model.
Experiments on various real world datasets show that our method stably improves the standard fine-tuning by more than 2% in average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.