Exploring Selective Layer Fine-Tuning in Federated Learning
- URL: http://arxiv.org/abs/2408.15600v2
- Date: Thu, 26 Sep 2024 10:26:18 GMT
- Title: Exploring Selective Layer Fine-Tuning in Federated Learning
- Authors: Yuchang Sun, Yuexiang Xie, Bolin Ding, Yaliang Li, Jun Zhang
- Abstract summary: Federated learning (FL) has emerged as a promising paradigm for fine-tuning foundation models using distributed data.
We study selective layer fine-tuning in FL, emphasizing a flexible approach that allows the clients to adjust their selected layers according to their local data and resources.
- Score: 48.470385357429215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) has emerged as a promising paradigm for fine-tuning foundation models using distributed data in a privacy-preserving manner. Under limited computational resources, clients often find it more practical to fine-tune a selected subset of layers, rather than the entire model, based on their task-specific data. In this study, we provide a thorough theoretical exploration of selective layer fine-tuning in FL, emphasizing a flexible approach that allows the clients to adjust their selected layers according to their local data and resources. We theoretically demonstrate that the layer selection strategy has a significant impact on model convergence in two critical aspects: the importance of selected layers and the heterogeneous choices across clients. Drawing from these insights, we further propose a strategic layer selection method that utilizes local gradients and regulates layer selections across clients. The extensive experiments on both image and text datasets demonstrate the effectiveness of the proposed strategy compared with several baselines, highlighting its advances in identifying critical layers that adapt to the client heterogeneity and training dynamics in FL.
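For intuition, below is a minimal Python sketch of gradient-norm-based layer selection with partial aggregation, in the spirit of the abstract. The function names, the norm-based ranking rule, and the uniform averaging over selecting clients are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def select_layers(local_grads, budget):
    """Rank layers by the norm of their local gradient and keep the top
    `budget` layers; the gradient norm serves as the importance proxy."""
    scores = {name: np.linalg.norm(g) for name, g in local_grads.items()}
    return sorted(scores, key=scores.get, reverse=True)[:budget]

def aggregate_updates(client_updates):
    """Average each layer's update over only the clients that selected
    (and therefore fine-tuned) that layer."""
    per_layer = {}
    for update in client_updates:
        for name, delta in update.items():
            per_layer.setdefault(name, []).append(delta)
    return {name: np.mean(ds, axis=0) for name, ds in per_layer.items()}

# Toy round: two clients with different budgets pick different layers.
rng = np.random.default_rng(0)
grads_a = {f"layer{i}": rng.normal(size=(4, 4)) for i in range(4)}
grads_b = {f"layer{i}": rng.normal(size=(4, 4)) for i in range(4)}
chosen_a = select_layers(grads_a, budget=2)
chosen_b = select_layers(grads_b, budget=1)
updates = [
    {name: -0.1 * grads_a[name] for name in chosen_a},  # local SGD step
    {name: -0.1 * grads_b[name] for name in chosen_b},
]
print(sorted(aggregate_updates(updates)))
```

Because clients may select disjoint layer sets, each layer here is averaged only over the clients that trained it, which is one simple way to handle the heterogeneous choices the abstract highlights.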
Related papers
- Entropy Guided Extrapolative Decoding to Improve Factuality in Large Language Models [55.45444773200529]
Large language models (LLMs) exhibit impressive natural language capabilities but suffer from hallucination.
Recent work has focused on decoding techniques to improve factuality during inference.
arXiv Detail & Related papers (2024-04-14T19:45:35Z)
- Accelerating Federated Learning by Selecting Beneficial Herd of Local Gradients [40.84399531998246]
Federated Learning (FL) is a distributed machine learning framework in communication network systems.
Non-Independent and Identically Distributed (Non-IID) data negatively affect the convergence efficiency of the global model.
We propose the BHerd strategy which selects a beneficial herd of local gradients to accelerate the convergence of the FL model.
arXiv Detail & Related papers (2024-03-25T09:16:59Z)
- Annotation-Efficient Polyp Segmentation via Active Learning [45.59503015577479]
We propose a deep active learning framework for annotation-efficient polyp segmentation.
In practice, we measure the uncertainty of each sample by examining the similarity between features masked by the prediction map of the polyp and the background area.
We show that our proposed method achieved state-of-the-art performance compared to other competitors on both a public dataset and a large-scale in-house dataset.
arXiv Detail & Related papers (2024-03-21T12:25:17Z)
- Towards Optimal Customized Architecture for Heterogeneous Federated Learning with Contrastive Cloud-Edge Model Decoupling [20.593232086762665]
Federated learning, as a promising distributed learning paradigm, enables collaborative training of a global model across multiple network edge clients without the need for central data collecting.
We propose a novel federated learning framework called FedCMD, a model decoupling tailored to the Cloud-edge supported federated learning.
Our motivation is that, through a deep investigation of the performance of selecting different neural network layers as the personalized head, we found that rigidly assigning the last layer as the personalized head, as in current studies, is not always optimal.
arXiv Detail & Related papers (2024-03-04T05:10:28Z)
- Addressing Membership Inference Attack in Federated Learning with Model Compression [8.842172558292027]
Federated Learning (FL) has been proposed as a privacy-preserving solution for machine learning.
Recent works have reported that FL can leak private client data through membership inference attacks.
We show that the effectiveness of these attacks negatively correlates with the size of the clients' datasets and with model complexity.
arXiv Detail & Related papers (2023-11-29T15:54:15Z)
- Learning the Right Layers: a Data-Driven Layer-Aggregation Strategy for Semi-Supervised Learning on Multilayer Graphs [2.752817022620644]
Clustering (or community detection) on multilayer graphs poses several additional complications.
One of the major challenges is to establish the extent to which each layer contributes to the cluster assignment.
We propose a parameter-free Laplacian-regularized model that learns an optimal nonlinear combination of the different layers from the available input labels.
arXiv Detail & Related papers (2023-05-31T19:50:11Z)
- Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z)
- Enhancing Prototypical Few-Shot Learning by Leveraging the Local-Level Strategy [75.63022284445945]
We find that existing works often build their few-shot models on image-level features obtained by mixing all local-level features.
We present (a) a local-agnostic training strategy to avoid the discriminative location bias between the base and novel categories, and (b) a novel local-level similarity measure to capture the accurate comparison between local-level features.
arXiv Detail & Related papers (2021-11-08T08:45:15Z)
- Deep Learning feature selection to unhide demographic recommender systems factors [63.732639864601914]
The matrix factorization model generates factors which do not incorporate semantic knowledge.
DeepUnHide is able to extract demographic information from the user and item factors in collaborative filtering recommender systems.
arXiv Detail & Related papers (2020-06-17T17:36:48Z)
- Attentive CutMix: An Enhanced Data Augmentation Approach for Deep Learning Based Image Classification [58.20132466198622]
We propose Attentive CutMix, a naturally enhanced augmentation strategy based on CutMix.
In each training iteration, we choose the most descriptive regions based on the intermediate attention maps from a feature extractor.
Our proposed method is simple yet effective, easy to implement, and can boost the baseline significantly (see the sketch after this list).
arXiv Detail & Related papers (2020-03-29T15:01:05Z)
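As an illustration of the Attentive CutMix entry above, here is a minimal Python sketch, assuming a square patch grid and a precomputed attention map; the grid size, patch count, and area-based label-mixing weight are illustrative assumptions rather than the paper's exact recipe.

```python
import numpy as np

def attentive_cutmix(src, dst, attention, n_patches=6, grid=7):
    """Paste the `n_patches` most-attended grid cells of `src` onto a copy
    of `dst`. `attention` is a (grid, grid) map from a feature extractor;
    both images are (H, W, C) arrays with H and W divisible by `grid`."""
    ph, pw = src.shape[0] // grid, src.shape[1] // grid
    top = np.argsort(attention.ravel())[-n_patches:]  # most descriptive cells
    mixed = dst.copy()
    for idx in top:
        r, c = divmod(int(idx), grid)
        mixed[r*ph:(r+1)*ph, c*pw:(c+1)*pw] = src[r*ph:(r+1)*ph, c*pw:(c+1)*pw]
    lam = n_patches / (grid * grid)  # fraction of pixels taken from src
    return mixed, lam  # mix labels as lam * y_src + (1 - lam) * y_dst

# Toy usage with random images and a random attention map.
rng = np.random.default_rng(0)
src = rng.random((224, 224, 3))
dst = rng.random((224, 224, 3))
attn = rng.random((7, 7))
mixed, lam = attentive_cutmix(src, dst, attn)
```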
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.