Federated Learning with Layer Skipping: Efficient Training of Large Language Models for Healthcare NLP
- URL: http://arxiv.org/abs/2504.10536v1
- Date: Sun, 13 Apr 2025 07:27:56 GMT
- Title: Federated Learning with Layer Skipping: Efficient Training of Large Language Models for Healthcare NLP
- Authors: Lihong Zhang, Yue Li
- Abstract summary: Federated learning (FL) enables collaborative model training across organizations without sharing raw data. We propose Layer-Skipping Federated Learning, where only selected layers of a pre-trained LLM are fine-tuned across clients while others remain frozen.
- Score: 4.744635045603924
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL) enables collaborative model training across organizations without sharing raw data, addressing crucial privacy concerns in healthcare natural language processing (NLP). However, training large language models (LLMs) in federated settings faces significant challenges, including communication overhead and data heterogeneity. We propose Layer-Skipping Federated Learning, where only selected layers of a pre-trained LLM are fine-tuned across clients while others remain frozen. Applied to LLaMA 3.2-1B, our approach reduces communication costs by approximately 70% while maintaining performance within 2% of centralized training. We evaluate our method on clinical NER and classification tasks using i2b2 and MIMIC-III datasets. Our experiments demonstrate that Layer-Skipping FL outperforms competitive baselines, handles non-IID clinical data distributions effectively, and shows robustness when combined with differential privacy. This approach represents a practical solution for privacy-preserving collaborative learning in healthcare NLP.
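The abstract states the mechanism without code; the sketch below shows one way the layer-skipping idea could look in PyTorch for a LLaMA-style checkpoint. The set of trainable layers, the `model.model.layers` attribute path, and the helper names are assumptions for illustration, not the authors' exact configuration.

```python
import torch

# Hypothetical choice of layers to fine-tune; the paper's actual
# selection strategy is not reproduced here.
TRAINABLE_LAYERS = {0, 1, 14, 15}

def prepare_client_model(model):
    """Freeze every parameter, then unfreeze only the selected transformer
    layers (assumes a LLaMA-style `model.model.layers` attribute path)."""
    for param in model.parameters():
        param.requires_grad = False
    for idx, block in enumerate(model.model.layers):
        if idx in TRAINABLE_LAYERS:
            for param in block.parameters():
                param.requires_grad = True
    return model

def trainable_state(model):
    """Collect only the unfrozen tensors; only these are uploaded,
    which is where the communication saving comes from."""
    return {name: p.detach().cpu().clone()
            for name, p in model.named_parameters() if p.requires_grad}

def fedavg(client_states, client_sizes):
    """Size-weighted FedAvg over the communicated (unfrozen) tensors only."""
    total = float(sum(client_sizes))
    return {name: torch.stack([s[name] * (n / total)
                               for s, n in zip(client_states, client_sizes)]).sum(dim=0)
            for name in client_states[0]}
```

Only the tensors returned by `trainable_state` cross the network, which is where the reported ~70% communication reduction would come from.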
Related papers
- Federated Fine-Tuning of LLMs: Framework Comparison and Research Directions [59.5243730853157]
Federated learning (FL) provides a privacy-preserving solution for fine-tuning pre-trained large language models (LLMs) using distributed private datasets.
This article conducts a comparative analysis of three advanced federated LLM (FedLLM) frameworks that integrate knowledge distillation (KD) and split learning (SL) to mitigate the communication and computation burdens of federated fine-tuning.
arXiv Detail & Related papers (2025-01-08T11:37:06Z)
- FACMIC: Federated Adaptative CLIP Model for Medical Image Classification [12.166024140377337]
We introduce a federated adaptive Contrastive Language-Image Pretraining (CLIP) model for classification tasks.
We employ a lightweight and efficient feature attention module for CLIP that selects suitable features for each client's data.
We propose a domain adaptation technique to reduce differences in data distribution between clients (a sketch of the attention module follows this entry).
arXiv Detail & Related papers (2024-10-08T13:24:10Z)
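The summary names a lightweight feature-attention module but gives no code; the sketch below shows one plausible form, a per-dimension gate over pooled CLIP image features. The 512-dimensional feature size and the gating design are illustrative assumptions, not the FACMIC authors' exact architecture.

```python
import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    """Lightweight gate that reweights CLIP feature dimensions per sample,
    letting each client emphasize the features that suit its local data.
    (Illustrative sketch; not the FACMIC authors' exact module.)"""
    def __init__(self, feat_dim=512, hidden=64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, feat_dim),
            nn.Sigmoid(),  # per-dimension weights in (0, 1)
        )

    def forward(self, clip_features):  # clip_features: (batch, feat_dim)
        return clip_features * self.gate(clip_features)

# Usage: attn = FeatureAttention(); out = attn(torch.randn(8, 512))
```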
- Efficient Continual Pre-training by Mitigating the Stability Gap [68.49269649759005]
We study the behavior of Large Language Models (LLMs) during continual pre-training.
We propose three effective strategies to enhance LLM performance within a fixed compute budget.
Our strategies improve the average medical task performance of the OpenLlama-3B model from 36.2% to 40.7% with only 40% of the original training budget.
arXiv Detail & Related papers (2024-06-21T02:28:37Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific auto-tuned learning rate scheduling converges and achieves linear speedup with respect to the number of clients (a sketch of the idea follows this entry).
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
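FedLALR's exact update rule is not reproduced in this digest; the following is a minimal sketch of the underlying idea, a per-client AMSGrad-style optimizer whose second-moment state (and hence effective step size) adapts to each client's local gradients. The hyperparameters and class name are illustrative assumptions.

```python
import torch

class ClientAMSGrad:
    """Per-client AMSGrad-style optimizer: each client keeps its own
    max second-moment estimate, so its effective learning rate adapts
    to local gradient statistics. (Illustrative; not the exact FedLALR rule.)"""
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
        self.params = list(params)
        self.lr, self.b1, self.b2, self.eps = lr, *betas, eps
        self.m = [torch.zeros_like(p) for p in self.params]
        self.v = [torch.zeros_like(p) for p in self.params]
        self.v_hat = [torch.zeros_like(p) for p in self.params]

    @torch.no_grad()
    def step(self):
        for p, m, v, vh in zip(self.params, self.m, self.v, self.v_hat):
            if p.grad is None:
                continue
            m.mul_(self.b1).add_(p.grad, alpha=1 - self.b1)
            v.mul_(self.b2).addcmul_(p.grad, p.grad, value=1 - self.b2)
            torch.maximum(vh, v, out=vh)  # AMSGrad: non-decreasing second moment
            p.add_(m / (vh.sqrt() + self.eps), alpha=-self.lr)
```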
- An In-Depth Evaluation of Federated Learning on Biomedical Natural Language Processing [7.412360079707614]
Language models (LMs) have revolutionized natural language processing (NLP).
The medical field faces challenges in training LMs due to limited data and privacy constraints.
Federated learning (FL) offers a decentralized solution that enables collaborative learning.
arXiv Detail & Related papers (2023-07-20T22:10:04Z)
- Multi-Site Clinical Federated Learning using Recursive and Attentive Models and NVFlare [13.176351544342735]
This paper develops an integrated framework that addresses data privacy and regulatory compliance challenges while maintaining elevated accuracy and substantiating the efficacy of the proposed approach.
arXiv Detail & Related papers (2023-06-28T17:00:32Z)
- FedPNN: One-shot Federated Classification via Evolving Clustering Method and Probabilistic Neural Network hybrid [4.241208172557663]
We propose a two-stage federated learning approach toward the objective of privacy protection.
In the first stage, the synthetic dataset is generated by employing two different distributions as noise.
In the second stage, the Federated Probabilistic Neural Network (FedPNN) is developed and employed to build a globally shared classification model (a sketch of the PNN component follows this entry).
arXiv Detail & Related papers (2023-04-09T03:23:37Z)
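The FedPNN construction itself is not spelled out in this digest; as a hedged illustration of its second-stage building block, the snippet below implements the classical probabilistic neural network (Parzen-window) classifier. The smoothing parameter `sigma` and the toy usage are assumptions for demonstration, not the paper's settings.

```python
import numpy as np

def pnn_predict(x, train_X, train_y, sigma=0.5):
    """Classical probabilistic neural network (Parzen-window) classifier:
    score each class by the mean Gaussian kernel response of its training
    points. (Illustrative of the PNN component; not the full FedPNN.)"""
    scores = {}
    for c in np.unique(train_y):
        diffs = train_X[train_y == c] - x                      # (n_c, d)
        kernel = np.exp(-np.sum(diffs**2, axis=1) / (2 * sigma**2))
        scores[c] = kernel.mean()
    return max(scores, key=scores.get)

# Usage with toy data:
# X = np.random.randn(100, 4); y = np.random.randint(0, 2, 100)
# label = pnn_predict(np.zeros(4), X, y)
```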
- FedDBL: Communication and Data Efficient Federated Deep-Broad Learning for Histopathological Tissue Classification [65.7405397206767]
We propose Federated Deep-Broad Learning (FedDBL) to achieve superior classification performance with limited training samples and only one-round communication.
FedDBL greatly outperforms the competitors with only one-round communication and limited training samples, while even achieving performance comparable to methods trained with multiple communication rounds.
Since no data or deep models are shared across clients, the privacy issue is well addressed and model security is guaranteed, with no risk of model inversion attacks.
arXiv Detail & Related papers (2023-02-24T14:27:41Z)
- Collaborating Heterogeneous Natural Language Processing Tasks via Federated Learning [55.99444047920231]
We conduct extensive experiments on six widely-used datasets covering both Natural Language Understanding (NLU) and Natural Language Generation (NLG) tasks.
The proposed ATC framework achieves significant improvements compared with various baseline methods.
arXiv Detail & Related papers (2022-12-12T09:27:50Z)
- Federated Contrastive Learning for Volumetric Medical Image Segmentation [16.3860181959878]
Federated learning (FL) can help in this regard by learning a shared model while keeping training data local for privacy.
Traditional FL requires fully-labeled data for training, which is inconvenient or sometimes infeasible to obtain.
In this work, we propose a federated contrastive learning (FCL) framework for volumetric medical image segmentation with limited annotations.
arXiv Detail & Related papers (2022-04-23T03:47:23Z)
- FedNLP: A Research Platform for Federated Learning in Natural Language Processing [55.01246123092445]
We present FedNLP, a research platform for federated learning in NLP.
FedNLP supports various popular task formulations in NLP such as text classification, sequence tagging, question answering, seq2seq generation, and language modeling.
Preliminary experiments with FedNLP reveal that there exists a large performance gap between learning on decentralized and centralized datasets.
arXiv Detail & Related papers (2021-04-18T11:04:49Z)