Federated Split BERT for Heterogeneous Text Classification
- URL: http://arxiv.org/abs/2205.13299v1
- Date: Thu, 26 May 2022 12:21:57 GMT
- Title: Federated Split BERT for Heterogeneous Text Classification
- Authors: Zhengyang Li, Shijing Si, Jianzong Wang and Jing Xiao
- Abstract summary: We propose a framework, FedSplitBERT, which handles heterogeneous data and decreases the communication cost by splitting the BERT encoder layers into a local part and a global part.
Our framework is ready to use and compatible with many existing federated learning algorithms, including FedAvg, FedProx and FedAdam.
- Score: 25.388324221293203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained BERT models have achieved impressive performance in many natural
language processing (NLP) tasks. However, in many real-world situations,
textual data are usually decentralized over many clients and unable to be
uploaded to a central server due to privacy protection and regulations.
Federated learning (FL) enables multiple clients to collaboratively train a
global model while keeping their data local and private. A few studies have
investigated BERT in the federated learning setting, but the problem of performance
loss caused by heterogeneous (e.g., non-IID) data across clients remains
under-explored. To address this issue, we propose a framework, FedSplitBERT,
which handles heterogeneous data and decreases the communication cost by
splitting the BERT encoder layers into a local part and a global part. The
local-part parameters are trained by each client alone, while the global-part
parameters are trained by aggregating the gradients of multiple clients. Due to the
sheer size of BERT, we explore a quantization method to further reduce the
communication cost with minimal performance loss. Our framework is ready to use
and compatible with many existing federated learning algorithms, including
FedAvg, FedProx and FedAdam. Our experiments verify the effectiveness of the
proposed framework, which outperforms baseline methods by a significant margin,
while FedSplitBERT with quantization can reduce the communication cost by
$11.9\times$.
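To make the splitting idea concrete, here is a minimal Python sketch (not the authors' released code) of how a client's global-part parameters could be selected, aggregated FedAvg-style on the server, and compressed before upload. The cut layer, the HuggingFace-style parameter names ("encoder.layer.<i>."), and the uniform min-max quantizer are illustrative assumptions rather than the paper's exact implementation.

```python
# Minimal sketch of the FedSplitBERT idea: the first `cut_layer` encoder layers
# (and the embeddings) stay local to each client; the remaining layers and the
# task head are aggregated on the server. Parameter names follow the
# HuggingFace BertModel convention, which is an assumption for illustration.
import re
from collections import OrderedDict

import torch


def is_global_param(name: str, cut_layer: int) -> bool:
    """True if a parameter belongs to the globally aggregated part."""
    m = re.match(r"encoder\.layer\.(\d+)\.", name)
    if m:  # encoder layer i is global iff i >= cut_layer
        return int(m.group(1)) >= cut_layer
    return not name.startswith("embeddings.")  # embeddings stay local


def fedavg_global_part(client_states, client_sizes, cut_layer):
    """Weighted (FedAvg-style) average of the global-part parameters only."""
    total = float(sum(client_sizes))
    avg = OrderedDict()
    for name in client_states[0]:
        if not is_global_param(name, cut_layer):
            continue  # local-part parameters are never uploaded
        avg[name] = sum(
            (n / total) * state[name].float()
            for state, n in zip(client_states, client_sizes)
        )
    return avg


def quantize_dequantize(t: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Uniform min-max quantization as a stand-in for the paper's
    communication-compression step (the exact scheme is an assumption)."""
    lo, hi = t.min(), t.max()
    scale = (hi - lo) / (2 ** bits - 1) + 1e-12
    return torch.round((t - lo) / scale) * scale + lo
```

Under this split, only the global-part state dict ever leaves a client, and quantizing it before upload is what would drive the kind of communication savings the paper reports.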
Related papers
- Boosting Federated Learning with FedEntOpt: Mitigating Label Skew by Entropy-Based Client Selection [13.851391819710367]
Deep learning domains typically require an extensive amount of data for optimal performance.
FedEntOpt is designed to mitigate performance issues caused by label distribution skew.
It exhibits robust and superior performance in scenarios with low participation rates and client dropout.
arXiv Detail & Related papers (2024-11-02T13:31:36Z) - Personalized federated learning based on feature fusion [2.943623084019036]
Federated learning enables distributed clients to collaborate on training while storing their data locally to protect client privacy.
We propose a personalized federated learning approach called pFedPM.
In our process, we replace traditional gradient uploading with feature uploading, which helps reduce communication costs and allows for heterogeneous client models.
arXiv Detail & Related papers (2024-06-24T12:16:51Z) - Federated Learning under Partially Class-Disjoint Data via Manifold Reshaping [64.58402571292723]
We propose a manifold reshaping approach called FedMR to calibrate the feature space of local training.
We conduct extensive experiments on a range of datasets to demonstrate that our FedMR achieves much higher accuracy and better communication efficiency.
arXiv Detail & Related papers (2024-05-29T10:56:13Z) - FedLPA: One-shot Federated Learning with Layer-Wise Posterior Aggregation [7.052566906745796]
FedLPA is a layer-wise posterior aggregation method for federated learning.
We show that FedLPA significantly improves learning performance over state-of-the-art methods across several metrics.
arXiv Detail & Related papers (2023-09-30T10:51:27Z) - Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training updates.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z) - Communication Efficient Federated Learning for Multilingual Neural Machine Translation with Adapter [21.512817959760007]
Federated Multilingual Neural Machine Translation (Fed-MNMT) has emerged as a promising paradigm for institutions with limited language resources.
This approach allows multiple institutions to act as clients and train a unified model through model synchronization, rather than collecting sensitive data for centralized training.
However, as pre-trained language models (PLMs) continue to increase in size, the communication cost for transmitting parameters during synchronization has become a training speed bottleneck.
We propose a communication-efficient Fed-MNMT framework that addresses this issue by keeping PLMs frozen and only transferring lightweight adapter modules between clients.
arXiv Detail & Related papers (2023-05-21T12:48:38Z) - DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training [84.81043932706375]
We propose a novel personalized federated learning framework in a decentralized (peer-to-peer) communication protocol named Dis-PFL.
Dis-PFL employs personalized sparse masks to customize sparse local models on the edge.
We demonstrate that our method can easily adapt to heterogeneous local clients with varying computation complexities.
arXiv Detail & Related papers (2022-06-01T02:20:57Z) - FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction [48.85303253333453]
Federated learning (FL) allows multiple clients to collectively train a high-performance global model without sharing their private data.
We propose a novel federated learning algorithm with local drift decoupling and correction (FedDC).
Our FedDC only introduces lightweight modifications in the local training phase, in which each client utilizes an auxiliary local drift variable to track the gap between the local model parameter and the global model parameters.
Experiment results and analysis demonstrate that FedDC yields faster convergence and better performance on various image classification tasks.
arXiv Detail & Related papers (2022-03-22T14:06:26Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - Stochastic Coded Federated Learning with Convergence and Privacy Guarantees [8.2189389638822]
Federated learning (FL) has attracted much attention as a privacy-preserving distributed machine learning framework.
This paper proposes a stochastic coded federated learning (SCFL) framework to mitigate the straggler issue.
We characterize the privacy guarantee by the mutual information differential privacy (MI-DP) and analyze the convergence performance in federated learning.
arXiv Detail & Related papers (2022-01-25T04:43:29Z) - Scotch: An Efficient Secure Computation Framework for Secure Aggregation [0.0]
Federated learning enables multiple data owners to jointly train a machine learning model without revealing their private datasets.
A malicious aggregation server might use the model parameters to derive sensitive information about the training dataset used.
We propose Scotch, a decentralized m-party secure-computation framework for federated aggregation.
arXiv Detail & Related papers (2022-01-19T17:16:35Z) - Federated Multi-Target Domain Adaptation [99.93375364579484]
Federated learning methods enable us to train machine learning models on distributed user data while preserving its privacy.
We consider a more practical scenario where the distributed client data is unlabeled, and a centralized labeled dataset is available on the server.
We propose an effective DualAdapt method to address the new challenges.
arXiv Detail & Related papers (2021-08-17T17:53:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.