Visual Transformer Meets CutMix for Improved Accuracy, Communication
Efficiency, and Data Privacy in Split Learning
- URL: http://arxiv.org/abs/2207.00234v1
- Date: Fri, 1 Jul 2022 07:00:30 GMT
- Title: Visual Transformer Meets CutMix for Improved Accuracy, Communication
Efficiency, and Data Privacy in Split Learning
- Authors: Sihun Baek, Jihong Park, Praneeth Vepakomma, Ramesh Raskar, Mehdi
Bennis, Seong-Lyun Kim
- Abstract summary: This article seeks a distributed learning solution for visual transformer (ViT) architectures.
ViTs often have larger model sizes and are computationally expensive, making federated learning (FL) ill-suited.
We propose a new form of smashed data, coined CutSmashed data, generated by randomly punching and compressing the original smashed data.
We develop a novel SL framework for ViT, coined CutMixSL, which communicates CutSmashed data.
- Score: 47.266470238551314
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This article seeks a distributed learning solution for visual
transformer (ViT) architectures. Compared to convolutional neural network (CNN)
architectures, ViTs often have larger model sizes and are computationally
expensive, making federated learning (FL) ill-suited. Split learning (SL) can
sidestep this problem by splitting a model and communicating the hidden
representations at the split layer, also known as smashed data.
However, the smashed data of a ViT are as large as, and as similar to, the
input data, negating the communication efficiency of SL while violating data
privacy. To resolve these issues, we propose a new form of smashed data,
coined CutSmashed data, generated by randomly punching and compressing the
original smashed data. Leveraging this, we develop a novel SL framework for
ViT, coined CutMixSL, which communicates CutSmashed data. CutMixSL not only
reduces communication costs and privacy leakage, but also inherently involves
the CutMix data augmentation, improving accuracy and scalability. Simulations
corroborate that CutMixSL outperforms baselines such as parallelized SL and
SplitFed, which integrates FL with SL.
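The core mechanism of the abstract — punching random patches out of the smashed data and mixing the surviving patches across clients — can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' implementation: the function names, the `keep_ratio` parameter, and the `[num_patches, dim]` layout of ViT smashed data are assumptions made here for illustration.

```python
import numpy as np

def cut_smash(smashed, keep_ratio=0.5, rng=None):
    """Randomly 'punch' smashed data (ViT patch embeddings of shape
    [num_patches, dim]): keep a random subset of patch rows and zero
    the rest. Only the kept rows would need uploading, which is where
    the communication saving comes from."""
    rng = rng or np.random.default_rng()
    num_patches = smashed.shape[0]
    kept = rng.choice(num_patches, size=int(num_patches * keep_ratio),
                      replace=False)
    mask = np.zeros(num_patches, dtype=bool)
    mask[kept] = True
    return smashed * mask[:, None], mask

def cutmix_smashed(cut_a, mask_a, cut_b):
    """CutMix-style merge on the server: fill client A's punched
    (zeroed) patch positions with client B's patches, yielding one
    mixed representation for joint training."""
    return np.where(mask_a[:, None], cut_a, cut_b)

# Two clients' smashed data: 16 patches with 8-dim embeddings each.
rng = np.random.default_rng(0)
s_a, s_b = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
cut_a, mask_a = cut_smash(s_a, keep_ratio=0.5, rng=rng)
mixed = cutmix_smashed(cut_a, mask_a, s_b)
```

Patch-wise cut-and-mix fits ViTs naturally because they operate on token sequences with positional embeddings, whereas mixing CNN feature maps this way would break spatial coherence.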
Related papers
- Privacy-Preserving Split Learning with Vision Transformers using Patch-Wise Random and Noisy CutMix [38.370923655357366]
In computer vision, the vision transformer (ViT) has increasingly superseded the convolutional neural network (CNN) for improved accuracy and robustness.
Split learning (SL) emerges as a viable solution, leveraging server-side resources to train ViTs while utilizing private data from distributed devices.
We propose a novel privacy-preserving SL framework that injects Gaussian noise into smashed data and mixes randomly chosen patches of smashed data across clients, coined DP-CutMixSL.
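The noise-and-mix step described in this entry can be illustrated with a short NumPy sketch. This is a hedged approximation of the idea, not the paper's DP-CutMixSL algorithm: the function names and the `sigma` and `mix_ratio` parameters are hypothetical, and a real Gaussian DP mechanism would calibrate `sigma` to a sensitivity bound and privacy budget.

```python
import numpy as np

def noisy_smashed(smashed, sigma=0.1, rng=None):
    """Inject Gaussian noise into smashed data before upload, in the
    spirit of a Gaussian mechanism for differential privacy."""
    rng = rng or np.random.default_rng()
    return smashed + rng.normal(0.0, sigma, size=smashed.shape)

def patchwise_random_mix(noisy_a, noisy_b, mix_ratio=0.5, rng=None):
    """Mix randomly chosen patch rows of two clients' noisy smashed
    data, so neither client's full representation reaches the server
    intact."""
    rng = rng or np.random.default_rng()
    from_a = rng.random(noisy_a.shape[0]) < mix_ratio
    return np.where(from_a[:, None], noisy_a, noisy_b), from_a

rng = np.random.default_rng(1)
s_a, s_b = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
n_a, n_b = noisy_smashed(s_a, rng=rng), noisy_smashed(s_b, rng=rng)
mixed, from_a = patchwise_random_mix(n_a, n_b, rng=rng)
```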
arXiv Detail & Related papers (2024-08-02T06:24:39Z)
- Communication and Storage Efficient Federated Split Learning [19.369076939064904]
Federated Split Learning preserves the parallel model training principle of FL.
The server has to maintain separate models for every client, resulting in a significant computation and storage requirement.
This paper proposes a communication and storage efficient federated and split learning strategy.
arXiv Detail & Related papers (2023-02-11T04:44:29Z)
- Robust Split Federated Learning for U-shaped Medical Image Networks [16.046153872932653]
We propose Robust Split Federated Learning (RoS-FL) for U-shaped medical image networks.
RoS-FL is a novel hybrid learning paradigm of Federated Learning (FL) and Split Learning (SL).
arXiv Detail & Related papers (2022-12-13T05:26:31Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and backpropagation).
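The SL round-trip described here — the client uploads cut-layer activations, the server finishes the forward pass and returns gradients — can be sketched end-to-end for a toy model. This is a minimal NumPy sketch under simplifying assumptions (one client, linear layers, MSE loss, labels held at the server, hand-derived gradients); it is not any specific paper's protocol.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4))                # client's private inputs
y = rng.normal(size=(32, 1))                # labels, held at the server here
W_client = rng.normal(size=(4, 8)) * 0.5    # client-side layer (before the cut)
W_server = rng.normal(size=(8, 1)) * 0.5    # server-side layer (after the cut)
lr = 0.05

def sl_round(x, y):
    """One split-learning round: forward to the cut layer, 'upload' the
    smashed data, complete the forward pass and loss on the server, then
    'download' the gradient w.r.t. the smashed data to the client."""
    global W_client, W_server
    smashed = x @ W_client                   # client -> server upload
    y_hat = smashed @ W_server               # server-side forward
    loss = np.mean((y_hat - y) ** 2)         # MSE loss computed at the server
    g_yhat = 2.0 * (y_hat - y) / y.size      # dLoss/dy_hat
    g_smashed = g_yhat @ W_server.T          # server -> client download
    W_server -= lr * (smashed.T @ g_yhat)    # server-side update
    W_client -= lr * (x.T @ g_smashed)       # client-side update
    return loss

losses = [sl_round(x, y) for _ in range(50)]
```

The raw inputs `x` never leave the client; only the smashed data and its gradient cross the network, which is the communication the papers above try to shrink or protect.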
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- Differentially Private CutMix for Split Learning with Vision Transformer [42.47713044228984]
Vision transformer (ViT) has started to outpace the conventional CNN in computer vision tasks.
Considering privacy-preserving distributed learning with ViT, we propose DP-CutMixSL.
arXiv Detail & Related papers (2022-10-28T08:33:29Z)
- Secure Forward Aggregation for Vertical Federated Neural Networks [25.059312670812215]
We study SplitNN, a well-known neural network framework in Vertical Federated Learning (VFL).
SplitNN suffers from a loss of model performance since multiple parties jointly train the model using transformed data instead of raw data.
We propose a new neural network protocol in VFL called Secure Forward Aggregation (SFA).
Experiment results show that networks with SFA achieve both data security and high model performance.
arXiv Detail & Related papers (2022-06-28T03:13:26Z)
- Server-Side Local Gradient Averaging and Learning Rate Acceleration for Scalable Split Learning [82.06357027523262]
Federated learning (FL) and split learning (SL) are two leading approaches, each with its own pros and cons, suited respectively to many user clients and to large models.
In this work, we first identify the fundamental bottlenecks of SL, and thereby propose a scalable SL framework, coined SGLR.
arXiv Detail & Related papers (2021-12-11T08:33:25Z)
- Joint Superposition Coding and Training for Federated Learning over Multi-Width Neural Networks [52.93232352968347]
This paper aims to integrate two synergistic technologies: federated learning (FL) and width-adjustable slimmable neural networks (SNNs).
FL preserves data privacy by exchanging the locally trained models of mobile devices. Training SNNs is, however, non-trivial, particularly under wireless connections with time-varying channel conditions.
We propose a communication and energy-efficient SNN-based FL (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models.
arXiv Detail & Related papers (2021-12-05T11:17:17Z)
- Decoupling Pronunciation and Language for End-to-end Code-switching Automatic Speech Recognition [66.47000813920617]
We propose a decoupled transformer model to use monolingual paired data and unpaired text data.
The model is decoupled into two parts: audio-to-phoneme (A2P) network and phoneme-to-text (P2T) network.
By using monolingual data and unpaired text data, the decoupled transformer model reduces the E2E model's high dependency on code-switching paired training data.
arXiv Detail & Related papers (2020-10-28T07:46:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.