Training Large-Vocabulary Neural Language Models by Private Federated
Learning for Resource-Constrained Devices
- URL: http://arxiv.org/abs/2207.08988v1
- Date: Mon, 18 Jul 2022 23:53:17 GMT
- Title: Training Large-Vocabulary Neural Language Models by Private Federated
Learning for Resource-Constrained Devices
- Authors: Mingbin Xu, Congzheng Song, Ye Tian, Neha Agrawal, Filip Granqvist,
Rogier van Dalen, Xiao Zhang, Arturo Argueta, Shiyi Han, Yaqiao Deng, Leo
Liu, Anmol Walia, Alex Jin
- Abstract summary: Federated Learning (FL) is a technique to train models using data distributed across devices.
Differential Privacy (DP) provides a formal privacy guarantee for sensitive data.
We propose Partial Embedding Updates (PEU) to decrease noise by decreasing payload size.
- Score: 14.604785223644718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is a technique to train models using data distributed
across devices. Differential Privacy (DP) provides a formal privacy guarantee
for sensitive data. Our goal is to train a large neural network language model
(NNLM) on compute-constrained devices while preserving privacy using FL and DP.
However, the DP-noise introduced to the model increases as the model size
grows, which often prevents convergence. We propose Partial Embedding Updates
(PEU), a novel technique to decrease noise by decreasing payload size.
Furthermore, we adopt Low Rank Adaptation (LoRA) and Noise Contrastive
Estimation (NCE) to reduce the memory demands of large models on
compute-constrained devices. This combination of techniques makes it possible
to train large-vocabulary language models while preserving accuracy and
privacy.
Related papers
- Hollowed Net for On-Device Personalization of Text-to-Image Diffusion Models [51.3915762595891]
This paper presents an efficient LoRA-based personalization approach for on-device subject-driven generation.
Our method, termed Hollowed Net, enhances memory efficiency during fine-tuning by modifying the architecture of a diffusion U-Net.
arXiv Detail & Related papers (2024-11-02T08:42:48Z) - Efficient Language Model Architectures for Differentially Private
Federated Learning [21.280600854272716]
Cross-device federated learning (FL) is a technique that trains a model on data distributed across typically millions of edge devices without data leaving the devices.
In centralized training of language models, adaptives are preferred as they offer improved stability and performance.
We propose a scale-in Coupled Input Forget Gate (SI CIFG) recurrent network by modifying the sigmoid and tanh activations in neural recurrent cell.
arXiv Detail & Related papers (2024-03-12T22:21:48Z) - Adaptive Model Pruning and Personalization for Federated Learning over
Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider a FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
arXiv Detail & Related papers (2023-09-04T21:10:45Z) - Can Public Large Language Models Help Private Cross-device Federated Learning? [58.05449579773249]
We study (differentially) private federated learning (FL) of language models.
Public data has been used to improve privacy-utility trade-offs for both large and small language models.
We propose a novel distribution matching algorithm with theoretical grounding to sample public data close to private data distribution.
arXiv Detail & Related papers (2023-05-20T07:55:58Z) - Joint Privacy Enhancement and Quantization in Federated Learning [23.36363480217293]
Federated learning (FL) is an emerging paradigm for training machine learning models using possibly private data available at edge devices.
We propose a method coined joint privacy enhancement and quantization (JoPEQ)
We show that JoPEQ simultaneously quantizes data according to a required bit-rate while holding a desired privacy level.
arXiv Detail & Related papers (2022-08-23T11:42:58Z) - Federated Split GANs [12.007429155505767]
We propose an alternative approach to train ML models in user's devices themselves.
We focus on GANs (generative adversarial networks) and leverage their inherent privacy-preserving attribute.
Our system preserves data privacy, keeps a short training time, and yields same accuracy of model training in unconstrained devices.
arXiv Detail & Related papers (2022-07-04T23:53:47Z) - On-Device Training Under 256KB Memory [62.95579393237751]
We propose an algorithm-system co-design framework to make on-device training possible with only 256KB of memory.
Our framework is the first solution to enable tiny on-device training of convolutional neural networks under 256KB and 1MB Flash.
arXiv Detail & Related papers (2022-06-30T17:59:08Z) - Don't Generate Me: Training Differentially Private Generative Models
with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z) - Enabling On-Device Training of Speech Recognition Models with Federated
Dropout [4.165917555996752]
Federated learning can be used to train machine learning models on the edge on local data that never leave devices.
We propose using federated dropout to reduce the size of client models while training a full-size model server-side.
arXiv Detail & Related papers (2021-10-07T17:22:40Z) - UVeQFed: Universal Vector Quantization for Federated Learning [179.06583469293386]
Federated learning (FL) is an emerging approach to train such learning models without requiring the users to share their possibly private labeled data.
In FL, each user trains its copy of the learning model locally. The server then collects the individual updates and aggregates them into a global model.
We show that combining universal vector quantization methods with FL yields a decentralized training system in which the compression of the trained models induces only a minimum distortion.
arXiv Detail & Related papers (2020-06-05T07:10:22Z) - Lottery Hypothesis based Unsupervised Pre-training for Model Compression
in Federated Learning [8.90077503980675]
Federated learning (FL) enables a neural network (NN) to be trained using privacy-sensitive data on mobile devices.
This paper proposes a novel unsupervised pre-training method adapted for FL.
It aims to reduce both the communication and computation costs through model compression.
arXiv Detail & Related papers (2020-04-21T08:31:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.