Federated Learning with Differential Privacy for End-to-End Speech
Recognition
- URL: http://arxiv.org/abs/2310.00098v1
- Date: Fri, 29 Sep 2023 19:11:49 GMT
- Title: Federated Learning with Differential Privacy for End-to-End Speech
Recognition
- Authors: Martin Pelikan, Sheikh Shams Azam, Vitaly Feldman, Jan "Honza"
Silovsky, Kunal Talwar, Tatiana Likhomanenko
- Abstract summary: Federated learning (FL) has emerged as a promising approach to train machine learning models.
We apply differential privacy (DP) to FL for automatic speech recognition (ASR).
We achieve user-level ($7.2$, $10^{-9}$)-$\textbf{DP}$ (resp. ($4.5$, $10^{-9}$)-$\textbf{DP}$) with a 1.3% (resp. 4.6%) absolute drop in the word error rate for extrapolation to high (resp. low) population scale for $\textbf{FL with DP in ASR}$.
- Score: 41.53948098243563
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While federated learning (FL) has recently emerged as a promising approach to
train machine learning models, it is limited to only preliminary explorations
in the domain of automatic speech recognition (ASR). Moreover, FL does not
inherently guarantee user privacy and requires the use of differential privacy
(DP) for robust privacy guarantees. However, we are not aware of prior work on
applying DP to FL for ASR. In this paper, we aim to bridge this research gap by
formulating an ASR benchmark for FL with DP and establishing the first
baselines. First, we extend the existing research on FL for ASR by exploring
different aspects of recent $\textit{large end-to-end transformer models}$:
architecture design, seed models, data heterogeneity, domain shift, and impact
of cohort size. With a $\textit{practical}$ number of central aggregations we
are able to train $\textbf{FL models}$ that are $\textbf{nearly optimal}$ even
with heterogeneous data, a seed model from another domain, or no pre-trained
seed model. Second, we apply DP to FL for ASR, which is non-trivial since DP
noise severely affects model training, especially for large transformer models,
due to highly imbalanced gradients in the attention block. We counteract the
adverse effect of DP noise by reviving per-layer clipping and explaining why
its effect is more apparent in our case than in the prior work. Remarkably, we
achieve user-level ($7.2$, $10^{-9}$)-$\textbf{DP}$ (resp. ($4.5$,
$10^{-9}$)-$\textbf{DP}$) with a 1.3% (resp. 4.6%) absolute drop in the word
error rate for extrapolation to high (resp. low) population scale for
$\textbf{FL with DP in ASR}$.
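For reference, the reported guarantees use the standard definition of user-level $(\epsilon, \delta)$-DP: for any two datasets $D$ and $D'$ differing in all the data of a single user, and any set of outcomes $S$, the training mechanism $M$ satisfies

$$\Pr[M(D) \in S] \le e^{\epsilon} \, \Pr[M(D') \in S] + \delta.$$

The per-layer clipping the paper revives can be pictured as follows: rather than clipping each user's whole model delta to one global $L_2$ bound, every layer's delta is clipped to its own bound before the server adds Gaussian noise, so large attention-block gradients cannot consume the entire clipping budget. Below is a minimal sketch of one such aggregation round; the function names, the per-layer bound dictionary, and the noise calibration are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def clip_per_layer(update, bounds):
    """Clip each layer's delta to its own L2 bound (per-layer clipping)."""
    clipped = {}
    for name, delta in update.items():
        norm = np.linalg.norm(delta)
        clipped[name] = delta * min(1.0, bounds[name] / (norm + 1e-12))
    return clipped

def dp_fedavg_round(user_updates, bounds, noise_multiplier, rng=None):
    """One central aggregation: clip each user's delta per layer, sum,
    add Gaussian noise proportional to the clipping bound, and average."""
    rng = rng or np.random.default_rng()
    n = len(user_updates)
    clipped = [clip_per_layer(u, bounds) for u in user_updates]
    aggregated = {}
    for name, bound in bounds.items():
        total = sum(c[name] for c in clipped)
        # One user changes this layer's sum by at most `bound` in L2 norm,
        # so the Gaussian noise is scaled to noise_multiplier * bound.
        # (A sketch of the calibration; the paper's exact accounting may differ.)
        noise = rng.normal(0.0, noise_multiplier * bound, size=total.shape)
        aggregated[name] = (total + noise) / n
    return aggregated
```

With flat (global) clipping, a single bound would scale the concatenation of all layers at once; the intuition the abstract points to is that per-layer bounds let the noise match each layer's gradient scale, which matters when gradients across the attention block are highly imbalanced.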
Related papers
- Monge-Ampere Regularization for Learning Arbitrary Shapes from Point Clouds [69.69726932986923]
We propose the scaled-squared distance function (S$^2$DF), a novel implicit surface representation for modeling arbitrary surface types.
S$^2$DF does not distinguish between inside and outside regions while effectively addressing the non-differentiability issue of UDF at the zero level set.
We demonstrate that S$^2$DF satisfies a second-order partial differential equation of Monge-Ampere type.
arXiv Detail & Related papers (2024-10-24T06:56:34Z) - DMM: Distributed Matrix Mechanism for Differentially-Private Federated Learning using Packed Secret Sharing [51.336015600778396]
Federated Learning (FL) has gained lots of traction recently, both in industry and academia.
In FL, a machine learning model is trained using data from various end-users arranged in committees across several rounds.
Since such data can often be sensitive, a primary challenge in FL is providing privacy while still retaining utility of the model.
arXiv Detail & Related papers (2024-10-21T16:25:14Z) - DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation [15.023077875990614]
Federated learning (FL) allows clients to collaboratively train a global model without sharing their local data with a server.
Differential privacy (DP) addresses such leakage by providing formal privacy guarantees, with mechanisms that add randomness to the clients' contributions.
We propose an adaptation method that can be combined with differential privacy and call it DP-DyLoRA.
arXiv Detail & Related papers (2024-05-10T10:10:37Z) - Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit
Feedback and Unknown Transition [71.33787410075577]
We study reinforcement learning with linear function approximation, unknown transition, and adversarial losses.
We propose a new algorithm that attains an $\widetilde{O}(d\sqrt{HS^3K} + \sqrt{HSAK})$ regret with high probability.
arXiv Detail & Related papers (2024-03-07T15:03:50Z) - Differentially Private Reward Estimation with Preference Feedback [15.943664678210146]
Learning from preference-based feedback has recently gained considerable traction as a promising approach to align generative models with human interests.
An adversarial attack in any step of the above pipeline might reveal private and sensitive information of human labelers.
We focus on the problem of reward estimation from preference-based feedback while protecting the privacy of each individual labeler.
arXiv Detail & Related papers (2023-10-30T16:58:30Z) - Differentially Private Deep Learning with ModelMix [14.445182641912014]
We propose a generic optimization framework, called $\textit{ModelMix}$, which performs random aggregation of intermediate model states.
It strengthens the composite privacy analysis by utilizing the entropy of the training trajectory.
We present a formal study on the effect of gradient clipping in Differentially Private Gradient Descent.
arXiv Detail & Related papers (2022-10-07T22:59:00Z) - The Fundamental Price of Secure Aggregation in Differentially Private
Federated Learning [34.630300910399036]
We characterize the fundamental communication cost required to obtain the best accuracy under $\varepsilon$ central DP.
Our results show that $\tilde{O}\left( \min(n^2 \varepsilon^2, d) \right)$ bits per client are both sufficient and necessary.
This provides a significant improvement relative to state-of-the-art SecAgg distributed DP schemes.
arXiv Detail & Related papers (2022-03-07T22:56:09Z) - Differentially Private Exploration in Reinforcement Learning with Linear
Representation [102.17246636801649]
We first consider the setting of linear-mixture MDPs (Ayoub et al., 2020) (a.k.a. model-based setting) and provide a unified framework for analyzing joint and local differentially private (DP) exploration.
We further study privacy-preserving exploration in linear MDPs (Jin et al., 2020) (a.k.a. model-free setting) where we provide a $\widetilde{O}(\sqrt{K/\epsilon})$ regret bound for $(\epsilon,\delta)$-DP.
arXiv Detail & Related papers (2021-12-02T19:59:50Z) - FeO2: Federated Learning with Opt-Out Differential Privacy [34.08435990347253]
Federated learning (FL) is an emerging privacy-preserving paradigm, where a global model is trained at a central server while keeping client data local.
Differential privacy (DP) can be employed to provide privacy guarantees within FL, typically at the cost of a degraded final trained model.
We propose a new algorithm for federated learning with opt-out DP, referred to as $\textit{FeO2}$, along with a discussion on its advantages compared to the baselines of private and personalized FL algorithms.
arXiv Detail & Related papers (2021-10-28T16:08:18Z) - On the Practicality of Differential Privacy in Federated Learning by
Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for its privacy protection when training machine learning models collaboratively among distributed clients.
Recent studies have pointed out that naive FL is susceptible to gradient leakage attacks.
Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
arXiv Detail & Related papers (2021-01-11T19:43:12Z)