An Experimental Study on Private Aggregation of Teacher Ensemble
Learning for End-to-End Speech Recognition
- URL: http://arxiv.org/abs/2210.05614v1
- Date: Tue, 11 Oct 2022 16:55:54 GMT
- Title: An Experimental Study on Private Aggregation of Teacher Ensemble
Learning for End-to-End Speech Recognition
- Authors: Chao-Han Huck Yang, I-Fan Chen, Andreas Stolcke, Sabato Marco
Siniscalchi, Chin-Hui Lee
- Abstract summary: Differential privacy (DP) is one data protection avenue to safeguard user information used for training deep models by imposing noisy distortion on private data.
In this work, we extend PATE learning to work with dynamic patterns, namely speech, and perform a first experimental study on ASR to avoid acoustic data leakage.
- Score: 51.232523987916636
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Differential privacy (DP) is one data protection avenue to safeguard user
information used for training deep models by imposing noisy distortion on
private data. Such noise perturbation often results in severe performance
degradation in automatic speech recognition (ASR) in order to meet a privacy
budget $\varepsilon$. Private aggregation of teacher ensemble (PATE) utilizes
ensemble probabilities to improve ASR accuracy when dealing with the noise
effects controlled by small values of $\varepsilon$. In this work, we extend
PATE learning to work with dynamic patterns, namely speech, and perform a
first experimental study on ASR to avoid acoustic data leakage. We
evaluate three end-to-end deep models, including LAS, hybrid attention/CTC, and
RNN transducer, on the open-source LibriSpeech and TIMIT corpora. PATE
learning-enhanced ASR models outperform the benchmark DP-SGD mechanisms,
especially under strict DP budgets, giving relative word error rate reductions
between 26.2% and 27.5% for the RNN transducer model evaluated on LibriSpeech. We
also introduce an additional DP-preserving ASR solution based on pre-training on a
public speech corpus.
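As a rough illustration of the teacher-ensemble aggregation that PATE builds on (a minimal sketch, not the authors' ASR pipeline; the hard-vote aggregation, class count, and noise scale gamma are assumptions):

```python
import numpy as np

def pate_noisy_vote(teacher_labels, num_classes, gamma=0.1, rng=None):
    """Laplace noisy-max aggregation of hard teacher votes (PATE-style sketch).

    teacher_labels: int array of shape (num_teachers,), one label per teacher.
    gamma: inverse noise scale; smaller gamma means more noise / stronger privacy.
    """
    rng = np.random.default_rng() if rng is None else rng
    votes = np.bincount(teacher_labels, minlength=num_classes).astype(float)
    noisy_votes = votes + rng.laplace(loc=0.0, scale=1.0 / gamma, size=num_classes)
    return int(np.argmax(noisy_votes))

# Toy usage: 10 teachers labelling one example drawn from 5 classes.
rng = np.random.default_rng(0)
teacher_labels = rng.integers(low=0, high=5, size=10)
print(pate_noisy_vote(teacher_labels, num_classes=5, gamma=0.1, rng=rng))
```

For sequence models such as the RNN transducer, the aggregation would need to act on the ensemble's output probabilities over time rather than a single hard vote per utterance, which is presumably why the abstract emphasizes ensemble probabilities.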
Related papers
- Training Large ASR Encoders with Differential Privacy [18.624449993983106]
Self-supervised learning (SSL) methods for large speech models have proven to be highly effective at ASR.
With the interest in public deployment of large pre-trained models, there is a rising concern for unintended memorization and leakage of sensitive data points from the training data.
This paper is the first to apply differentially private (DP) pre-training to a SOTA Conformer-based encoder, and study its performance on a downstream ASR task assuming the fine-tuning data is public.
arXiv Detail & Related papers (2024-09-21T00:01:49Z) - Differentially Private Adapters for Parameter Efficient Acoustic
Modeling [24.72748979633543]
We introduce a noisy teacher-student ensemble into a conventional adaptation scheme.
We insert residual adapters between layers of the frozen pre-trained acoustic model.
Our solution reduces the number of trainable parameters by 97.5% using the residual adapters (RAs).
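A minimal PyTorch sketch of a residual adapter in the spirit described above; the bottleneck width, activation, and placement after a single frozen layer are illustrative assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Small bottleneck module with a residual connection; only its weights train."""
    def __init__(self, dim, bottleneck=32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

# Stand-in for one frozen layer of a pre-trained acoustic model.
frozen_layer = nn.Linear(256, 256)
for p in frozen_layer.parameters():
    p.requires_grad = False

adapter = ResidualAdapter(dim=256)            # the only trainable parameters
x = torch.randn(8, 100, 256)                  # (batch, frames, features)
h = adapter(frozen_layer(x))
```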
arXiv Detail & Related papers (2023-05-19T00:36:43Z) - An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling
to Differential Privacy Preserving Speech Recognition [51.20130423303659]
We propose an ensemble learning framework with Poisson sub-sampling to train a collection of teacher models that provide a differential privacy (DP) guarantee for the training data.
Through boosting under DP, a student model derived from the training data suffers little degradation relative to models trained with no privacy protection.
Our proposed solution leverages two mechanisms: (i) privacy budget amplification via Poisson sub-sampling, which trains a target prediction model with less noise for the same privacy budget, and (ii) a combination of the sub-sampling technique with an ensemble teacher-student learning framework.
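A minimal NumPy sketch of the Poisson sub-sampling step behind the privacy amplification in (i); the sampling rate q is a placeholder, and the DP accounting of the amplified budget is omitted:

```python
import numpy as np

def poisson_subsample(num_records, q=0.01, rng=None):
    """Each training record enters the mini-batch independently with probability q."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(num_records) < q
    return np.flatnonzero(mask)

# Indices of one sub-sampled batch drawn from 50,000 utterances.
batch_indices = poisson_subsample(num_records=50_000, q=0.01)
```

Because any given record participates in a training step only with probability q, the per-step privacy cost is amplified (i.e., reduced) relative to using every record, which is what lets the target model add less noise for the same overall budget.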
arXiv Detail & Related papers (2022-10-12T16:34:08Z) - Improving Noise Robustness of Contrastive Speech Representation Learning
with Speech Reconstruction [109.44933866397123]
Noise robustness is essential for deploying automatic speech recognition systems in real-world environments.
We employ a noise-robust representation learned by a refined self-supervised framework for noisy speech recognition.
We achieve comparable performance to the best supervised approach reported with only 16% of labeled data.
arXiv Detail & Related papers (2021-10-28T20:39:02Z) - Personalized Speech Enhancement through Self-Supervised Data
Augmentation and Purification [24.596224536399326]
We train an SNR predictor model to estimate the frame-by-frame SNR of the pseudo-sources.
We empirically show that the proposed data purification step improves the usability of the speaker-specific noisy data.
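As a rough illustration of the frame-by-frame SNR quantity such a predictor would regress (a sketch assuming access to separate clean and noise tracks; the frame length and hop are placeholders, not the paper's recipe):

```python
import numpy as np

def framewise_snr_db(clean, noise, frame_len=400, hop=160, eps=1e-10):
    """Per-frame SNR in dB between a clean source and an additive noise track."""
    snrs = []
    for start in range(0, len(clean) - frame_len + 1, hop):
        s = clean[start:start + frame_len]
        n = noise[start:start + frame_len]
        snrs.append(10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(n ** 2) + eps)))
    return np.array(snrs)
```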
arXiv Detail & Related papers (2021-04-05T17:17:55Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z) - Characterizing Speech Adversarial Examples Using Self-Attention U-Net
Enhancement [102.48582597586233]
We present a U-Net-based attention model, U-Net$_At$, to enhance adversarial speech signals.
We conduct experiments on the automatic speech recognition (ASR) task with adversarial audio attacks.
arXiv Detail & Related papers (2020-03-31T02:16:34Z) - Joint Contextual Modeling for ASR Correction and Language Understanding [60.230013453699975]
We propose multi-task neural approaches to perform contextual language correction on ASR outputs jointly with language understanding (LU).
We show that the error rates of off-the-shelf ASR and downstream LU systems can be reduced significantly, by 14% relative, with joint models trained using small amounts of in-domain data.
arXiv Detail & Related papers (2020-01-28T22:09:25Z)