Differentially Private Adapters for Parameter Efficient Acoustic
Modeling
- URL: http://arxiv.org/abs/2305.11360v1
- Date: Fri, 19 May 2023 00:36:43 GMT
- Title: Differentially Private Adapters for Parameter Efficient Acoustic
Modeling
- Authors: Chun-Wei Ho, Chao-Han Huck Yang, Sabato Marco Siniscalchi
- Abstract summary: We introduce a noisy teacher-student ensemble into a conventional adaptation scheme.
We insert residual adapters between layers of the frozen pre-trained acoustic model.
Our solution reduces the number of trainable parameters by 97.5% using the RAs.
- Score: 24.72748979633543
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this work, we devise a parameter-efficient solution to bring differential
privacy (DP) guarantees into adaptation of a cross-lingual speech classifier.
We investigate a new frozen pre-trained adaptation framework for DP-preserving
speech modeling without full model fine-tuning. First, we introduce a noisy
teacher-student ensemble into a conventional adaptation scheme leveraging a
frozen pre-trained acoustic model, and attain better performance than DP-based
stochastic gradient descent (DPSGD). Next, we insert residual adapters (RAs)
between layers of the frozen pre-trained acoustic model. The RAs reduce
training cost and time significantly with a negligible performance drop.
Evaluated on the open-access Multilingual Spoken Words (MLSW) dataset, our
solution reduces the number of trainable parameters by 97.5% using the RAs, with
only a 4% performance drop relative to fine-tuning the cross-lingual speech
classifier while preserving DP guarantees.
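To make the adaptation scheme concrete, the following minimal PyTorch sketch inserts bottleneck residual adapters between frozen encoder layers so that only the adapter weights receive gradients; the bottleneck width, hidden size, and stand-in layers are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual skip."""
    def __init__(self, dim: int, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.ReLU()
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

class AdaptedEncoder(nn.Module):
    """Frozen pre-trained encoder layers with trainable adapters inserted in between."""
    def __init__(self, frozen_layers: nn.ModuleList, dim: int):
        super().__init__()
        for p in frozen_layers.parameters():
            p.requires_grad_(False)              # the backbone stays frozen
        self.layers = frozen_layers
        self.adapters = nn.ModuleList(ResidualAdapter(dim) for _ in frozen_layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer, adapter in zip(self.layers, self.adapters):
            x = adapter(layer(x))                # gradients flow only through the adapters
        return x

# Hypothetical usage: only the adapter parameters are handed to the optimizer.
encoder = AdaptedEncoder(nn.ModuleList(nn.Linear(768, 768) for _ in range(12)), dim=768)
trainable = [p for p in encoder.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)
```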
Related papers
- Self-supervised Pretraining for Robust Personalized Voice Activity
Detection in Adverse Conditions [0.0]
We pretrain a long short-term memory (LSTM) encoder using the autoregressive predictive coding (APC) framework.
We also propose a denoising variant of APC, with the goal of improving the robustness of personalized VAD.
Our experiments show that self-supervised pretraining not only improves performance in clean conditions, but also yields models which are more robust to adverse conditions.
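As a rough illustration of this pretraining objective, the sketch below trains an LSTM encoder to predict a feature frame a few steps ahead of the current one; the feature dimension, number of layers, prediction shift, and L1 loss are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class APCModel(nn.Module):
    """Autoregressive predictive coding: an LSTM encoder predicts a frame `shift` steps ahead."""
    def __init__(self, feat_dim: int = 80, hidden: int = 512, layers: int = 3):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, feat_dim)

    def forward(self, frames: torch.Tensor, shift: int = 3) -> torch.Tensor:
        # frames: (batch, time, feat_dim); frame t is used to predict frame t + shift
        hidden_states, _ = self.encoder(frames[:, :-shift, :])
        prediction = self.head(hidden_states)
        return F.l1_loss(prediction, frames[:, shift:, :])
```

A denoising variant along the lines described above would feed corrupted frames to the encoder while keeping the clean frames as prediction targets.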
arXiv Detail & Related papers (2023-12-27T15:36:17Z) - Sparse Low-rank Adaptation of Pre-trained Language Models [79.74094517030035]
We introduce sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process.
Our approach strengthens the representation power of LoRA by initializing it with a higher rank, while efficiently taming a temporarily increased number of parameters.
Our experimental results demonstrate that SoRA can outperform other baselines even with 70% retained parameters and 70% training time.
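The following sketch, assuming a PyTorch linear layer, illustrates the sparse low-rank idea: the update starts at a generous rank and a gate vector over the rank dimension is driven toward zero so unused ranks can be pruned; the simple thresholding step stands in for SoRA's actual sparsity-inducing update.

```python
import torch
import torch.nn as nn

class SparseLoRALinear(nn.Module):
    """Frozen linear layer plus a low-rank update whose effective rank can shrink via a gate."""
    def __init__(self, base: nn.Linear, max_rank: int = 16):
        super().__init__()
        for p in base.parameters():
            p.requires_grad_(False)                       # pre-trained weights stay frozen
        self.base = base
        self.A = nn.Parameter(0.01 * torch.randn(max_rank, base.in_features))
        self.B = nn.Parameter(torch.zeros(base.out_features, max_rank))
        self.gate = nn.Parameter(torch.ones(max_rank))    # sparsified during training

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        update = (self.B * self.gate) @ self.A            # zeroed gates remove their ranks
        return self.base(x) + x @ update.T

def prune_inactive_ranks(layer: SparseLoRALinear, threshold: float = 1e-3) -> None:
    """Crude stand-in for the sparsifying update: zero out gates that have become negligible."""
    with torch.no_grad():
        layer.gate[layer.gate.abs() < threshold] = 0.0
```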
arXiv Detail & Related papers (2023-11-20T11:56:25Z) - Low-rank Adaptation of Large Language Model Rescoring for
Parameter-Efficient Speech Recognition [32.24656612803592]
We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring.
We present a method based on low-rank decomposition to train a rescoring BERT model and adapt it to new domains using only a fraction of the pretrained parameters.
The proposed low-rank adaptation Rescore-BERT (LoRB) architecture is evaluated on LibriSpeech and internal datasets, with training times reduced by factors of 3.6 to 5.4.
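A small, hedged sketch of second-pass rescoring with an adapted language model: the fused score is a weighted sum of the first-pass score and the rescoring LM's log-likelihood, and the merge function shows how a trained low-rank update can be folded back into a frozen weight for inference. The `lm_score` callable, fusion weight, and tensor shapes are hypothetical.

```python
from typing import Callable, List, Tuple
import torch

def rescore_nbest(nbest: List[Tuple[str, float]],      # (hypothesis, first-pass score)
                  lm_score: Callable[[str], float],    # log-likelihood from the adapted LM
                  lm_weight: float = 0.5) -> str:
    """Linearly fuse first- and second-pass scores and return the best hypothesis."""
    fused = [(hyp, score + lm_weight * lm_score(hyp)) for hyp, score in nbest]
    return max(fused, key=lambda pair: pair[1])[0]

def merge_low_rank(W: torch.Tensor, A: torch.Tensor, B: torch.Tensor,
                   alpha: float, rank: int) -> torch.Tensor:
    """Fold a trained low-rank update (B @ A) into the frozen weight for inference."""
    return W + (alpha / rank) * (B @ A)
```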
arXiv Detail & Related papers (2023-09-26T19:41:34Z) - Parameter-Efficient Learning for Text-to-Speech Accent Adaptation [58.356667204518985]
This paper presents a parameter-efficient learning (PEL) approach to low-resource accent adaptation for text-to-speech (TTS).
A resource-efficient adaptation from a frozen pre-trained TTS model is developed using only 0.8% to 1.2% of the original trainable parameters.
Experiment results show that the proposed methods can achieve competitive naturalness with parameter-efficient decoder fine-tuning.
arXiv Detail & Related papers (2023-05-18T22:02:59Z) - Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition [66.94463981654216]
We propose prompt tuning methods for Deep Neural Networks (DNNs) in speaker-adaptive Visual Speech Recognition (VSR).
We finetune prompts on adaptation data of target speakers instead of modifying the pre-trained model parameters.
The effectiveness of the proposed method is evaluated on both word- and sentence-level VSR databases.
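As a hedged illustration of the prompt-tuning idea, the sketch below prepends a handful of learnable prompt vectors to the input features of a frozen backbone, so adapting to a new speaker updates only the prompts; the prompt length, embedding size, and backbone interface are assumptions.

```python
import torch
import torch.nn as nn

class PromptTunedModel(nn.Module):
    """Frozen backbone with learnable prompt embeddings prepended to its input sequence."""
    def __init__(self, backbone: nn.Module, embed_dim: int, prompt_len: int = 16):
        super().__init__()
        for p in backbone.parameters():
            p.requires_grad_(False)                       # backbone weights stay fixed
        self.backbone = backbone
        self.prompt = nn.Parameter(0.02 * torch.randn(prompt_len, embed_dim))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, time, embed_dim); the same prompts are prepended to every utterance
        prompts = self.prompt.unsqueeze(0).expand(features.size(0), -1, -1)
        return self.backbone(torch.cat([prompts, features], dim=1))
```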
arXiv Detail & Related papers (2023-02-16T06:01:31Z) - An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling
to Differential Privacy Preserving Speech Recognition [51.20130423303659]
We propose an ensemble learning framework with Poisson sub-sampling to train a collection of teacher models that provide a differential privacy (DP) guarantee for the training data.
Through boosting under DP, the resulting student model suffers little degradation compared with models trained without privacy protection.
Our solution leverages two mechanisms: (i) privacy budget amplification via Poisson sub-sampling, so the target prediction model requires less noise to reach the same privacy budget, and (ii) the combination of the sub-sampling technique with an ensemble teacher-student learning framework.
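To make these two ingredients concrete, here is a simplified NumPy sketch: Poisson sub-sampling includes each training example independently with some rate (the source of privacy amplification), and teacher votes on an unlabeled example are aggregated with added noise before the arg-max label is handed to the student. The noise distribution, scale, and class count are illustrative choices, not the paper's exact mechanism.

```python
import numpy as np

def poisson_subsample(num_examples: int, rate: float, rng: np.random.Generator) -> np.ndarray:
    """Include each example independently with probability `rate` (privacy amplification)."""
    return np.flatnonzero(rng.random(num_examples) < rate)

def noisy_teacher_label(teacher_votes: np.ndarray, num_classes: int, noise_scale: float,
                        rng: np.random.Generator) -> int:
    """Add noise to the per-class vote histogram and return the arg-max as the student label."""
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    counts += rng.laplace(scale=noise_scale, size=num_classes)
    return int(np.argmax(counts))

# Hypothetical usage: ten teachers vote on one unlabeled student example.
rng = np.random.default_rng(0)
subset = poisson_subsample(num_examples=50_000, rate=0.05, rng=rng)
votes = np.array([2, 2, 2, 1, 2, 0, 2, 2, 1, 2])
label = noisy_teacher_label(votes, num_classes=3, noise_scale=1.0, rng=rng)
```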
arXiv Detail & Related papers (2022-10-12T16:34:08Z) - An Experimental Study on Private Aggregation of Teacher Ensemble
Learning for End-to-End Speech Recognition [51.232523987916636]
Differential privacy (DP) is one data protection avenue for safeguarding user information used to train deep models, by imposing noisy distortion on private data.
In this work, we extend PATE (Private Aggregation of Teacher Ensembles) learning to dynamic patterns, namely speech, and carry out a first experimental study on ASR aimed at avoiding acoustic data leakage.
arXiv Detail & Related papers (2022-10-11T16:55:54Z) - A Unified Speaker Adaptation Approach for ASR [37.76683818356052]
We propose a unified speaker adaptation approach consisting of feature adaptation and model adaptation.
For feature adaptation, we employ a speaker-aware persistent memory model which generalizes better to unseen test speakers.
For model adaptation, we use a novel gradual pruning method to adapt to target speakers without changing the model architecture.
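A generic sketch of gradual magnitude pruning, under the assumption that a magnitude criterion and a ramped sparsity schedule are in play (the paper's exact criterion may differ): sparsity rises from an initial to a final value over adaptation steps, and a binary mask zeroes the smallest-magnitude weights at each step.

```python
import torch

def sparsity_at_step(step: int, total_steps: int,
                     initial_sparsity: float = 0.0, final_sparsity: float = 0.9) -> float:
    """Cubic ramp from initial to final sparsity, as in gradual magnitude pruning."""
    t = min(step / max(total_steps, 1), 1.0)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - t) ** 3

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask that zeroes the `sparsity` fraction of smallest-magnitude weights."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).to(weight.dtype)
```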
arXiv Detail & Related papers (2021-10-16T10:48:52Z) - Bayesian Learning for Deep Neural Network Adaptation [57.70991105736059]
A key task for speech recognition systems is to reduce the mismatch between training and evaluation data that is often attributable to speaker differences.
Model-based speaker adaptation approaches often require sufficient amounts of target speaker data to ensure robustness.
This paper proposes a full Bayesian learning based DNN speaker adaptation framework to model speaker-dependent (SD) parameter uncertainty.
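A minimal variational sketch of modeling speaker-dependent (SD) parameter uncertainty: each SD weight gets a Gaussian posterior whose mean is initialized from the speaker-independent (SI) weights, sampled with the reparameterization trick, and regularized by a KL term toward a prior centered on the SI weights. The unit-variance prior and single-sample forward pass are illustrative simplifications rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class BayesianSDLinear(nn.Module):
    """Speaker-dependent linear layer with a Gaussian posterior over its weights."""
    def __init__(self, si_weight: torch.Tensor):
        super().__init__()
        self.register_buffer("prior_mean", si_weight.detach().clone())  # SI weights as prior mean
        self.mean = nn.Parameter(si_weight.detach().clone())
        self.log_std = nn.Parameter(torch.full_like(si_weight, -3.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        std = self.log_std.exp()
        weight = self.mean + std * torch.randn_like(std)   # reparameterised weight sample
        return x @ weight.t()

    def kl(self) -> torch.Tensor:
        """KL divergence to a unit-variance Gaussian prior centred on the SI weights."""
        var = (2.0 * self.log_std).exp()
        return 0.5 * (var + (self.mean - self.prior_mean) ** 2 - 1.0 - 2.0 * self.log_std).sum()
```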
arXiv Detail & Related papers (2020-12-14T12:30:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.