ILASR: Privacy-Preserving Incremental Learning for Automatic Speech
Recognition at Production Scale
- URL: http://arxiv.org/abs/2207.09078v1
- Date: Tue, 19 Jul 2022 05:24:13 GMT
- Title: ILASR: Privacy-Preserving Incremental Learning for Automatic Speech
Recognition at Production Scale
- Authors: Gopinath Chennupati, Milind Rao, Gurpreet Chadha, Aaron Eakin, Anirudh
Raju, Gautam Tiwari, Anit Kumar Sahu, Ariya Rastrow, Jasha Droppo, Andy
Oberlin, Buddha Nandanoor, Prahalad Venkataramanan, Zheng Wu, Pankaj Sitpure
- Abstract summary: This paper uses a cloud-based framework for production systems to demonstrate insights from privacy-preserving incremental learning for automatic speech recognition (ILASR).
We show that the proposed system can improve the production models significantly (3%) over a new time period of six months, even in the absence of human-annotated labels.
- Score: 19.524894956258343
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Incremental learning is one paradigm to enable model building and updating at
scale with streaming data. For end-to-end automatic speech recognition (ASR)
tasks, the absence of human-annotated labels along with the need for
privacy-preserving policies for model building makes it a daunting challenge.
Motivated by these challenges, in this paper we use a cloud-based framework for
production systems to demonstrate insights from privacy-preserving incremental
learning for automatic speech recognition (ILASR). By privacy-preserving, we
mean the use of ephemeral data that are not human annotated. This system is a
step forward for production-level ASR models for incremental/continual learning,
offering a near real-time test bed for experimentation in the cloud for
end-to-end ASR while adhering to privacy-preserving policies. We show that the
proposed system can improve the production models significantly (3%) over a new
time period of six months even in the absence of human-annotated labels with
varying levels of weak supervision and large batch sizes in incremental
learning. This improvement is 20% over test sets with new words and phrases in
the new time period. We demonstrate the effectiveness of model building in a
privacy-preserving incremental fashion for ASR while further exploring the
utility of having an effective teacher model and use of large batch sizes.
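To make the training loop concrete, here is a minimal sketch of weakly supervised incremental learning as described above: a frozen teacher model pseudo-labels ephemeral, unlabeled audio, and the production (student) model is updated on large batches of those pseudo-labels. The model classes, greedy CTC decoding, and hyperparameters below are illustrative assumptions, not the ILASR implementation.

```python
# Minimal sketch of the loop described above: a frozen teacher pseudo-labels ephemeral,
# unlabeled audio and the production (student) model trains on large batches of those
# labels. Model classes, greedy CTC decoding, and hyperparameters are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB = 32      # assumed token inventory (index 0 = CTC blank)
FEAT_DIM = 80   # assumed log-mel feature dimension


class TinyASR(nn.Module):
    """Stand-in acoustic model: feature frames -> per-frame token logits."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(FEAT_DIM, 128, batch_first=True)
        self.out = nn.Linear(128, VOCAB)

    def forward(self, feats):                     # feats: (B, T, FEAT_DIM)
        hidden, _ = self.rnn(feats)
        return self.out(hidden)                   # (B, T, VOCAB)


@torch.no_grad()
def pseudo_label(teacher, feats):
    """Greedy CTC decode of the teacher's output, used as weak supervision."""
    ids = teacher(feats).argmax(dim=-1)           # (B, T)
    labels, lengths = [], []
    for seq in ids:
        toks = [int(t) for i, t in enumerate(seq)
                if t != 0 and (i == 0 or t != seq[i - 1])]   # drop blanks, collapse repeats
        labels.append(torch.tensor(toks or [1]))
        lengths.append(max(len(toks), 1))
    return nn.utils.rnn.pad_sequence(labels, batch_first=True), torch.tensor(lengths)


def incremental_step(student, teacher, feats, optimizer):
    """One large-batch update on ephemeral audio; no human-annotated labels are used."""
    targets, target_lens = pseudo_label(teacher, feats)
    log_probs = F.log_softmax(student(feats), dim=-1).transpose(0, 1)   # (T, B, V)
    input_lens = torch.full((feats.size(0),), feats.size(1), dtype=torch.long)
    loss = F.ctc_loss(log_probs, targets, input_lens, target_lens, blank=0, zero_infinity=True)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    teacher, student = TinyASR().eval(), TinyASR()
    opt = torch.optim.Adam(student.parameters(), lr=1e-4)
    ephemeral_batch = torch.randn(16, 200, FEAT_DIM)   # stands in for a large unlabeled batch
    print("pseudo-label loss:", incremental_step(student, teacher, ephemeral_batch, opt))
```

In practice the teacher would be a stronger or ensembled model and pseudo-labels would typically be confidence-filtered; the sketch only illustrates that human transcripts never enter the update.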
Related papers
- Self-Improvement in Language Models: The Sharpening Mechanism [70.9248553790022]
We offer a new perspective on the capabilities of self-improvement through a lens we refer to as sharpening.
Motivated by the observation that language models are often better at verifying response quality than they are at generating correct responses, we formalize self-improvement as using the model itself as a verifier during post-training.
We analyze two natural families of self-improvement algorithms based on SFT and RLHF.
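As a rough illustration of the sharpening idea, the sketch below has a model act as its own verifier over n sampled responses and keep the highest-scoring one; the winners could then serve as post-training (SFT) targets. The generate/score callables are placeholders, not the paper's formalism.

```python
# Hedged sketch of "sharpening": the model generates n candidate responses and acts as its
# own verifier, keeping the response it scores highest; the winners could then serve as
# post-training (SFT) targets. The generate/score callables are placeholders.
from typing import Callable, List, Tuple
import random

Generate = Callable[[str], str]       # prompt -> one sampled response
Score = Callable[[str, str], float]   # (prompt, response) -> self-assessed quality


def sharpen(prompt: str, generate: Generate, score: Score, n: int = 8) -> Tuple[str, float]:
    """Best-of-n self-verification: sample n responses, return the one the model rates best."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    scored = [(score(prompt, c), c) for c in candidates]
    best_score, best_response = max(scored, key=lambda pair: pair[0])
    return best_response, best_score


if __name__ == "__main__":
    # Toy stand-ins: a "generator" that sometimes miscomputes a sum and a "verifier"
    # that checks the arithmetic, mimicking verification being easier than generation.
    generate = lambda p: str(eval(p) + random.choice([0, 0, 0, 1, -1]))
    verify = lambda p, r: 1.0 if r == str(eval(p)) else 0.0
    print(sharpen("17 + 25", generate, verify))
```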
arXiv Detail & Related papers (2024-12-02T20:24:17Z)
- Distribution-Level Feature Distancing for Machine Unlearning: Towards a Better Trade-off Between Model Utility and Forgetting [4.220336689294245]
We propose Distribution-Level Feature Distancing (DLFD), a novel method that efficiently forgets instances while preserving task-relevant feature correlations.
Our method synthesizes data samples by optimizing the feature distribution to be distinctly different from that of forget samples, achieving effective results within a single training epoch.
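A loose sketch of the distribution-level idea, under assumptions: starting from retained samples, synthetic inputs are optimized so that their feature statistics move away from the forget set's, and the model could then be fine-tuned on them. The encoder, the simple mean-feature distance, and the hyperparameters below are illustrative, not the DLFD objective.

```python
# Loose sketch of distribution-level feature distancing under assumptions: starting from
# retained samples, synthesize inputs whose feature statistics are pushed away from the
# forget set's, so the model can then be fine-tuned on them. The encoder, the mean-feature
# distance, and the hyperparameters are illustrative, not the DLFD objective.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))   # stand-in feature extractor


def synthesize_distanced(retain_x: torch.Tensor, forget_x: torch.Tensor,
                         steps: int = 50, lr: float = 0.05) -> torch.Tensor:
    """Nudge copies of retained images so their mean feature moves away from the forget set's."""
    synth = retain_x.clone().requires_grad_(True)
    opt = torch.optim.Adam([synth], lr=lr)
    with torch.no_grad():
        forget_mu = encoder(forget_x).mean(dim=0)
    for _ in range(steps):
        distance = (encoder(synth).mean(dim=0) - forget_mu).pow(2).sum()
        loss = -distance                       # gradient ascent on the feature distance
        opt.zero_grad()
        loss.backward()
        opt.step()
        synth.data.clamp_(0.0, 1.0)            # keep synthesized images in a valid pixel range
    return synth.detach()


if __name__ == "__main__":
    retain = torch.rand(16, 3, 32, 32)
    forget = torch.rand(16, 3, 32, 32)
    distanced = synthesize_distanced(retain, forget)
    print("mean pixel shift:", (distanced - retain).abs().mean().item())
```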
arXiv Detail & Related papers (2024-09-23T06:51:10Z)
- Continuously Learning New Words in Automatic Speech Recognition [56.972851337263755]
We propose a self-supervised continual learning approach for Automatic Speech Recognition.
We use a memory-enhanced ASR model from the literature to decode new words that appear on accompanying lecture slides.
We show that with this approach, we obtain increasing performance on the new words when they occur more frequently.
arXiv Detail & Related papers (2024-01-09T10:39:17Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
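For context, the basic mechanism shared by prior-based methods is a penalty that anchors parameters to their previous-task values. The snippet below shows only that generic quadratic-prior form; it is not BAdam's Bayesian adaptive moment formulation.

```python
# Generic quadratic-prior penalty for continual learning (not BAdam itself): snapshot the
# parameters after the previous task and penalize drift from that snapshot while training
# on the next task, which limits catastrophic forgetting.
import torch
import torch.nn as nn
import torch.nn.functional as F


def snapshot(model: nn.Module) -> dict:
    """Copy of the parameters learned on the previous task (the prior's mean)."""
    return {name: p.detach().clone() for name, p in model.named_parameters()}


def prior_penalty(model: nn.Module, anchor: dict, strength: float = 100.0) -> torch.Tensor:
    """Quadratic prior centred on the previous-task parameters."""
    drift = sum(((p - anchor[name]) ** 2).sum() for name, p in model.named_parameters())
    return strength * drift


if __name__ == "__main__":
    model = nn.Linear(10, 2)
    anchor = snapshot(model)                                  # parameters after "task 1"
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))   # stand-in "task 2" data
    for _ in range(20):
        loss = F.cross_entropy(model(x), y) + prior_penalty(model, anchor)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print("final regularized loss:", loss.item())
```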
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Privacy Adhering Machine Un-learning in NLP [66.17039929803933]
In real-world industry settings, Machine Learning is used to build models on user data.
Privacy mandates such as the right to be forgotten require effort both in terms of data removal and model retraining.
Continuous removal of data and repeated model retraining steps do not scale.
We propose Machine Unlearning to tackle this challenge.
arXiv Detail & Related papers (2022-12-19T16:06:45Z)
- Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition [5.802425107635222]
Miniaturization of SSL models has become an important research direction of practical value.
We explore the effective distillation of HuBERT-based SSL models for automatic speech recognition (ASR).
A discriminative loss is introduced for HuBERT to enhance the distillation performance, especially in low-resource scenarios.
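A hedged sketch of what such a distillation setup can look like: the student regresses the teacher's hidden representations (L1 plus cosine, a common recipe) while an added discriminative cross-entropy head predicts the teacher's discrete unit assignments. The model sizes, unit inventory, and loss weights below are assumptions, not the paper's exact configuration.

```python
# Hedged sketch of distilling a HuBERT-like SSL teacher into a small student: the student
# regresses the teacher's hidden representations (L1 plus cosine, a common recipe) and an
# added discriminative cross-entropy head predicts the teacher's discrete unit assignments.
# Model sizes, the unit inventory, and loss weights are assumptions, not the paper's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_UNITS = 100   # assumed number of discrete teacher units (e.g., k-means clusters)
D = 256         # assumed hidden dimension shared by teacher targets and student


class SmallStudent(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.GRU(80, D, batch_first=True)
        self.unit_head = nn.Linear(D, N_UNITS)     # discriminative head over teacher units

    def forward(self, feats):                      # feats: (B, T, 80)
        hidden, _ = self.encoder(feats)
        return hidden, self.unit_head(hidden)


def distill_loss(student_h, unit_logits, teacher_h, teacher_units, alpha=1.0):
    """Feature regression to the teacher plus discriminative prediction of its units."""
    regression = F.l1_loss(student_h, teacher_h) \
        - F.cosine_similarity(student_h, teacher_h, dim=-1).mean()
    discriminative = F.cross_entropy(unit_logits.flatten(0, 1), teacher_units.flatten())
    return regression + alpha * discriminative


if __name__ == "__main__":
    student = SmallStudent()
    feats = torch.randn(4, 100, 80)                         # stand-in log-mel features
    teacher_h = torch.randn(4, 100, D)                      # stand-in teacher hidden states
    teacher_units = torch.randint(0, N_UNITS, (4, 100))     # stand-in discrete unit labels
    hidden, logits = student(feats)
    print("distillation loss:", distill_loss(hidden, logits, teacher_h, teacher_units).item())
```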
arXiv Detail & Related papers (2022-10-27T17:21:14Z)
- An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition [51.232523987916636]
Differential privacy (DP) is one data protection avenue to safeguard user information used for training deep models by imposing noisy distortion on privacy data.
In this work, we extend PATE learning to work with dynamic patterns, namely speech, and perform one of the first experimental studies on ASR to avoid acoustic data leakage.
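For reference, the core PATE aggregation step is a noisy argmax over teacher votes; the sketch below shows that per-label mechanism under assumed parameters, while extending it to full ASR hypotheses, as the paper studies, is considerably more involved.

```python
# Minimal sketch of the PATE aggregation step: teachers trained on disjoint private
# partitions each vote for a label, Laplace noise is added to the vote histogram, and the
# noisy argmax becomes the label released to the student. Extending this to full ASR
# hypotheses, as the paper studies, is considerably more involved.
import numpy as np


def noisy_aggregate(teacher_votes: np.ndarray, n_classes: int, eps: float = 2.0) -> int:
    """teacher_votes holds one predicted class id per teacher; returns the noisy-argmax label."""
    counts = np.bincount(teacher_votes, minlength=n_classes).astype(float)
    counts += np.random.laplace(scale=1.0 / eps, size=n_classes)   # DP noise on the histogram
    return int(np.argmax(counts))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    votes = rng.integers(0, 10, size=50)     # stand-in: 50 teachers voting over 10 classes
    print("aggregated label:", noisy_aggregate(votes, n_classes=10))
```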
arXiv Detail & Related papers (2022-10-11T16:55:54Z)
- Online Continual Learning of End-to-End Speech Recognition Models [29.931427687979532]
Continual Learning aims to continually learn from new data as it becomes available.
We show that with online continual learning and a selective sampling strategy, we can maintain an accuracy similar to retraining a model from scratch.
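A minimal sketch of one way to combine online updates with selective sampling, assuming a reservoir-style memory and a fixed replay ratio; the paper's actual selection strategy may differ.

```python
# Rough sketch of online continual learning with selective sampling: utterances arrive as a
# stream, a bounded memory keeps a subset via reservoir sampling, and each update mixes
# fresh data with replayed memory. The selection rule and replay ratio are illustrative,
# not the paper's exact strategy.
import random
from typing import Any, Callable, List


class ReservoirMemory:
    """Keeps a uniform random subset of everything seen so far, in fixed space."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.seen = 0
        self.items: List[Any] = []

    def add(self, item: Any) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item            # replace an old item with decreasing probability

    def sample(self, k: int) -> List[Any]:
        return random.sample(self.items, min(k, len(self.items)))


def online_update(train_step: Callable[[List[Any]], None], new_batch: List[Any],
                  memory: ReservoirMemory, replay: int = 8) -> None:
    """One online step: train on fresh utterances plus a replayed subset, then update memory."""
    train_step(new_batch + memory.sample(replay))
    for item in new_batch:
        memory.add(item)


if __name__ == "__main__":
    memory = ReservoirMemory(capacity=100)
    for t in range(50):                                    # simulated stream of small batches
        batch = [f"utt_{t}_{i}" for i in range(4)]
        online_update(lambda b: None, batch, memory)       # train_step is a stand-in trainer
    print("memory holds", len(memory.items), "of", memory.seen, "utterances seen")
```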
arXiv Detail & Related papers (2022-07-11T05:35:06Z)
- Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
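As an illustration of spectrogram augmentation in the spirit of SpecAugment-style masking, the snippet below zeroes random frequency bands and time spans of a log-mel spectrogram; the mask widths and counts are assumed values, not the paper's settings.

```python
# Illustration of spectrogram augmentation in the spirit of SpecAugment-style masking:
# random frequency bands and time spans of a log-mel spectrogram are zeroed so the SER
# model sees perturbed views of the scarce emotional speech data. Mask widths and counts
# are assumed values, not the paper's settings.
import numpy as np


def augment_spectrogram(spec, freq_mask=8, time_mask=20, n_masks=2, rng=None):
    """spec: (n_mels, n_frames) array; returns a copy with random frequency/time bands zeroed."""
    rng = rng or np.random.default_rng()
    out = spec.copy()
    n_mels, n_frames = out.shape
    for _ in range(n_masks):
        f0 = rng.integers(0, max(1, n_mels - freq_mask))
        out[f0:f0 + rng.integers(0, freq_mask + 1), :] = 0.0   # frequency mask
        t0 = rng.integers(0, max(1, n_frames - time_mask))
        out[:, t0:t0 + rng.integers(0, time_mask + 1)] = 0.0   # time mask
    return out


if __name__ == "__main__":
    spec = np.random.rand(64, 300)               # stand-in log-mel spectrogram
    augmented = augment_spectrogram(spec)
    print("masked fraction:", float((augmented == 0).mean()))
```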
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
- Self-Supervised Learning for Personalized Speech Enhancement [25.05285328404576]
Speech enhancement systems can show improved performance by adapting the model towards a single test-time speaker.
A test-time user might only provide a small amount of noise-free speech data, likely insufficient for traditional fully supervised learning.
We propose self-supervised methods designed specifically to learn personalized and discriminative features from abundant noisy, in-the-wild, but still personal speech recordings.
arXiv Detail & Related papers (2021-04-05T17:12:51Z)