Incremental Learning for End-to-End Automatic Speech Recognition
- URL: http://arxiv.org/abs/2005.04288v3
- Date: Thu, 16 Sep 2021 02:47:57 GMT
- Title: Incremental Learning for End-to-End Automatic Speech Recognition
- Authors: Li Fu, Xiaoxiao Li, Libo Zi, Zhengchen Zhang, Youzheng Wu, Xiaodong
He, Bowen Zhou
- Abstract summary: We propose an incremental learning method for end-to-end Automatic Speech Recognition (ASR).
We design a novel explainability-based knowledge distillation for ASR models, which is combined with a response-based knowledge distillation to maintain the original model's predictions and the "reason" for the predictions.
Results on a multi-stage sequential training task show that our method outperforms existing ones in mitigating forgetting.
- Score: 41.297106772785206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose an incremental learning method for end-to-end
Automatic Speech Recognition (ASR) which enables an ASR system to perform well
on new tasks while maintaining the performance on its originally learned ones.
To mitigate catastrophic forgetting during incremental learning, we design a
novel explainability-based knowledge distillation for ASR models, which is
combined with a response-based knowledge distillation to maintain the original
model's predictions and the "reason" for the predictions. Our method works
without access to the training data of original tasks, which addresses the
cases where the previous data is no longer available or joint training is
costly. Results on a multi-stage sequential training task show that our method
outperforms existing ones in mitigating forgetting. Furthermore, in two
practical scenarios, compared to the target-reference joint training method,
the performance drop of our method is 0.02% Character Error Rate (CER), which
is 97% smaller than the drops of the baseline methods.
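The abstract does not spell out the exact form of the two distillation terms, but the overall recipe can be pictured with a minimal sketch: a frozen copy of the original model provides both soft output targets (response-based distillation) and an explanation signal that the adapted model is encouraged to preserve while it trains on new-task data only. The attention-map matching, loss shapes, and weights below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of combining response-based and
# explainability-based knowledge distillation when adapting an ASR model to
# a new task without access to the original task's data.
import torch
import torch.nn.functional as F

def distillation_losses(student_logits, teacher_logits,
                        student_attn, teacher_attn, temperature=2.0):
    """Return (response_kd, explanation_kd) distillation terms.

    student_logits / teacher_logits: (batch, time, vocab) outputs of the
    adapted model and the frozen original model on new-task audio.
    student_attn / teacher_attn: attention maps standing in for the models'
    "reason" for their predictions (an assumed choice of explanation).
    """
    t = temperature
    # Response-based KD: keep the adapted model's output distribution close
    # to the frozen original model's soft predictions.
    response_kd = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)
    # Explainability-based KD (assumed form): also preserve the explanation
    # maps, so the model keeps the same "reason" for its predictions.
    explanation_kd = F.mse_loss(student_attn, teacher_attn)
    return response_kd, explanation_kd

def total_loss(asr_loss, response_kd, explanation_kd,
               lambda_resp=1.0, lambda_expl=1.0):
    # New-task ASR objective (e.g. CTC or attention-based cross-entropy)
    # plus the two distillation terms; the weights are illustrative.
    return asr_loss + lambda_resp * response_kd + lambda_expl * explanation_kd
```

In such a setup only new-task audio is needed at adaptation time; the frozen original model supplies the targets that stand in for the unavailable old-task data.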
Related papers
- Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves average performance increases of 2.7% and 2.6% on the CIFAR-100 and Imagenet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- EsaCL: Efficient Continual Learning of Sparse Models [10.227171407348326]
A key challenge in the continual learning setting is to efficiently learn a sequence of tasks without forgetting how to perform previously learned tasks.
We propose a new method for efficient continual learning of sparse models (EsaCL) that can automatically prune redundant parameters without adversely impacting the model's predictive power.
arXiv Detail & Related papers (2024-01-11T04:59:44Z)
- Class Incremental Learning for Adversarial Robustness [17.06592851567578]
Adversarial training integrates adversarial examples during model training to enhance robustness.
We observe that combining incremental learning with naive adversarial training easily leads to a loss of robustness.
We propose the Flatness Preserving Distillation (FPD) loss that leverages the output difference between adversarial and clean examples.
arXiv Detail & Related papers (2023-12-06T04:38:02Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce Bayesian Adaptive Moment Regularization (BAdam), a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance among prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition [10.69755597323435]
Adapting a trained Automatic Speech Recognition model to new tasks results in catastrophic forgetting of old tasks.
We propose a simple yet effective method to overcome catastrophic forgetting: weight averaging.
We illustrate the effectiveness of our method on both monolingual and multilingual ASR.
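A minimal sketch of the weight-averaging idea, assuming plain parameter interpolation between the original and the fine-tuned ASR model (the paper may use a more refined scheme):

```python
# Illustrative sketch of weight averaging to mitigate forgetting: interpolate
# the parameters of the original ASR model and the model fine-tuned on the
# new task. alpha = 0.5 gives a plain average; both models are assumed to
# share the same architecture.
import torch

@torch.no_grad()
def average_weights(original_model, finetuned_model, alpha=0.5):
    orig_state = original_model.state_dict()
    new_state = finetuned_model.state_dict()
    averaged = {}
    for name, orig_param in orig_state.items():
        if torch.is_floating_point(orig_param):
            averaged[name] = (1.0 - alpha) * orig_param + alpha * new_state[name]
        else:
            # Integer buffers (e.g. batch-norm counters) are copied as-is.
            averaged[name] = new_state[name]
    finetuned_model.load_state_dict(averaged)
    return finetuned_model
```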
arXiv Detail & Related papers (2022-10-27T09:31:37Z)
- SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning [168.89470249446023]
We present SURF, a semi-supervised reward learning framework that utilizes a large amount of unlabeled samples with data augmentation.
In order to leverage unlabeled samples for reward learning, we infer pseudo-labels of the unlabeled samples based on the confidence of the preference predictor.
Our experiments demonstrate that our approach significantly improves the feedback-efficiency of the preference-based method on a variety of locomotion and robotic manipulation tasks.
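The pseudo-labeling step can be pictured with a small sketch (the predictor interface and the 0.95 threshold are assumptions for illustration, not SURF's actual API): an unlabeled pair of behavior segments is kept only if the preference predictor is sufficiently confident, and its hard decision is then used as the preference label.

```python
# Illustrative sketch of confidence-based pseudo-labeling for unlabeled
# segment pairs in preference-based RL. `preference_predictor` is assumed to
# return P(segment_a preferred over segment_b) for each pair.
import torch

@torch.no_grad()
def pseudo_label_pairs(preference_predictor, segment_a, segment_b, tau=0.95):
    p_a = preference_predictor(segment_a, segment_b)  # shape: (num_pairs,)
    confident = (p_a > tau) | (p_a < 1.0 - tau)       # keep high-confidence pairs only
    labels = (p_a > 0.5).long()                       # 1 if segment_a is preferred
    kept = confident.nonzero(as_tuple=True)[0]
    return kept, labels[kept]
```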
arXiv Detail & Related papers (2022-03-18T16:50:38Z)
- Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve the performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)
- Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting [66.45372974713189]
We propose a recall and learn mechanism, which adopts the idea of multi-task learning and jointly learns pretraining tasks and downstream tasks.
Experiments show that our method achieves state-of-the-art performance on the GLUE benchmark.
We provide the open-source RecAdam optimizer, which integrates the proposed mechanisms into Adam, to facilitate adoption by the NLP community.
arXiv Detail & Related papers (2020-04-27T08:59:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.