Damage Control During Domain Adaptation for Transducer Based Automatic
Speech Recognition
- URL: http://arxiv.org/abs/2210.03255v1
- Date: Thu, 6 Oct 2022 23:38:50 GMT
- Title: Damage Control During Domain Adaptation for Transducer Based Automatic
Speech Recognition
- Authors: Somshubra Majumdar, Shantanu Acharya, Vitaly Lavrukhin, Boris Ginsburg
- Abstract summary: A potential drawback of model adaptation to new domains is catastrophic forgetting, where the Word Error Rate on the original domain is significantly degraded.
This paper addresses the situation when we want to simultaneously adapt automatic speech recognition models to a new domain and limit the degradation of accuracy on the original domain, without access to the original training dataset.
We propose several techniques such as a limited training strategy and regularized adapter modules for the Transducer encoder, prediction, and joiner network.
- Score: 13.029537136528521
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic speech recognition models are often adapted to improve their
accuracy in a new domain. A potential drawback of model adaptation to new
domains is catastrophic forgetting, where the Word Error Rate on the original
domain is significantly degraded. This paper addresses the situation when we
want to simultaneously adapt automatic speech recognition models to a new
domain and limit the degradation of accuracy on the original domain without
access to the original training dataset. We propose several techniques such as
a limited training strategy and regularized adapter modules for the Transducer
encoder, prediction, and joiner network. We apply these methods to the Google
Speech Commands and to the UK and Ireland English Dialect speech data set and
obtain strong results on the new target domain while limiting the degradation
on the original domain.
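As a rough sketch (not the authors' released implementation), the regularized adapter idea can be pictured as a small residual bottleneck inserted into a frozen Transducer encoder, prediction, or joiner layer, with an L2 penalty pulling the adapted model back toward the original one; the module shape, bottleneck size, and the adapter_l2_penalty helper below are illustrative assumptions.

    import torch
    import torch.nn as nn

    class ResidualAdapter(nn.Module):
        # Small bottleneck adapter added after a frozen Transducer layer.
        # Zero-initialising the up-projection makes the module start as an
        # identity mapping, so the original model's outputs are unchanged
        # before adaptation begins.
        def __init__(self, d_model: int, bottleneck: int = 64):
            super().__init__()
            self.down = nn.Linear(d_model, bottleneck)
            self.up = nn.Linear(bottleneck, d_model)
            nn.init.zeros_(self.up.weight)
            nn.init.zeros_(self.up.bias)

        def forward(self, x):
            return x + self.up(torch.relu(self.down(x)))

    def adapter_l2_penalty(adapters, weight: float = 1e-3):
        # Hypothetical regularizer: shrinking adapter weights toward zero keeps
        # the adapted network close to the frozen original, which is one way to
        # limit catastrophic forgetting on the source domain.
        return weight * sum((p ** 2).sum() for a in adapters for p in a.parameters())

    # Toy usage: in practice the base Transducer stays frozen and only the
    # adapters are trained on the new-domain data.
    adapter = ResidualAdapter(d_model=512)
    features = torch.randn(4, 100, 512)            # (batch, time, model dim)
    out = adapter(features)
    loss = out.pow(2).mean() + adapter_l2_penalty([adapter])
    loss.backward()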
Related papers
- Progressive Conservative Adaptation for Evolving Target Domains [76.9274842289221]
Conventional domain adaptation typically transfers knowledge from a source domain to a stationary target domain.
Restoring and adapting to such continually evolving target data results in escalating computational and resource consumption over time.
We propose a simple yet effective approach, termed progressive conservative adaptation (PCAda).
arXiv Detail & Related papers (2024-02-07T04:11:25Z)
- Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations [21.423349835589793]
Self-Supervised Learning (SSL) has allowed leveraging large amounts of unlabeled speech data to improve the performance of speech recognition models.
Despite this, speech SSL representations may fail when facing an acoustic mismatch between the pretraining and target datasets.
We propose a novel supervised domain adaptation method, designed for cases exhibiting such a mismatch in acoustic domains.
arXiv Detail & Related papers (2023-06-01T09:30:49Z)
- SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for Classification in Low-Resource Domains [14.096170976149521]
SwitchPrompt is a novel and lightweight prompting methodology for adapting language models trained on general-domain datasets to diverse low-resource domains.
Our few-shot experiments on three text classification benchmarks demonstrate the efficacy of the general-domain pre-trained language models when used with SwitchPrompt.
They often even outperform their domain-specific counterparts trained with baseline state-of-the-art prompting methods, with accuracy gains of up to 10.7%.
arXiv Detail & Related papers (2023-02-14T07:14:08Z)
- Domain Adaptation via Prompt Learning [39.97105851723885]
Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain.
We introduce a novel prompt learning paradigm for UDA, named Domain Adaptation via Prompt Learning (DAPL).
arXiv Detail & Related papers (2022-02-14T13:25:46Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of augmenting a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with a wide range of potential applications.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
- Unsupervised Domain Adaptation in Speech Recognition using Phonetic Features [6.872447420442981]
We propose a technique to perform unsupervised gender-based domain adaptation in speech recognition using phonetic features.
Experiments on the TIMIT dataset show a considerable decrease in phoneme error rate with the proposed approach.
arXiv Detail & Related papers (2021-08-04T06:22:12Z)
- Neural Supervised Domain Adaptation by Augmenting Pre-trained Models with Random Units [14.183224769428843]
Neural Transfer Learning (TL) is becoming ubiquitous in Natural Language Processing (NLP).
In this paper, we show through interpretation methods that such a scheme, despite its efficiency, suffers from a main limitation.
We propose to augment the pre-trained model with normalised, weighted, and randomly initialised units that foster better adaptation while maintaining the valuable source knowledge.
arXiv Detail & Related papers (2021-06-09T09:29:11Z)
- Gradient Regularized Contrastive Learning for Continual Domain Adaptation [86.02012896014095]
We study the problem of continual domain adaptation, where the model is presented with a labelled source domain and a sequence of unlabelled target domains.
We propose Gradient Regularized Contrastive Learning (GRCL) to address the obstacles that arise in this setting.
Experiments on Digits, DomainNet and Office-Caltech benchmarks demonstrate the strong performance of our approach.
arXiv Detail & Related papers (2021-03-23T04:10:42Z)
- DEAAN: Disentangled Embedding and Adversarial Adaptation Network for Robust Speaker Representation Learning [69.70594547377283]
We propose a novel framework to disentangle speaker-related and domain-specific features.
Our framework can effectively generate more speaker-discriminative and domain-invariant speaker representations.
arXiv Detail & Related papers (2020-12-12T19:46:56Z)
- Understanding Self-Training for Gradual Domain Adaptation [107.37869221297687]
We consider gradual domain adaptation, where the goal is to adapt an initial classifier trained on a source domain given only unlabeled data that shifts gradually in distribution towards a target domain.
We prove the first non-vacuous upper bound on the error of self-training with gradual shifts, under settings where directly adapting to the target domain can result in unbounded error.
The theoretical analysis leads to algorithmic insights, highlighting that regularization and label sharpening are essential even when we have infinite data, and suggesting that self-training works particularly well for shifts with small Wasserstein-infinity distance.
arXiv Detail & Related papers (2020-02-26T08:59:40Z)
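As an illustrative toy sketch of the gradual self-training recipe summarized in the last entry above (not that paper's exact algorithm), the classifier pseudo-labels each successive unlabelled domain with hard labels (the "label sharpening" mentioned there) and is refit with regularization before moving on; the blob data and hyperparameters below are invented for the example.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def gradual_self_train(clf, intermediate_domains, C=0.1):
        # Pseudo-label each unlabelled intermediate domain with hard labels
        # ("label sharpening") and refit a regularized classifier on them.
        for X in intermediate_domains:
            pseudo = clf.predict(X)
            clf = LogisticRegression(C=C).fit(X, pseudo)
        return clf

    # Toy data: two Gaussian blobs whose means drift gradually over time.
    rng = np.random.default_rng(0)
    Xs = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
    ys = np.array([0] * 100 + [1] * 100)
    source_clf = LogisticRegression(C=0.1).fit(Xs, ys)
    shifted_domains = [Xs + s for s in (1.0, 2.0, 3.0)]   # gradual shift
    adapted_clf = gradual_self_train(source_clf, shifted_domains)
    print(adapted_clf.score(Xs + 3.0, ys))                # accuracy on the final target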