Boosting Cross-Domain Speech Recognition with Self-Supervision
- URL: http://arxiv.org/abs/2206.09783v2
- Date: Sun, 30 Jul 2023 04:58:58 GMT
- Title: Boosting Cross-Domain Speech Recognition with Self-Supervision
- Authors: Han Zhu, Gaofeng Cheng, Jindong Wang, Wenxin Hou, Pengyuan Zhang,
Yonghong Yan
- Abstract summary: Cross-domain performance of automatic speech recognition (ASR) can be severely hampered by the mismatch between training and testing distributions.
Previous work has shown that self-supervised learning (SSL) or pseudo-labeling (PL) is effective in UDA by exploiting the self-supervision of unlabeled data.
This work presents a systematic UDA framework to fully utilize the unlabeled data with self-supervision in the pre-training and fine-tuning paradigm.
- Score: 35.01508881708751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The cross-domain performance of automatic speech recognition (ASR) can be
severely hampered by the mismatch between training and testing
distributions. Since the target domain usually lacks labeled data, and domain
shifts exist at acoustic and linguistic levels, it is challenging to perform
unsupervised domain adaptation (UDA) for ASR. Previous work has shown that
self-supervised learning (SSL) or pseudo-labeling (PL) is effective in UDA by
exploiting the self-supervision of unlabeled data. However, these
self-supervision signals also degrade under mismatched domain
distributions, an issue that previous work fails to address. This work presents a
systematic UDA framework to fully utilize the unlabeled data with
self-supervision in the pre-training and fine-tuning paradigm. On the one hand,
we apply continued pre-training and data replay techniques to mitigate the
domain mismatch of the SSL pre-trained model. On the other hand, we propose a
domain-adaptive fine-tuning approach based on the PL technique with three
unique modifications: Firstly, we design a dual-branch PL method to decrease
the sensitivity to the erroneous pseudo-labels; Secondly, we devise an
uncertainty-aware confidence filtering strategy to improve pseudo-label
correctness; Thirdly, we introduce a two-step PL approach to incorporate target
domain linguistic knowledge, thus generating more accurate target domain
pseudo-labels. Experimental results on various cross-domain scenarios
demonstrate that the proposed approach effectively boosts the cross-domain
performance and significantly outperforms previous approaches.
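
For concreteness, the following is a minimal sketch of the data-replay idea used during continued pre-training: each target-domain batch is padded with a small fraction of source-domain utterances so the SSL model does not drift too far from its original distribution. The function name, the 25% replay ratio, and the sampling scheme are illustrative assumptions, not the paper's exact recipe.

```python
# Hedged sketch of "data replay" during continued pre-training.
# All names and ratios here are illustrative placeholders.
import random
from typing import Iterator, List, Sequence


def replay_batches(
    target_utts: Sequence[str],
    source_utts: Sequence[str],
    batch_size: int = 8,
    replay_ratio: float = 0.25,
    seed: int = 0,
) -> Iterator[List[str]]:
    """Yield batches of utterance IDs in which roughly `replay_ratio` of each
    batch is replayed source-domain data and the rest is target-domain data."""
    rng = random.Random(seed)
    n_replay = max(1, int(round(batch_size * replay_ratio)))
    n_target = batch_size - n_replay
    target = list(target_utts)
    rng.shuffle(target)
    for start in range(0, len(target) - n_target + 1, n_target):
        batch = target[start:start + n_target]
        batch += rng.sample(list(source_utts), n_replay)  # replayed source utterances
        rng.shuffle(batch)
        yield batch


# Example: 75% target-domain / 25% replayed source-domain per batch.
tgt = [f"tgt_{i}" for i in range(32)]
src = [f"src_{i}" for i in range(100)]
for batch in replay_batches(tgt, src, batch_size=8, replay_ratio=0.25):
    print(batch)
```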
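The domain-adaptive fine-tuning stage combines pseudo-labeling, a dual-branch (teacher/student) arrangement, and confidence filtering. The sketch below illustrates that general recipe with a toy CTC model; `AcousticModel`, the mean log-probability confidence proxy, the threshold value, and the EMA teacher update are assumptions introduced here for illustration and should not be read as the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AcousticModel(nn.Module):
    """Toy CTC-style acoustic model standing in for an SSL-pre-trained encoder."""

    def __init__(self, feat_dim=80, vocab_size=32):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, 128, batch_first=True)
        self.head = nn.Linear(128, vocab_size)

    def forward(self, feats):                           # feats: (B, T, feat_dim)
        hidden, _ = self.encoder(feats)
        return self.head(hidden).log_softmax(dim=-1)    # (B, T, vocab)


@torch.no_grad()
def pseudo_label(teacher, feats, conf_threshold=-3.5, blank_id=0):
    """Greedy-decode pseudo-labels and keep only confident utterances.

    Confidence is the mean per-frame log-probability of the argmax path,
    a simple stand-in for uncertainty-aware filtering; the threshold is illustrative.
    """
    log_probs = teacher(feats)                          # (B, T, V)
    best_lp, best_ids = log_probs.max(dim=-1)
    kept, labels, label_lens = [], [], []
    for b in range(feats.size(0)):
        if best_lp[b].mean().item() < conf_threshold:
            continue                                    # drop low-confidence utterance
        ids = torch.unique_consecutive(best_ids[b])     # collapse repeats (CTC greedy)
        ids = ids[ids != blank_id]                      # remove blanks
        if ids.numel() == 0:
            continue
        kept.append(b)
        labels.append(ids)
        label_lens.append(ids.numel())
    return kept, labels, label_lens


def pl_step(student, teacher, feats, optimizer, ema_decay=0.999):
    """One pseudo-labeling step: the teacher branch labels the batch, the student
    branch trains on the surviving labels, and the teacher is updated as an
    exponential moving average (EMA) of the student (one dual-branch arrangement)."""
    kept, labels, label_lens = pseudo_label(teacher, feats)
    if not kept:
        return None                                     # nothing confident enough
    log_probs = student(feats[kept]).transpose(0, 1)    # CTC expects (T, B, V)
    loss = F.ctc_loss(
        log_probs,
        torch.cat(labels),
        torch.full((len(kept),), log_probs.size(0), dtype=torch.long),
        torch.tensor(label_lens, dtype=torch.long),
        blank=0,
        zero_infinity=True,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():                               # EMA update of the teacher branch
        for p_t, p_s in zip(teacher.parameters(), student.parameters()):
            p_t.mul_(ema_decay).add_(p_s, alpha=1.0 - ema_decay)
    return loss.item()


if __name__ == "__main__":
    student, teacher = AcousticModel(), AcousticModel()
    teacher.load_state_dict(student.state_dict())
    optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
    feats = torch.randn(4, 200, 80)                     # unlabeled target-domain features
    print(pl_step(student, teacher, feats, optimizer))
```

The two-step PL idea of incorporating target-domain linguistic knowledge (e.g., rescoring or relabeling with a target-domain language model) would plug into `pseudo_label` above, but is omitted here to keep the sketch self-contained.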
Related papers
- Attentive Continuous Generative Self-training for Unsupervised Domain
Adaptive Medical Image Translation [12.080054869408213]
We develop a generative self-training framework for domain adaptive image translation with continuous value prediction and regression objectives.
We evaluate our framework on two cross-scanner/center, inter-subject translation tasks, including tagged-to-cine magnetic resonance (MR) image translation and T1-weighted MR-to-fractional anisotropy translation.
arXiv Detail & Related papers (2023-05-23T23:57:44Z)
- Joint Attention-Driven Domain Fusion and Noise-Tolerant Learning for
Multi-Source Domain Adaptation [2.734665397040629]
Multi-source Unsupervised Domain Adaptation transfers knowledge from multiple source domains with labeled data to an unlabeled target domain.
The distribution discrepancy between different domains and the noisy pseudo-labels in the target domain both lead to performance bottlenecks.
We propose an approach that integrates Attention-driven Domain fusion and Noise-Tolerant learning (ADNT) to address the two issues mentioned above.
arXiv Detail & Related papers (2022-08-05T01:08:41Z)
- Boosting Unsupervised Domain Adaptation with Soft Pseudo-label and
Curriculum Learning [19.903568227077763]
Unsupervised domain adaptation (UDA) improves classification performance on an unlabeled target domain by leveraging data from a fully labeled source domain.
We propose a model-agnostic two-stage learning framework that greatly reduces flawed model predictions using a soft pseudo-label strategy.
In the second stage, we propose a curriculum learning strategy to adaptively control the weighting between losses from the two domains.
arXiv Detail & Related papers (2021-12-03T14:47:32Z)
- Joint Distribution Alignment via Adversarial Learning for Domain
Adaptive Object Detection [11.262560426527818]
Unsupervised domain adaptive object detection aims to adapt a well-trained detector from its original source domain with rich labeled data to a new target domain with unlabeled data.
Recently, mainstream approaches perform this task through adversarial learning, yet still suffer from two limitations.
We propose a joint adaptive detection framework (JADF) to address the above challenges.
arXiv Detail & Related papers (2021-09-19T00:27:08Z)
- Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training
for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with a wide range of potential applications.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
- Contrastive Learning and Self-Training for Unsupervised Domain
Adaptation in Semantic Segmentation [71.77083272602525]
UDA attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z)
- Cycle Self-Training for Domain Adaptation [85.14659717421533]
Cycle Self-Training (CST) is a principled self-training algorithm that enforces pseudo-labels to generalize across domains.
In a simplified theoretical analysis, CST recovers the target ground truth, while both invariant feature learning and vanilla self-training fail.
Empirical results indicate that CST significantly improves over prior state-of-the-art methods on standard UDA benchmarks.
arXiv Detail & Related papers (2021-03-05T10:04:25Z)
- Selective Pseudo-Labeling with Reinforcement Learning for
Semi-Supervised Domain Adaptation [116.48885692054724]
We propose a reinforcement learning based selective pseudo-labeling method for semi-supervised domain adaptation.
We develop a deep Q-learning model to select both accurate and representative pseudo-labeled instances.
Our proposed method is evaluated on several SSDA benchmark datasets and outperforms all compared methods.
arXiv Detail & Related papers (2020-12-07T03:37:38Z)
- Effective Label Propagation for Discriminative Semi-Supervised Domain
Adaptation [76.41664929948607]
Semi-supervised domain adaptation (SSDA) methods have demonstrated great potential in large-scale image classification tasks.
We present a novel and effective method to tackle this problem by using effective inter-domain and intra-domain semantic information propagation.
Our source code and pre-trained models will be released soon.
arXiv Detail & Related papers (2020-12-04T14:28:19Z)
- Unsupervised Domain Adaptation for Speech Recognition via Uncertainty
Driven Self-Training [55.824641135682725]
Domain adaptation experiments using WSJ as the source domain and TED-LIUM 3 as well as SWITCHBOARD as target domains show that up to 80% of the performance of a system trained on ground-truth data can be recovered.
arXiv Detail & Related papers (2020-11-26T18:51:26Z)