Boosting Cross-Domain Speech Recognition with Self-Supervision
- URL: http://arxiv.org/abs/2206.09783v2
- Date: Sun, 30 Jul 2023 04:58:58 GMT
- Title: Boosting Cross-Domain Speech Recognition with Self-Supervision
- Authors: Han Zhu, Gaofeng Cheng, Jindong Wang, Wenxin Hou, Pengyuan Zhang,
Yonghong Yan
- Abstract summary: Cross-domain performance of automatic speech recognition (ASR) can be severely hampered by the mismatch between training and testing distributions.
Previous work has shown that self-supervised learning (SSL) or pseudo-labeling (PL) is effective in UDA by exploiting the self-supervision of unlabeled data.
This work presents a systematic UDA framework to fully utilize the unlabeled data with self-supervision in the pre-training and fine-tuning paradigm.
- Score: 35.01508881708751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The cross-domain performance of automatic speech recognition (ASR) can be
severely hampered by the mismatch between training and testing
distributions. Since the target domain usually lacks labeled data, and domain
shifts exist at acoustic and linguistic levels, it is challenging to perform
unsupervised domain adaptation (UDA) for ASR. Previous work has shown that
self-supervised learning (SSL) or pseudo-labeling (PL) is effective in UDA by
exploiting the self-supervision of unlabeled data. However, these
self-supervision signals also degrade under mismatched domain
distributions, an issue that previous work fails to address. This work presents a
systematic UDA framework to fully utilize the unlabeled data with
self-supervision in the pre-training and fine-tuning paradigm. On the one hand,
we apply continued pre-training and data replay techniques to mitigate the
domain mismatch of the SSL pre-trained model. On the other hand, we propose a
domain-adaptive fine-tuning approach based on the PL technique with three
unique modifications: Firstly, we design a dual-branch PL method to decrease
the sensitivity to the erroneous pseudo-labels; Secondly, we devise an
uncertainty-aware confidence filtering strategy to improve pseudo-label
correctness; Thirdly, we introduce a two-step PL approach to incorporate target
domain linguistic knowledge, thus generating more accurate target domain
pseudo-labels. Experimental results on various cross-domain scenarios
demonstrate that the proposed approach effectively boosts the cross-domain
performance and significantly outperforms previous approaches.
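
For concreteness, the following is a minimal sketch of the data-replay idea used during continued pre-training: each target-domain batch is padded with a small fraction of source-domain utterances so the SSL model does not drift too far from its original distribution. The function name, the 25% replay ratio, and the sampling scheme are illustrative assumptions, not the paper's exact recipe.

```python
# Hedged sketch of "data replay" during continued pre-training.
# All names and ratios here are illustrative placeholders.
import random
from typing import Iterator, List, Sequence


def replay_batches(
    target_utts: Sequence[str],
    source_utts: Sequence[str],
    batch_size: int = 8,
    replay_ratio: float = 0.25,
    seed: int = 0,
) -> Iterator[List[str]]:
    """Yield batches of utterance IDs in which roughly `replay_ratio` of each
    batch is replayed source-domain data and the rest is target-domain data."""
    rng = random.Random(seed)
    n_replay = max(1, int(round(batch_size * replay_ratio)))
    n_target = batch_size - n_replay
    target = list(target_utts)
    rng.shuffle(target)
    for start in range(0, len(target) - n_target + 1, n_target):
        batch = target[start:start + n_target]
        batch += rng.sample(list(source_utts), n_replay)  # replayed source utterances
        rng.shuffle(batch)
        yield batch


# Example: 75% target-domain / 25% replayed source-domain per batch.
tgt = [f"tgt_{i}" for i in range(32)]
src = [f"src_{i}" for i in range(100)]
for batch in replay_batches(tgt, src, batch_size=8, replay_ratio=0.25):
    print(batch)
```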
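The domain-adaptive fine-tuning stage combines pseudo-labeling, a dual-branch (teacher/student) arrangement, and confidence filtering. The sketch below illustrates that general recipe with a toy CTC model; `AcousticModel`, the mean log-probability confidence proxy, the threshold value, and the EMA teacher update are assumptions introduced here for illustration and should not be read as the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AcousticModel(nn.Module):
    """Toy CTC-style acoustic model standing in for an SSL-pre-trained encoder."""

    def __init__(self, feat_dim=80, vocab_size=32):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, 128, batch_first=True)
        self.head = nn.Linear(128, vocab_size)

    def forward(self, feats):                           # feats: (B, T, feat_dim)
        hidden, _ = self.encoder(feats)
        return self.head(hidden).log_softmax(dim=-1)    # (B, T, vocab)


@torch.no_grad()
def pseudo_label(teacher, feats, conf_threshold=-3.5, blank_id=0):
    """Greedy-decode pseudo-labels and keep only confident utterances.

    Confidence is the mean per-frame log-probability of the argmax path,
    a simple stand-in for uncertainty-aware filtering; the threshold is illustrative.
    """
    log_probs = teacher(feats)                          # (B, T, V)
    best_lp, best_ids = log_probs.max(dim=-1)
    kept, labels, label_lens = [], [], []
    for b in range(feats.size(0)):
        if best_lp[b].mean().item() < conf_threshold:
            continue                                    # drop low-confidence utterance
        ids = torch.unique_consecutive(best_ids[b])     # collapse repeats (CTC greedy)
        ids = ids[ids != blank_id]                      # remove blanks
        if ids.numel() == 0:
            continue
        kept.append(b)
        labels.append(ids)
        label_lens.append(ids.numel())
    return kept, labels, label_lens


def pl_step(student, teacher, feats, optimizer, ema_decay=0.999):
    """One pseudo-labeling step: the teacher branch labels the batch, the student
    branch trains on the surviving labels, and the teacher is updated as an
    exponential moving average (EMA) of the student (one dual-branch arrangement)."""
    kept, labels, label_lens = pseudo_label(teacher, feats)
    if not kept:
        return None                                     # nothing confident enough
    log_probs = student(feats[kept]).transpose(0, 1)    # CTC expects (T, B, V)
    loss = F.ctc_loss(
        log_probs,
        torch.cat(labels),
        torch.full((len(kept),), log_probs.size(0), dtype=torch.long),
        torch.tensor(label_lens, dtype=torch.long),
        blank=0,
        zero_infinity=True,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():                               # EMA update of the teacher branch
        for p_t, p_s in zip(teacher.parameters(), student.parameters()):
            p_t.mul_(ema_decay).add_(p_s, alpha=1.0 - ema_decay)
    return loss.item()


if __name__ == "__main__":
    student, teacher = AcousticModel(), AcousticModel()
    teacher.load_state_dict(student.state_dict())
    optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
    feats = torch.randn(4, 200, 80)                     # unlabeled target-domain features
    print(pl_step(student, teacher, feats, optimizer))
```

The two-step PL idea of incorporating target-domain linguistic knowledge (e.g., rescoring or relabeling with a target-domain language model) would plug into `pseudo_label` above, but is omitted here to keep the sketch self-contained.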
Related papers
- Attentive Continuous Generative Self-training for Unsupervised Domain
Adaptive Medical Image Translation [12.080054869408213]
We develop a generative self-training framework for domain adaptive image translation with continuous value prediction and regression objectives.
We evaluate our framework on two cross-scanner/center, inter-subject translation tasks, including tagged-to-cine magnetic resonance (MR) image translation and T1-weighted MR-to-fractional anisotropy translation.
arXiv Detail & Related papers (2023-05-23T23:57:44Z)
- Joint Attention-Driven Domain Fusion and Noise-Tolerant Learning for
Multi-Source Domain Adaptation [2.734665397040629]
Multi-source Unsupervised Domain Adaptation transfers knowledge from multiple source domains with labeled data to an unlabeled target domain.
The distribution discrepancy between different domains and the noisy pseudo-labels in the target domain both lead to performance bottlenecks.
We propose an approach that integrates Attention-driven Domain fusion and Noise-Tolerant learning (ADNT) to address the two issues mentioned above.
arXiv Detail & Related papers (2022-08-05T01:08:41Z)
- Boosting Unsupervised Domain Adaptation with Soft Pseudo-label and
Curriculum Learning [19.903568227077763]
Unsupervised domain adaptation (UDA) improves classification performance on an unlabeled target domain by leveraging data from a fully labeled source domain.
We propose a model-agnostic two-stage learning framework that greatly reduces flawed model predictions using a soft pseudo-label strategy.
In the second stage, we propose a curriculum learning strategy to adaptively control the weighting between losses from the two domains.
arXiv Detail & Related papers (2021-12-03T14:47:32Z)
- Joint Distribution Alignment via Adversarial Learning for Domain
Adaptive Object Detection [11.262560426527818]
Unsupervised domain adaptive object detection aims to adapt a well-trained detector from its original source domain with rich labeled data to a new target domain with unlabeled data.
Recently, mainstream approaches perform this task through adversarial learning, yet still suffer from two limitations.
We propose a joint adaptive detection framework (JADF) to address the above challenges.
arXiv Detail & Related papers (2021-09-19T00:27:08Z)
- Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training
for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with a wide range of potential applications.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
- Contrastive Learning and Self-Training for Unsupervised Domain
Adaptation in Semantic Segmentation [71.77083272602525]
UDA attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z)
- Cycle Self-Training for Domain Adaptation [85.14659717421533]
Cycle Self-Training (CST) is a principled self-training algorithm that enforces pseudo-labels to generalize across domains.
In a simplified theoretical analysis, CST recovers the target ground truth, while both invariant feature learning and vanilla self-training fail.
Empirical results indicate that CST significantly improves over prior state-of-the-art methods on standard UDA benchmarks.
arXiv Detail & Related papers (2021-03-05T10:04:25Z)
- Selective Pseudo-Labeling with Reinforcement Learning for
Semi-Supervised Domain Adaptation [116.48885692054724]
We propose a reinforcement learning based selective pseudo-labeling method for semi-supervised domain adaptation.
We develop a deep Q-learning model to select both accurate and representative pseudo-labeled instances.
Our proposed method is evaluated on several SSDA benchmark datasets and outperforms all compared methods.
arXiv Detail & Related papers (2020-12-07T03:37:38Z)
- Effective Label Propagation for Discriminative Semi-Supervised Domain
Adaptation [76.41664929948607]
Semi-supervised domain adaptation (SSDA) methods have demonstrated great potential in large-scale image classification tasks.
We present a novel and effective method to tackle this problem by using effective inter-domain and intra-domain semantic information propagation.
Our source code and pre-trained models will be released soon.
arXiv Detail & Related papers (2020-12-04T14:28:19Z)
- Unsupervised Domain Adaptation for Speech Recognition via Uncertainty
Driven Self-Training [55.824641135682725]
Domain adaptation experiments using WSJ as the source domain and TED-LIUM 3 as well as SWITCHBOARD as target domains show that up to 80% of the performance of a system trained on ground-truth data can be recovered.
arXiv Detail & Related papers (2020-11-26T18:51:26Z)