Cross-domain Speech Recognition with Unsupervised Character-level
Distribution Matching
- URL: http://arxiv.org/abs/2104.07491v2
- Date: Fri, 16 Apr 2021 02:10:45 GMT
- Title: Cross-domain Speech Recognition with Unsupervised Character-level
Distribution Matching
- Authors: Wenxin Hou, Jindong Wang, Xu Tan, Tao Qin, Takahiro Shinozaki
- Abstract summary: We propose CMatch, a Character-level distribution matching method to perform fine-grained adaptation between each character in two domains.
Experiments on the Libri-Adapt dataset show that our proposed approach achieves 14.39% and 16.50% relative Word Error Rate (WER) reduction on both cross-device and cross-environment ASR.
- Score: 60.8427677151492
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: End-to-end automatic speech recognition (ASR) can achieve promising
performance with large-scale training data. However, it is known that domain
mismatch between training and testing data often leads to a degradation of
recognition accuracy. In this work, we focus on the unsupervised domain
adaptation for ASR and propose CMatch, a Character-level distribution matching
method to perform fine-grained adaptation between each character in two
domains. First, to obtain labels for the features belonging to each character,
we achieve frame-level label assignment using the Connectionist Temporal
Classification (CTC) pseudo labels. Then, we match the character-level
distributions using Maximum Mean Discrepancy. We train our algorithm using the
self-training technique. Experiments on the Libri-Adapt dataset show that our
proposed approach achieves 14.39% and 16.50% relative Word Error Rate (WER)
reduction on both cross-device and cross-environment ASR. We also
comprehensively analyze the different strategies for frame-level label
assignment and Transformer adaptations.
Related papers
- JointMatch: A Unified Approach for Diverse and Collaborative
Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z) - Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object
Detection with Repeated Labels [6.872072177648135]
We propose a novel localization algorithm that adapts well-established ground truth estimation methods.
Our algorithm also shows superior performance during training on the TexBiG dataset.
arXiv Detail & Related papers (2023-09-18T13:08:44Z) - Cycle Label-Consistent Networks for Unsupervised Domain Adaptation [57.29464116557734]
Domain adaptation aims to leverage a labeled source domain to learn a classifier for the unlabeled target domain with a different distribution.
We propose a simple yet efficient domain adaptation method, i.e. Cycle Label-Consistent Network (CLCN), by exploiting the cycle consistency of classification label.
We demonstrate the effectiveness of our approach on MNIST-USPS-SVHN, Office-31, Office-Home and Image CLEF-DA benchmarks.
arXiv Detail & Related papers (2022-05-27T13:09:08Z) - Contrastive Learning and Self-Training for Unsupervised Domain
Adaptation in Semantic Segmentation [71.77083272602525]
UDA attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z) - Dual-Refinement: Joint Label and Feature Refinement for Unsupervised
Domain Adaptive Person Re-Identification [51.98150752331922]
Unsupervised domain adaptive (UDA) person re-identification (re-ID) is a challenging task due to the missing of labels for the target domain data.
We propose a novel approach, called Dual-Refinement, that jointly refines pseudo labels at the off-line clustering phase and features at the on-line training phase.
Our method outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-12-26T07:35:35Z) - Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive
Person Re-Identification [64.37745443119942]
This paper jointly enforces visual and temporal consistency in the combination of a local one-hot classification and a global multi-class classification.
Experimental results on three large-scale ReID datasets demonstrate the superiority of proposed method in both unsupervised and unsupervised domain adaptive ReID tasks.
arXiv Detail & Related papers (2020-07-21T14:31:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.