Pseudo2Real: Task Arithmetic for Pseudo-Label Correction in Automatic Speech Recognition
- URL: http://arxiv.org/abs/2510.08047v1
- Date: Thu, 09 Oct 2025 10:31:47 GMT
- Title: Pseudo2Real: Task Arithmetic for Pseudo-Label Correction in Automatic Speech Recognition
- Authors: Yi-Cheng Lin, Yu-Hsuan Li Liang, Hsuan Su, Tzu-Quan Lin, Shang-Tse Chen, Yun-Nung Chen, Hung-yi Lee
- Abstract summary: Real-world systems encounter unseen accents and domains with limited labeled data. Pseudo-labeling often introduces systematic, accent-specific errors that filtering fails to fix. We propose a simple parameter-space correction for these recurring biases that requires no target ground truth.
- Score: 61.712328155788434
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Robust ASR under domain shift is crucial because real-world systems encounter unseen accents and domains with limited labeled data. Although pseudo-labeling offers a practical workaround, it often introduces systematic, accent-specific errors that filtering fails to fix. We ask: How can we correct these recurring biases without target ground truth? We propose a simple parameter-space correction: in a source domain containing both real and pseudo-labeled data, two ASR models are fine-tuned from the same initialization, one on ground-truth labels and the other on pseudo-labels, and their weight difference forms a correction vector that captures pseudo-label biases. When applied to a pseudo-labeled target model, this vector enhances recognition, achieving up to a 35% relative Word Error Rate (WER) reduction on AfriSpeech-200 across ten African accents with the Whisper tiny model.
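The correction described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: weights are plain Python dictionaries standing in for model checkpoints, the helper names `correction_vector` and `apply_correction` are hypothetical, and both the sign convention (real minus pseudo) and the `scale` coefficient are assumptions based on how task arithmetic is usually written.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def correction_vector(real_src: Weights, pseudo_src: Weights) -> Weights:
    """Difference between two source-domain models fine-tuned from the same
    initialization: one on ground-truth labels, one on pseudo-labels.
    Assumed sign convention: real minus pseudo."""
    return {
        name: [r - p for r, p in zip(real_src[name], pseudo_src[name])]
        for name in real_src
    }


def apply_correction(pseudo_tgt: Weights, tau: Weights, scale: float = 1.0) -> Weights:
    """Add the (optionally scaled) correction vector to a target-domain model
    that was fine-tuned on pseudo-labels only. `scale` is a hypothetical knob."""
    return {
        name: [w + scale * t for w, t in zip(pseudo_tgt[name], tau[name])]
        for name in pseudo_tgt
    }


# Made-up numbers purely for illustration.
real_src = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
pseudo_src = {"layer.weight": [0.5, 2.5], "layer.bias": [0.0]}
pseudo_tgt = {"layer.weight": [2.0, 3.0], "layer.bias": [1.0]}

tau = correction_vector(real_src, pseudo_src)
corrected = apply_correction(pseudo_tgt, tau, scale=1.0)
print(corrected)
```

With real checkpoints the same element-wise arithmetic would run over every tensor in the model's parameter dictionary; all three models must share one architecture so that parameter names and shapes line up.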
Related papers
- ReHear: Iterative Pseudo-Label Refinement for Semi-Supervised Speech Recognition via Audio Large Language Models [12.527207210862151]
ReHear is a framework for iterative pseudo-label refinement in automatic speech recognition. It integrates an instruction-tuned, audio-aware large language model into the self-training loop. We show that ReHear effectively mitigates error propagation, consistently outperforming both supervised and pseudo-labeling baselines.
arXiv Detail & Related papers (2026-02-21T05:04:22Z)
- Retrieval-Augmented Self-Taught Reasoning Model with Adaptive Chain-of-Thought for ASR Named Entity Correction [12.483998165719981]
We propose a retrieval-augmented generation framework for correcting named entity errors in automatic speech recognition (ASR). Our approach consists of two key components: (1) a rephrasing language model (RLM) for named entity recognition, followed by candidate retrieval using a phonetic-level edit distance; and (2) a novel self-taught reasoning model with adaptive chain-of-thought (A-STAR) that dynamically adjusts the depth of its reasoning based on task difficulty.
arXiv Detail & Related papers (2026-01-21T15:05:39Z)
- Towards Micro-Action Recognition with Limited Annotations: An Asynchronous Pseudo Labeling and Training Approach [35.32024173141412]
We introduce the setting of Semi-Supervised MAR (SSMAR), where only a part of samples are labeled. Traditional Semi-Supervised Learning (SSL) methods tend to overfit on inaccurate pseudo-labels, leading to error accumulation and degraded performance. We propose Asynchronous Pseudo Labeling and Training (APLT), which explicitly separates the pseudo-labeling process from model training.
arXiv Detail & Related papers (2025-04-10T14:22:15Z)
- Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition [52.624909026294105]
We propose a non-autoregressive speech error correction method.
A Confidence Module measures the uncertainty of each word of the N-best ASR hypotheses.
The proposed system reduces the error rate by 21% compared with the ASR model.
arXiv Detail & Related papers (2024-06-29T17:56:28Z)
- Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition [49.42732949233184]
When labeled data is insufficient, semi-supervised learning with the pseudo-labeling technique can significantly improve the performance of automatic speech recognition.
Taking noisy labels as ground-truth in the loss function results in suboptimal performance.
We propose a novel framework named alternative pseudo-labeling to tackle the issue of noisy pseudo-labels.
arXiv Detail & Related papers (2023-08-12T12:13:52Z)
- Combating Confirmation Bias: A Unified Pseudo-Labeling Framework for Entity Alignment [30.407534668054286]
We propose a Unified Pseudo-Labeling framework for Entity Alignment (UPL-EA). UPL-EA explicitly eliminates pseudo-labeling errors to boost the accuracy of entity alignment. Our results and in-depth analyses demonstrate the superiority of UPL-EA over 15 competitive baselines.
arXiv Detail & Related papers (2023-07-05T07:32:34Z)
- Robust Target Training for Multi-Source Domain Adaptation [110.77704026569499]
We propose a novel Bi-level Optimization based Robust Target Training (BORT$^2$) method for multi-source domain adaptation (MSDA).
Our proposed method achieves state-of-the-art performance on three MSDA benchmarks, including the large-scale DomainNet dataset.
arXiv Detail & Related papers (2022-10-04T15:20:01Z)
- FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition [90.34177266618143]
We propose FastCorrect, a novel NAR error correction model based on edit alignment.
FastCorrect speeds up the inference by 6-9 times and maintains the accuracy (8-14% WER reduction) compared with the autoregressive correction model.
It outperforms the accuracy of popular NAR models adopted in neural machine translation by a large margin.
arXiv Detail & Related papers (2021-05-09T05:35:36Z)
- Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching [60.8427677151492]
We propose CMatch, a Character-level distribution matching method to perform fine-grained adaptation between each character in two domains.
Experiments on the Libri-Adapt dataset show that our proposed approach achieves 14.39% and 16.50% relative Word Error Rate (WER) reduction on both cross-device and cross-environment ASR.
arXiv Detail & Related papers (2021-04-15T14:36:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.