SoftCorrect: Error Correction with Soft Detection for Automatic Speech
Recognition
- URL: http://arxiv.org/abs/2212.01039v2
- Date: Wed, 20 Dec 2023 15:00:43 GMT
- Title: SoftCorrect: Error Correction with Soft Detection for Automatic Speech
Recognition
- Authors: Yichong Leng, Xu Tan, Wenjie Liu, Kaitao Song, Rui Wang, Xiang-Yang
Li, Tao Qin, Edward Lin, Tie-Yan Liu
- Abstract summary: We propose SoftCorrect with a soft error detection mechanism to avoid the limitations of both explicit and implicit error detection.
Compared with implicit error detection with CTC loss, SoftCorrect provides explicit signal about which words are incorrect.
Experiments on AISHELL-1 and Aidatatang datasets show that SoftCorrect achieves 26.1% and 9.4% CER reduction respectively.
- Score: 116.31926128970585
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Error correction in automatic speech recognition (ASR) aims to correct those
incorrect words in sentences generated by ASR models. Since recent ASR models
usually have low word error rate (WER), to avoid affecting originally correct
tokens, error correction models should only modify incorrect words, and
therefore detecting incorrect words is important for error correction. Previous
works on error correction either implicitly detect error words through
target-source attention or CTC (connectionist temporal classification) loss, or
explicitly locate specific deletion/substitution/insertion errors. However,
implicit error detection does not provide clear signal about which tokens are
incorrect and explicit error detection suffers from low detection accuracy. In
this paper, we propose SoftCorrect with a soft error detection mechanism to
avoid the limitations of both explicit and implicit error detection.
Specifically, we first detect whether a token is correct or not through a
probability produced by a dedicatedly designed language model, and then design
a constrained CTC loss that only duplicates the detected incorrect tokens to
let the decoder focus on the correction of error tokens. Compared with implicit
error detection with CTC loss, SoftCorrect provides explicit signal about which
words are incorrect and thus does not need to duplicate every token but only
incorrect tokens; compared with explicit error detection, SoftCorrect does not
detect specific deletion/substitution/insertion errors but just leaves it to
CTC loss. Experiments on AISHELL-1 and Aidatatang datasets show that
SoftCorrect achieves 26.1% and 9.4% CER reduction respectively, outperforming
previous works by a large margin, while still enjoying fast speed of parallel
generation.
Related papers
- A Coin Has Two Sides: A Novel Detector-Corrector Framework for Chinese Spelling Correction [79.52464132360618]
Chinese Spelling Correction (CSC) stands as a foundational Natural Language Processing (NLP) task.
We introduce a novel approach based on error detector-corrector framework.
Our detector is designed to yield two error detection results, each characterized by high precision and recall.
arXiv Detail & Related papers (2024-09-06T09:26:45Z) - Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition [52.624909026294105]
We propose a non-autoregressive speech error correction method.
A Confidence Module measures the uncertainty of each word of the N-best ASR hypotheses.
The proposed system reduces the error rate by 21% compared with the ASR model.
arXiv Detail & Related papers (2024-06-29T17:56:28Z) - An Error-Guided Correction Model for Chinese Spelling Error Correction [13.56600372085612]
We propose an error-guided correction model (EGCM) to improve Chinese spelling correction.
Our model achieves superior performance against state-of-the-art approaches by a remarkable margin.
arXiv Detail & Related papers (2023-01-16T09:27:45Z) - Mask the Correct Tokens: An Embarrassingly Simple Approach for Error
Correction [38.463639262607174]
Previous error correction methods usually take the source (incorrect) sentence as encoder input and generate the target (correct) sentence through the decoder.
We propose a simple yet effective masking strategy to achieve this goal.
arXiv Detail & Related papers (2022-11-23T19:05:48Z) - ASR Error Correction with Constrained Decoding on Operation Prediction [8.701142327932484]
We propose an ASR error correction method utilizing the predictions of correction operations.
Experiments on three public datasets demonstrate the effectiveness of the proposed approach in reducing the latency of the decoding process.
arXiv Detail & Related papers (2022-08-09T09:59:30Z) - FastCorrect 2: Fast Error Correction on Multiple Candidates for
Automatic Speech Recognition [92.12910821300034]
We propose FastCorrect 2, an error correction model that takes multiple ASR candidates as input for better correction accuracy.
FastCorrect 2 achieves better performance than the cascaded re-scoring and correction pipeline.
arXiv Detail & Related papers (2021-09-29T13:48:03Z) - FastCorrect: Fast Error Correction with Edit Alignment for Automatic
Speech Recognition [90.34177266618143]
We propose FastCorrect, a novel NAR error correction model based on edit alignment.
FastCorrect speeds up the inference by 6-9 times and maintains the accuracy (8-14% WER reduction) compared with the autoregressive correction model.
It outperforms the accuracy of popular NAR models adopted in neural machine translation by a large margin.
arXiv Detail & Related papers (2021-05-09T05:35:36Z) - On the Robustness of Language Encoders against Grammatical Errors [66.05648604987479]
We collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data.
Results confirm that the performance of all tested models is affected but the degree of impact varies.
arXiv Detail & Related papers (2020-05-12T11:01:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.