HTEC: Human Transcription Error Correction
- URL: http://arxiv.org/abs/2309.10089v1
- Date: Mon, 18 Sep 2023 19:03:21 GMT
- Title: HTEC: Human Transcription Error Correction
- Authors: Hanbo Sun, Jian Gao, Xiaomin Wu, Anjie Fang, Cheng Cao, Zheng Du
- Abstract summary: High-quality human transcription is essential for training and improving Automatic Speech Recognition (ASR) models.
We propose HTEC for Human Transcription Error Correction.
HTEC consists of two stages: Trans-Checker, an error detection model that predicts and masks erroneous words, and Trans-Filler, a sequence-to-sequence generative model that fills masked positions.
- Score: 4.241671683889168
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: High-quality human transcription is essential for training and improving
Automatic Speech Recognition (ASR) models. A recent study~\cite{libricrowd}
found that every 1% increase in transcription Word Error Rate (WER) raises the
WER of ASR models trained on those transcriptions by approximately 2%.
Transcription errors are inevitable, even for highly trained annotators.
However, few studies have explored human transcription correction. Error
correction methods developed for related problems, such as ASR error correction
and grammatical error correction, do not perform sufficiently well on this task.
Therefore, we propose HTEC for Human Transcription Error Correction. HTEC
consists of two stages: Trans-Checker, an error detection model that predicts
and masks erroneous words, and Trans-Filler, a sequence-to-sequence generative
model that fills masked positions. We propose a holistic list of correction
operations, including four novel operations handling deletion errors. We
further propose a variant of embeddings that incorporates phoneme information
into the input of the transformer. HTEC outperforms other methods by a large
margin and surpasses human annotators by 2.2% to 4.5% in WER. Finally, we
deployed HTEC to assist human annotators and found it particularly effective
as a co-pilot, improving transcription quality by 15.1% without sacrificing
transcription velocity.
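The two-stage design described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the vocabulary-based detector, the bigram candidate scoring, and the `<mask>` token are all invented placeholders standing in for Trans-Checker (a trained error detection model) and Trans-Filler (a seq2seq generative model).

```python
# Toy sketch of the HTEC two-stage idea: stage 1 flags suspicious words and
# replaces them with a mask token; stage 2 fills each masked position from a
# candidate table. All data and scoring rules here are hypothetical.

MASK = "<mask>"

def trans_checker(tokens, vocab):
    """Stage 1 (detection): mask any token not in the trusted vocabulary."""
    return [t if t.lower() in vocab else MASK for t in tokens]

def trans_filler(tokens, candidates):
    """Stage 2 (infilling): fill each mask with the best-scoring candidate."""
    out = []
    for i, t in enumerate(tokens):
        if t == MASK:
            left = tokens[i - 1] if i > 0 else "<s>"
            # Score candidates by how often they follow the left neighbor
            # (placeholder for a real seq2seq model's conditional scores).
            best = max(candidates, key=lambda c: candidates[c].get(left, 0))
            out.append(best)
        else:
            out.append(t)
    return out

vocab = {"the", "cat", "sat", "on", "mat"}
candidates = {"the": {"on": 2}, "cat": {"the": 3}}

tokens = "the kat sat on teh mat".split()
masked = trans_checker(tokens, vocab)   # "kat" and "teh" get masked
filled = trans_filler(masked, candidates)
```

In the real system, Trans-Checker is a transformer tagger (with phoneme-aware input embeddings, per the abstract) and Trans-Filler conditions on the full masked sentence rather than a single left neighbor; the split itself, detect-then-fill, is what the sketch shows.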
Related papers
- A Coin Has Two Sides: A Novel Detector-Corrector Framework for Chinese Spelling Correction [79.52464132360618]
Chinese Spelling Correction (CSC) stands as a foundational Natural Language Processing (NLP) task.
We introduce a novel approach based on error detector-corrector framework.
Our detector is designed to yield two error detection results, each characterized by high precision and recall.
arXiv Detail & Related papers (2024-09-06T09:26:45Z)
- Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition [52.624909026294105]
We propose a non-autoregressive speech error correction method.
A Confidence Module measures the uncertainty of each word of the N-best ASR hypotheses.
The proposed system reduces the error rate by 21% compared with the ASR model.
arXiv Detail & Related papers (2024-06-29T17:56:28Z)
- Generative error correction for code-switching speech recognition using large language models [49.06203730433107]
Code-switching (CS) speech refers to the phenomenon of mixing two or more languages within the same sentence.
We propose to leverage large language models (LLMs) and lists of hypotheses generated by an ASR to address the CS problem.
arXiv Detail & Related papers (2023-10-17T14:49:48Z)
- Human Transcription Quality Improvement [2.24166568188073]
We introduce two mechanisms to improve transcription quality: confidence estimation based reprocessing at labeling stage, and automatic word error correction at post-labeling stage.
We collect and release LibriCrowd - a large-scale crowdsourced dataset of audio transcriptions on 100 hours of English speech.
arXiv Detail & Related papers (2023-09-24T03:39:43Z)
- Chinese Spelling Correction as Rephrasing Language Model [63.65217759957206]
We study Chinese Spelling Correction (CSC), which aims to detect and correct the potential spelling errors in a given sentence.
Current state-of-the-art methods regard CSC as a sequence tagging task and fine-tune BERT-based models on sentence pairs.
We propose Rephrasing Language Model (ReLM), where the model is trained to rephrase the entire sentence by infilling additional slots, instead of character-to-character tagging.
arXiv Detail & Related papers (2023-08-17T06:04:28Z)
- Correct Like Humans: Progressive Learning Framework for Chinese Text Error Correction [28.25789161365667]
Chinese Text Error Correction (CTEC) aims to detect and correct errors in the input text.
Recent approaches mainly employ Pre-trained Language Models (PLMs) to resolve CTEC.
We propose a novel model-agnostic progressive learning framework, named ProTEC, which guides PLMs-based CTEC models to learn to correct like humans.
arXiv Detail & Related papers (2023-06-30T07:44:49Z)
- ASR Error Detection via Audio-Transcript entailment [1.3750624267664155]
We propose an end-to-end approach for ASR error detection using audio-transcript entailment.
The proposed model utilizes an acoustic encoder and a linguistic encoder to model the speech and transcript respectively.
Our proposed model achieves classification error rates (CER) of 26.2% on all transcription errors and 23% on medical errors specifically, leading to improvements upon a strong baseline by 12% and 15.4%, respectively.
arXiv Detail & Related papers (2022-07-22T02:47:15Z)
- Automatic Correction of Human Translations [8.137198664755598]
We introduce translation error correction (TEC), the task of automatically correcting human-generated translations.
We show that human errors in TEC span a more diverse range of error types, and include far fewer translation errors, than the MT errors in automatic post-editing datasets.
arXiv Detail & Related papers (2022-06-17T07:30:55Z)
- Improving Translation Robustness with Visual Cues and Error Correction [58.97421756225425]
We introduce the idea of visual context to improve translation robustness against noisy texts.
We also propose a novel error correction training regime by treating error correction as an auxiliary task.
arXiv Detail & Related papers (2021-03-12T15:31:34Z)
- Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction [106.63733511672721]
We propose a novel language-independent approach to improve the efficiency of Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC).
ESD identifies grammatically incorrect text spans with an efficient sequence tagging model. ESC leverages a seq2seq model to take the sentence with annotated erroneous spans as input and only outputs the corrected text for these spans.
Experiments show our approach performs comparably to conventional seq2seq approaches on both English and Chinese GEC benchmarks at less than 50% of the inference time cost.
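The ESD/ESC split described in this blurb can be sketched with a small toy example. This is not the paper's implementation: here ESD is a lookup against a hand-made error set rather than a trained sequence tagger, and ESC is a lookup table rather than a seq2seq model; only the pipeline shape, detect spans then rewrite only those spans, follows the description.

```python
# Toy sketch of the ESD/ESC pipeline (invented rules, not the paper's models):
# ESD flags contiguous erroneous spans; ESC rewrites only the flagged spans,
# so the rest of the sentence is copied through untouched.

def esd(tokens, known_errors):
    """Erroneous Span Detection: return (start, end) spans of flagged tokens."""
    spans, i = [], 0
    while i < len(tokens):
        if tokens[i] in known_errors:
            j = i
            while j < len(tokens) and tokens[j] in known_errors:
                j += 1
            spans.append((i, j))
            i = j
        else:
            i += 1
    return spans

def esc(tokens, spans, corrections):
    """Erroneous Span Correction: rewrite only the flagged spans."""
    out, prev = [], 0
    for start, end in spans:
        out.extend(tokens[prev:start])
        out.extend(corrections[" ".join(tokens[start:end])].split())
        prev = end
    out.extend(tokens[prev:])
    return out

tokens = "she go to school yesterday".split()
spans = esd(tokens, known_errors={"go"})
fixed = esc(tokens, spans, corrections={"go": "went"})
```

Because ESC only generates text for the detected spans instead of re-decoding the whole sentence, the inference cost scales with the amount of error, which is the efficiency argument the blurb makes.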
arXiv Detail & Related papers (2020-10-07T08:29:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.