Related papers: End-to-End Spoken Grammatical Error Correction

URL: http://arxiv.org/abs/2506.18532v1
Date: Mon, 23 Jun 2025 11:40:04 GMT
Title: End-to-End Spoken Grammatical Error Correction
Authors: Mengjie Qian, Rao Ma, Stefano Bannò, Mark J. F. Gales, Kate M. Knill
Abstract summary: Grammatical Error Correction (GEC) and feedback play a vital role in supporting second language (L2) learners, educators, and examiners. While written GEC is well-established, spoken GEC (SGEC) poses additional challenges due to disfluencies, transcription errors, and the lack of structured input. This work examines an End-to-End (E2E) framework for SGEC and feedback generation, highlighting challenges and possible solutions.
Score: 33.116296120680296
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Grammatical Error Correction (GEC) and feedback play a vital role in supporting second language (L2) learners, educators, and examiners. While written GEC is well-established, spoken GEC (SGEC), aiming to provide feedback based on learners' speech, poses additional challenges due to disfluencies, transcription errors, and the lack of structured input. SGEC systems typically follow a cascaded pipeline consisting of Automatic Speech Recognition (ASR), disfluency detection, and GEC, making them vulnerable to error propagation across modules. This work examines an End-to-End (E2E) framework for SGEC and feedback generation, highlighting challenges and possible solutions when developing these systems. Cascaded, partial-cascaded and E2E architectures are compared, all built on the Whisper foundation model. A challenge for E2E systems is the scarcity of GEC-labeled spoken data. To address this, an automatic pseudo-labeling framework is examined, increasing the training data from 77 to over 2500 hours. To improve the accuracy of the SGEC system, additional contextual information, exploiting the ASR output, is investigated. Giving candidates feedback on their mistakes is an essential step to improving performance. In E2E systems, the SGEC output must be compared with an estimate of the fluent transcription to obtain the feedback. To improve the precision of this feedback, a novel reference alignment process is proposed that aims to remove hypothesised edits that result from fluent transcription errors. Finally, these approaches are combined with an edit confidence estimation approach, to exclude low-confidence edits. Experiments on the in-house Linguaskill (LNG) corpora and the publicly available Speak & Improve (S&I) corpus show that the proposed approaches significantly boost E2E SGEC performance.
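
The feedback step described in the abstract (comparing the SGEC output with a fluent transcription, then excluding low-confidence edits) can be illustrated with a minimal sketch. This is not the paper's method: the word-level alignment here uses Python's standard `difflib` as a stand-in, and the function names `extract_edits` and `filter_edits`, the per-edit confidence scores, and the 0.5 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def extract_edits(fluent, corrected):
    """Align two word sequences and return the non-matching spans.

    Each edit is (operation, source_words, target_words), where operation
    is one of difflib's tags: 'replace', 'delete' or 'insert'.
    """
    src, tgt = fluent.split(), corrected.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            edits.append((op, src[i1:i2], tgt[j1:j2]))
    return edits


def filter_edits(edits, confidences, threshold=0.5):
    """Keep only edits whose (hypothetical) confidence meets the threshold."""
    return [edit for edit, conf in zip(edits, confidences) if conf >= threshold]


# Hypothetical example: a fluent transcription and a corrected hypothesis.
edits = extract_edits("he go to school yesterday", "he went to school yesterday")
kept = filter_edits(edits, confidences=[0.9])
print(kept)
```

In this sketch the confidence scores are supplied by the caller; how such scores are estimated in the actual system is not stated in the abstract.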

Related papers

Scaling and Prompting for Improved End-to-End Spoken Grammatical Error Correction [33.116296120680296]
This work introduces a pseudo-labelling process to address the challenge of limited labelled data. We prompt an E2E Whisper-based SGEC model with fluent transcriptions, showing a slight improvement in SGEC performance. Finally, we assess the impact of increasing model size, revealing that while pseudo-labelled data does not yield performance gain for a larger Whisper model, training with prompts proves beneficial.
arXiv Detail & Related papers (2025-05-27T12:50:53Z)
Chinese Grammatical Error Correction: A Survey [2.6914312267666705]
Chinese Grammatical Error Correction (CGEC) is a critical task in Natural Language Processing. CGEC addresses the growing demand for automated writing assistance in both second-language (L2) and native (L1) Chinese writing. This survey provides a comprehensive review of CGEC research, covering datasets, annotation schemes, evaluation methodologies, and system advancements.
arXiv Detail & Related papers (2025-04-01T17:14:50Z)
Corrections Meet Explanations: A Unified Framework for Explainable Grammatical Error Correction [29.583603444317855]
We introduce EXGEC, a unified explainable GEC framework that integrates explanation and correction tasks in a generative manner. Results on various NLP models (BART, T5, and Llama3) show that EXGEC models surpass single-task baselines in both tasks.
arXiv Detail & Related papers (2025-02-21T07:42:33Z)
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation [73.9145653659403]
We show that Generative Error Correction models struggle to generalize beyond the specific types of errors encountered during training. We propose DARAG, a novel approach designed to improve GEC for ASR in in-domain (ID) and out-of-domain (OOD) scenarios. Our approach is simple, scalable, and both domain- and language-agnostic.
arXiv Detail & Related papers (2024-10-17T04:00:29Z)
Towards interfacing large language models with ASR systems using confidence measures and prompting [54.39667883394458]
This work investigates post-hoc correction of ASR transcripts with large language models (LLMs). To avoid introducing errors into likely accurate transcripts, we propose a range of confidence-based filtering methods. Our results indicate that this can improve the performance of less competitive ASR systems.
arXiv Detail & Related papers (2024-07-31T08:00:41Z)
Robust ASR Error Correction with Conservative Data Filtering [15.833428810891427]
Error correction (EC) based on large language models is an emerging technology to enhance the performance of automatic speech recognition (ASR) systems. We propose two fundamental criteria that EC training data should satisfy. We identify low-quality EC pairs and train the models not to make any correction in such cases.
arXiv Detail & Related papers (2024-07-18T09:05:49Z)
Grammatical Error Correction via Mixed-Grained Weighted Training [68.94921674855621]
Grammatical Error Correction (GEC) aims to automatically correct grammatical errors in natural texts. MainGEC designs token-level and sentence-level training weights based on inherent discrepancies in accuracy and potential diversity of data annotation.
arXiv Detail & Related papers (2023-11-23T08:34:37Z)
Towards End-to-End Spoken Grammatical Error Correction [33.116296120680296]
Spoken grammatical error correction (GEC) aims to supply feedback to L2 learners on their use of grammar when speaking. This paper introduces an alternative "end-to-end" approach to spoken GEC, exploiting a speech recognition foundation model, Whisper.
arXiv Detail & Related papers (2023-11-09T17:49:02Z)
RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation [64.2568239429946]
We introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems. We reveal that state-of-the-art GEC systems still lack sufficient robustness against context perturbations.
arXiv Detail & Related papers (2023-10-11T08:33:23Z)
Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction [106.63733511672721]
We propose a novel language-independent approach to improve the efficiency of Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC). ESD identifies grammatically incorrect text spans with an efficient sequence tagging model. ESC leverages a seq2seq model to take the sentence with annotated erroneous spans as input and only outputs the corrected text for these spans. Experiments show our approach performs comparably to conventional seq2seq approaches in both English and Chinese GEC benchmarks with less than 50% time cost for inference.
arXiv Detail & Related papers (2020-10-07T08:29:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
