Improving the Efficiency of Grammatical Error Correction with Erroneous
Span Detection and Correction
- URL: http://arxiv.org/abs/2010.03260v1
- Date: Wed, 7 Oct 2020 08:29:11 GMT
- Title: Improving the Efficiency of Grammatical Error Correction with Erroneous
Span Detection and Correction
- Authors: Mengyun Chen, Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou
- Abstract summary: We propose a novel language-independent approach to improve the efficiency for Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC).
ESD identifies grammatically incorrect text spans with an efficient sequence tagging model. ESC leverages a seq2seq model to take the sentence with annotated erroneous spans as input and only outputs the corrected text for these spans.
Experiments show our approach performs comparably to conventional seq2seq approaches in both English and Chinese GEC benchmarks with less than 50% time cost for inference.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel language-independent approach to improve the efficiency
for Grammatical Error Correction (GEC) by dividing the task into two subtasks:
Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC). ESD
identifies grammatically incorrect text spans with an efficient sequence
tagging model. Then, ESC leverages a seq2seq model to take the sentence with
annotated erroneous spans as input and only outputs the corrected text for
these spans. Experiments show our approach performs comparably to conventional
seq2seq approaches in both English and Chinese GEC benchmarks with less than
50% time cost for inference.
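
The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the binary per-token tags stand in for the output of an ESD tagger, the hand-written corrections stand in for the output of an ESC seq2seq model, and the `<s1> ... </s1>` span markers and all function names are hypothetical choices made for this sketch.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token range [start, end)


def tags_to_spans(tags: List[int]) -> List[Span]:
    """Group consecutive 1-tags (tokens flagged as erroneous) into spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == 1 and start is None:
            start = i
        elif tag == 0 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def annotate(tokens: List[str], spans: List[Span]) -> List[str]:
    """Wrap each detected span in markers (hypothetical format) for the ESC input."""
    out: List[str] = []
    prev = 0
    for k, (start, end) in enumerate(spans, 1):
        out.extend(tokens[prev:start])
        out.append(f"<s{k}>")
        out.extend(tokens[start:end])
        out.append(f"</s{k}>")
        prev = end
    out.extend(tokens[prev:])
    return out


def merge(tokens: List[str], spans: List[Span],
          corrections: List[List[str]]) -> List[str]:
    """Splice per-span corrections into the source; other tokens are copied as-is."""
    out: List[str] = []
    prev = 0
    for (start, end), fix in zip(spans, corrections):
        out.extend(tokens[prev:start])
        out.extend(fix)
        prev = end
    out.extend(tokens[prev:])
    return out


tokens = "She go to school every days .".split()
tags = [0, 1, 0, 0, 0, 1, 0]        # assumed ESD output: 1 marks an erroneous token
spans = tags_to_spans(tags)          # [(1, 2), (5, 6)]
esc_input = annotate(tokens, spans)
corrections = [["goes"], ["day"]]    # assumed ESC output: one correction per span
print(" ".join(esc_input))  # She <s1> go </s1> to school every <s2> days </s2> .
print(" ".join(merge(tokens, spans, corrections)))  # She goes to school every day .
```

In this sketch the second stage produces only two tokens instead of regenerating the whole seven-token sentence, which is the intuition behind the reported reduction in inference time.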
Related papers
- EdaCSC: Two Easy Data Augmentation Methods for Chinese Spelling Correction [0.0]
Chinese Spelling Correction (CSC) aims to detect and correct spelling errors in Chinese sentences caused by phonetic or visual similarities.
We propose two data augmentation methods to address these limitations.
Firstly, we augment the dataset by either splitting long sentences into shorter ones or reducing typos in sentences with multiple typos.
arXiv Detail & Related papers (2024-09-08T14:29:10Z)
- A Coin Has Two Sides: A Novel Detector-Corrector Framework for Chinese Spelling Correction [79.52464132360618]
Chinese Spelling Correction (CSC) stands as a foundational Natural Language Processing (NLP) task.
We introduce a novel approach based on an error detector-corrector framework.
Our detector is designed to yield two error detection results, each characterized by high precision and recall.
arXiv Detail & Related papers (2024-09-06T09:26:45Z)
- LM-Combiner: A Contextual Rewriting Model for Chinese Grammatical Error Correction [49.0746090186582]
Over-correction is a critical problem in the Chinese grammatical error correction (CGEC) task.
Recent work using model ensemble methods can effectively mitigate over-correction and improve the precision of the GEC system.
We propose the LM-Combiner, a rewriting model that can directly modify the over-correction of GEC system outputs without a model ensemble.
arXiv Detail & Related papers (2024-03-26T06:12:21Z)
- Improving Seq2Seq Grammatical Error Correction via Decoding Interventions [40.52259641181596]
We propose a unified decoding intervention framework that employs an external critic to assess the appropriateness of the token to be generated incrementally.
We discover and investigate two types of critics: a pre-trained left-to-right language model critic and an incremental target-side grammatical error detector critic.
Our framework consistently outperforms strong baselines and achieves results competitive with state-of-the-art methods.
arXiv Detail & Related papers (2023-10-23T03:36:37Z)
- Chinese Spelling Correction as Rephrasing Language Model [63.65217759957206]
We study Chinese Spelling Correction (CSC), which aims to detect and correct the potential spelling errors in a given sentence.
Current state-of-the-art methods regard CSC as a sequence tagging task and fine-tune BERT-based models on sentence pairs.
We propose Rephrasing Language Model (ReLM), where the model is trained to rephrase the entire sentence by infilling additional slots, instead of character-to-character tagging.
arXiv Detail & Related papers (2023-08-17T06:04:28Z)
- From Spelling to Grammar: A New Framework for Chinese Grammatical Error Correction [12.170714706174314]
Chinese Grammatical Error Correction (CGEC) aims to generate a correct sentence from an erroneous sequence.
This paper divides the CGEC task into two steps, namely spelling error correction and grammatical error correction.
We propose a novel zero-shot approach for spelling error correction, which is simple but effective.
To handle grammatical error correction, we design part-of-speech features and semantic class features to enhance the neural network model.
arXiv Detail & Related papers (2022-11-03T07:30:09Z)
- A Syntax-Guided Grammatical Error Correction Model with Dependency Tree Correction [83.14159143179269]
Grammatical Error Correction (GEC) is a task of detecting and correcting grammatical errors in sentences.
We propose a syntax-guided GEC model (SG-GEC) which adopts the graph attention mechanism to utilize the syntactic knowledge of dependency trees.
We evaluate our model on public benchmarks for the GEC task, where it achieves competitive results.
arXiv Detail & Related papers (2021-11-05T07:07:48Z)
- FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition [92.12910821300034]
We propose FastCorrect 2, an error correction model that takes multiple ASR candidates as input for better correction accuracy.
FastCorrect 2 achieves better performance than the cascaded re-scoring and correction pipeline.
arXiv Detail & Related papers (2021-09-29T13:48:03Z)