Type-Driven Multi-Turn Corrections for Grammatical Error Correction
- URL: http://arxiv.org/abs/2203.09136v1
- Date: Thu, 17 Mar 2022 07:30:05 GMT
- Title: Type-Driven Multi-Turn Corrections for Grammatical Error Correction
- Authors: Shaopeng Lai, Qingyu Zhou, Jiali Zeng, Zhongli Li, Chao Li, Yunbo Cao,
Jinsong Su
- Abstract summary: Grammatical Error Correction (GEC) aims to automatically detect and correct grammatical errors.
Previous studies mainly focus on the data augmentation approach to combat the exposure bias.
We propose a Type-Driven Multi-Turn Corrections approach for GEC.
- Score: 46.34114495164071
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Grammatical Error Correction (GEC) aims to automatically detect and correct
grammatical errors. Dominant models are trained with one-iteration learning yet
perform multiple iterations of correction
during inference. Previous studies mainly focus on the data augmentation
approach to combat the exposure bias, which suffers from two drawbacks. First,
they simply mix the additionally constructed training instances with the
original ones to train models, which fails to make models explicitly aware of
the gradual correction procedure. Second, they ignore the interdependence
between different types of corrections. In this paper, we propose a Type-Driven
Multi-Turn Corrections approach for GEC. Using this approach, from each
training instance, we additionally construct multiple training instances, each
of which involves correcting a specific type of error. Then, we use
these additionally-constructed training instances and the original one to train
the model in turn. Experimental results and in-depth analysis show that our
approach significantly benefits model training. In particular, our enhanced
model achieves state-of-the-art single-model performance on English GEC
benchmarks. We release our code on GitHub.
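The abstract only describes the instance-construction procedure at a high level, so the following is a minimal, hypothetical sketch of how type-driven multi-turn training instances could be assembled from a single annotated source-target pair. The Edit representation, the ERRANT-style type labels, the fixed type ordering, and every function name are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only: one plausible way to build type-driven
# multi-turn training instances from a single (source, edits) annotation.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Edit:
    start: int               # token index where the edit begins (inclusive)
    end: int                 # token index where the edit ends (exclusive)
    replacement: List[str]   # corrected tokens ([] means deletion)
    error_type: str          # e.g. an ERRANT-style label such as "VERB:TENSE"

def apply_edits(tokens: List[str], edits: List[Edit]) -> List[str]:
    """Apply non-overlapping edits right-to-left so earlier indices stay valid."""
    out = list(tokens)
    for e in sorted(edits, key=lambda e: e.start, reverse=True):
        out[e.start:e.end] = e.replacement
    return out

def build_multi_turn_instances(
    source: List[str],
    edits: List[Edit],
    type_order: List[str],
) -> List[Tuple[List[str], List[str]]]:
    """Return (input, output) pairs, each correcting one more error type."""
    instances = []
    applied: List[Edit] = []
    current = list(source)
    for t in type_order:
        batch = [e for e in edits if e.error_type == t]
        if not batch:
            continue
        corrected = apply_edits(source, applied + batch)
        instances.append((current, corrected))
        applied += batch
        current = corrected
    # plus the original one-shot instance: full source -> fully corrected target
    instances.append((list(source), apply_edits(source, edits)))
    return instances

# toy usage with a hypothetical annotation
src = "she go to school yesterday .".split()
edits = [Edit(0, 1, ["She"], "ORTH"), Edit(1, 2, ["went"], "VERB:TENSE")]
for inp, out in build_multi_turn_instances(src, edits, ["ORTH", "VERB:TENSE"]):
    print(" ".join(inp), "->", " ".join(out))
```

Each emitted pair corrects exactly one additional error type, and the final pair reproduces the original one-shot instance, matching the "train the model in turn" description in the abstract.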
Related papers
- Efficient and Interpretable Grammatical Error Correction with Mixture of Experts [33.748193858033346]
We propose a mixture-of-experts model, MoECE, for grammatical error correction.
Our model successfully achieves the performance of T5-XL with three times fewer effective parameters.
arXiv Detail & Related papers (2024-10-30T23:27:54Z)
- Self-calibration for Language Model Quantization and Pruning [38.00221764773372]
Quantization and pruning are fundamental approaches for model compression.
In a post-training setting, state-of-the-art quantization and pruning methods require calibration data.
We propose self-calibration as a solution.
arXiv Detail & Related papers (2024-10-22T16:50:00Z)
- Subtle Errors Matter: Preference Learning via Error-injected Self-editing [59.405145971637204]
We propose a novel preference learning framework called eRror-Injected Self-Editing (RISE).
RISE injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation.
Experiments validate the effectiveness of RISE, with preference learning on Qwen2-7B-Instruct yielding notable improvements of 3.0% on GSM8K and 7.9% on MATH.
arXiv Detail & Related papers (2024-10-09T07:43:38Z)
- Training Language Models to Self-Correct via Reinforcement Learning [98.35197671595343]
Self-correction has been found to be largely ineffective in modern large language models (LLMs).
We develop a multi-turn online reinforcement learning approach, SCoRe, that significantly improves an LLM's self-correction ability using entirely self-generated data.
We find that SCoRe achieves state-of-the-art self-correction performance, improving the base models' self-correction by 15.6% and 9.1% respectively on MATH and HumanEval.
arXiv Detail & Related papers (2024-09-19T17:16:21Z)
- Chinese Spelling Correction as Rephrasing Language Model [63.65217759957206]
We study Chinese Spelling Correction (CSC), which aims to detect and correct potential spelling errors in a given sentence.
Current state-of-the-art methods regard CSC as a sequence tagging task and fine-tune BERT-based models on sentence pairs.
We propose Rephrasing Language Model (ReLM), where the model is trained to rephrase the entire sentence by infilling additional slots, instead of character-to-character tagging.
arXiv Detail & Related papers (2023-08-17T06:04:28Z)
- Rethinking Masked Language Modeling for Chinese Spelling Correction [70.85829000570203]
We study Chinese Spelling Correction (CSC) as a joint decision made by two separate models: a language model and an error model.
We find that fine-tuning BERT tends to over-fit the error model while under-fitting the language model, resulting in poor generalization to out-of-distribution error patterns.
We demonstrate that a very simple strategy, randomly masking 20% of the non-error tokens in the input sequence during fine-tuning, is sufficient for learning a much better language model without sacrificing the error model (see the sketch after this list).
arXiv Detail & Related papers (2023-05-28T13:19:12Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method that continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
- Grammatical Error Correction as GAN-like Sequence Labeling [45.19453732703053]
We propose a GAN-like sequence labeling model, which consists of a grammatical error detector as a discriminator and a grammatical error labeler with Gumbel-Softmax sampling as a generator.
Our results on several evaluation benchmarks demonstrate that our proposed approach is effective and improves the previous state-of-the-art baseline.
arXiv Detail & Related papers (2021-05-29T04:39:40Z)
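As referenced in the "Rethinking Masked Language Modeling for Chinese Spelling Correction" entry above, the reported strategy is to randomly mask 20% of the non-error input tokens during fine-tuning. The sketch below is a hedged illustration of that idea under the usual CSC assumption of equal-length source and target sentences; the mask symbol, tokenization, and function names are assumptions, not the paper's code.

```python
# Minimal sketch of random masking of non-error tokens for CSC fine-tuning.
import random
from typing import List, Optional

def mask_non_error_tokens(
    src_tokens: List[str],
    tgt_tokens: List[str],
    mask_token: str = "[MASK]",
    mask_ratio: float = 0.20,
    seed: Optional[int] = None,
) -> List[str]:
    """Replace ~mask_ratio of the correct (non-error) source tokens with
    mask_token; error positions are left untouched so the error model
    still sees them during fine-tuning."""
    assert len(src_tokens) == len(tgt_tokens), "CSC assumes equal lengths"
    rng = random.Random(seed)
    non_error = [i for i, (s, t) in enumerate(zip(src_tokens, tgt_tokens)) if s == t]
    num_to_mask = int(len(non_error) * mask_ratio)
    masked_positions = set(rng.sample(non_error, num_to_mask))
    return [mask_token if i in masked_positions else tok
            for i, tok in enumerate(src_tokens)]

# toy usage: "天气" is misspelled as "天汽" in the source sentence
src = list("今天天汽很好")
tgt = list("今天天气很好")
print(mask_non_error_tokens(src, tgt, seed=0))
```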
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.