Type-Driven Multi-Turn Corrections for Grammatical Error Correction
- URL: http://arxiv.org/abs/2203.09136v1
- Date: Thu, 17 Mar 2022 07:30:05 GMT
- Title: Type-Driven Multi-Turn Corrections for Grammatical Error Correction
- Authors: Shaopeng Lai, Qingyu Zhou, Jiali Zeng, Zhongli Li, Chao Li, Yunbo Cao,
Jinsong Su
- Abstract summary: Grammatical Error Correction (GEC) aims to automatically detect and correct grammatical errors.
Previous studies mainly focus on the data augmentation approach to combat the exposure bias.
We propose a Type-Driven Multi-Turn Corrections approach for GEC.
- Score: 46.34114495164071
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Grammatical Error Correction (GEC) aims to automatically detect and correct
grammatical errors. Dominant models are trained with one-iteration learning yet
perform multiple iterations of correction
during inference. Previous studies mainly focus on the data augmentation
approach to combat the exposure bias, which suffers from two drawbacks. First,
they simply mix the additionally constructed training instances with the
original ones to train models, which fails to make models explicitly aware of
the gradual correction procedure. Second, they ignore the interdependence
between different types of corrections. In this paper, we propose a Type-Driven
Multi-Turn Corrections approach for GEC. Using this approach, from each
training instance, we additionally construct multiple training instances, each
of which involves correcting a specific type of error. Then, we use
these additionally-constructed training instances and the original one to train
the model in turn. Experimental results and in-depth analysis show that our
approach significantly benefits model training. In particular, our enhanced
model achieves state-of-the-art single-model performance on English GEC
benchmarks. We release our code on GitHub.
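The abstract only describes the instance-construction procedure at a high level, so the following is a minimal, hypothetical sketch of how type-driven multi-turn training instances could be assembled from a single annotated source-target pair. The Edit representation, the ERRANT-style type labels, the fixed type ordering, and every function name are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only: one plausible way to build type-driven
# multi-turn training instances from a single (source, edits) annotation.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Edit:
    start: int               # token index where the edit begins (inclusive)
    end: int                 # token index where the edit ends (exclusive)
    replacement: List[str]   # corrected tokens ([] means deletion)
    error_type: str          # e.g. an ERRANT-style label such as "VERB:TENSE"

def apply_edits(tokens: List[str], edits: List[Edit]) -> List[str]:
    """Apply non-overlapping edits right-to-left so earlier indices stay valid."""
    out = list(tokens)
    for e in sorted(edits, key=lambda e: e.start, reverse=True):
        out[e.start:e.end] = e.replacement
    return out

def build_multi_turn_instances(
    source: List[str],
    edits: List[Edit],
    type_order: List[str],
) -> List[Tuple[List[str], List[str]]]:
    """Return (input, output) pairs, each correcting one more error type."""
    instances = []
    applied: List[Edit] = []
    current = list(source)
    for t in type_order:
        batch = [e for e in edits if e.error_type == t]
        if not batch:
            continue
        corrected = apply_edits(source, applied + batch)
        instances.append((current, corrected))
        applied += batch
        current = corrected
    # plus the original one-shot instance: full source -> fully corrected target
    instances.append((list(source), apply_edits(source, edits)))
    return instances

# toy usage with a hypothetical annotation
src = "she go to school yesterday .".split()
edits = [Edit(0, 1, ["She"], "ORTH"), Edit(1, 2, ["went"], "VERB:TENSE")]
for inp, out in build_multi_turn_instances(src, edits, ["ORTH", "VERB:TENSE"]):
    print(" ".join(inp), "->", " ".join(out))
```

Each emitted pair corrects exactly one additional error type, and the final pair reproduces the original one-shot instance, matching the "train the model in turn" description in the abstract.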
Related papers
- Efficient and Interpretable Grammatical Error Correction with Mixture of Experts [33.748193858033346]
We propose a mixture-of-experts model, MoECE, for grammatical error correction.
Our model successfully achieves the performance of T5-XL with three times fewer effective parameters.
arXiv Detail & Related papers (2024-10-30T23:27:54Z)
- Self-calibration for Language Model Quantization and Pruning [38.00221764773372]
Quantization and pruning are fundamental approaches for model compression.
In a post-training setting, state-of-the-art quantization and pruning methods require calibration data.
We propose self-calibration as a solution.
arXiv Detail & Related papers (2024-10-22T16:50:00Z)
- Subtle Errors Matter: Preference Learning via Error-injected Self-editing [59.405145971637204]
We propose a novel preference learning framework called eRror-Injected Self-Editing (RISE).
RISE injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation.
Experiments validate the effectiveness of RISE, with preference learning on Qwen2-7B-Instruct yielding notable improvements of 3.0% on GSM8K and 7.9% on MATH.
arXiv Detail & Related papers (2024-10-09T07:43:38Z)
- Training Language Models to Self-Correct via Reinforcement Learning [98.35197671595343]
Self-correction has been found to be largely ineffective in modern large language models (LLMs).
We develop a multi-turn online reinforcement learning approach, SCoRe, that significantly improves an LLM's self-correction ability using entirely self-generated data.
We find that SCoRe achieves state-of-the-art self-correction performance, improving the base models' self-correction by 15.6% and 9.1% respectively on MATH and HumanEval.
arXiv Detail & Related papers (2024-09-19T17:16:21Z)
- Chinese Spelling Correction as Rephrasing Language Model [63.65217759957206]
We study Chinese Spelling Correction (CSC), which aims to detect and correct potential spelling errors in a given sentence.
Current state-of-the-art methods regard CSC as a sequence tagging task and fine-tune BERT-based models on sentence pairs.
We propose Rephrasing Language Model (ReLM), where the model is trained to rephrase the entire sentence by infilling additional slots, instead of character-to-character tagging.
arXiv Detail & Related papers (2023-08-17T06:04:28Z)
- Rethinking Masked Language Modeling for Chinese Spelling Correction [70.85829000570203]
We study Chinese Spelling Correction (CSC) as a joint decision made by two separate models: a language model and an error model.
We find that fine-tuning BERT tends to over-fit the error model while under-fitting the language model, resulting in poor generalization to out-of-distribution error patterns.
We demonstrate that a very simple strategy, randomly masking 20% of the non-error tokens in the input sequence during fine-tuning, is sufficient for learning a much better language model without sacrificing the error model (see the sketch after this list).
arXiv Detail & Related papers (2023-05-28T13:19:12Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method that continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
- Grammatical Error Correction as GAN-like Sequence Labeling [45.19453732703053]
We propose a GAN-like sequence labeling model, which consists of a grammatical error detector as a discriminator and a grammatical error labeler with Gumbel-Softmax sampling as a generator.
Our results on several evaluation benchmarks demonstrate that our proposed approach is effective and improves the previous state-of-the-art baseline.
arXiv Detail & Related papers (2021-05-29T04:39:40Z)
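As referenced in the "Rethinking Masked Language Modeling for Chinese Spelling Correction" entry above, the reported strategy is to randomly mask 20% of the non-error input tokens during fine-tuning. The sketch below is a hedged illustration of that idea under the usual CSC assumption of equal-length source and target sentences; the mask symbol, tokenization, and function names are assumptions, not the paper's code.

```python
# Minimal sketch of random masking of non-error tokens for CSC fine-tuning.
import random
from typing import List, Optional

def mask_non_error_tokens(
    src_tokens: List[str],
    tgt_tokens: List[str],
    mask_token: str = "[MASK]",
    mask_ratio: float = 0.20,
    seed: Optional[int] = None,
) -> List[str]:
    """Replace ~mask_ratio of the correct (non-error) source tokens with
    mask_token; error positions are left untouched so the error model
    still sees them during fine-tuning."""
    assert len(src_tokens) == len(tgt_tokens), "CSC assumes equal lengths"
    rng = random.Random(seed)
    non_error = [i for i, (s, t) in enumerate(zip(src_tokens, tgt_tokens)) if s == t]
    num_to_mask = int(len(non_error) * mask_ratio)
    masked_positions = set(rng.sample(non_error, num_to_mask))
    return [mask_token if i in masked_positions else tok
            for i, tok in enumerate(src_tokens)]

# toy usage: "天气" is misspelled as "天汽" in the source sentence
src = list("今天天汽很好")
tgt = list("今天天气很好")
print(mask_non_error_tokens(src, tgt, seed=0))
```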
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.