Grammatical Error Correction via Mixed-Grained Weighted Training
- URL: http://arxiv.org/abs/2311.13848v1
- Date: Thu, 23 Nov 2023 08:34:37 GMT
- Title: Grammatical Error Correction via Mixed-Grained Weighted Training
- Authors: Jiahao Li, Quan Wang, Chiwei Zhu, Zhendong Mao, Yongdong Zhang
- Abstract summary: Grammatical Error Correction (GEC) aims to automatically correct grammatical errors in natural texts.
MainGEC designs token-level and sentence-level training weights based on two inherent discrepancies in the data: the accuracy of data annotation and the diversity of potential annotations.
- Score: 68.94921674855621
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of Grammatical Error Correction (GEC) aims to automatically correct grammatical errors in natural texts. Almost all previous work treats annotated training data equally, neglecting the inherent discrepancies in the data. In this paper, these discrepancies are characterized in two aspects: the accuracy of data annotation and the diversity of potential annotations. To this end, we propose MainGEC, which designs token-level and sentence-level training weights based on the inherent discrepancies in annotation accuracy and potential annotation diversity, respectively, and then conducts mixed-grained weighted training to improve training for GEC. Empirical evaluation shows that, in both the Seq2Seq and Seq2Edit manners, MainGEC achieves consistent and significant performance improvements on two benchmark datasets, demonstrating the effectiveness and superiority of mixed-grained weighted training. Further ablation experiments verify the effectiveness of the designed weights at both granularities in MainGEC.
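The abstract does not give the loss itself, but the stated design, per-token weights reflecting annotation accuracy mixed with per-sentence weights reflecting annotation diversity, maps naturally onto a doubly weighted cross-entropy. A minimal PyTorch sketch, assuming the weight tensors are already computed (how MainGEC derives them is not described here):

```python
import torch
import torch.nn.functional as F

def mixed_grained_weighted_loss(logits, targets, token_w, sent_w, pad_id=0):
    """Cross-entropy in which every token and every sentence carries a weight.

    logits:  (batch, seq_len, vocab) decoder outputs
    targets: (batch, seq_len) gold token ids
    token_w: (batch, seq_len) token-level weights (annotation accuracy)
    sent_w:  (batch,) sentence-level weights (annotation diversity)
    """
    # Per-token negative log-likelihood, kept at shape (batch, seq_len).
    nll = F.cross_entropy(
        logits.transpose(1, 2), targets, reduction="none", ignore_index=pad_id
    )
    mask = (targets != pad_id).float()
    # Mix the two granularities multiplicatively, then average over real tokens.
    weighted = nll * token_w * sent_w.unsqueeze(1) * mask
    return weighted.sum() / mask.sum().clamp(min=1.0)
```

Setting `token_w` and `sent_w` to all ones recovers standard maximum-likelihood training, which is what makes the weighting easy to ablate.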
Related papers
- Refining CART Models for Covariate Shift with Importance Weight [0.0]
This paper introduces an adaptation of Classification and Regression Trees (CART) that incorporates importance weighting to effectively address the distributional differences that arise under covariate shift.
We evaluate the effectiveness of this method through simulation studies and apply it to real-world medical data, showing significant improvements in predictive accuracy.
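As a rough illustration of how importance weighting enters a CART fit, here is a common density-ratio construction based on a probabilistic domain classifier; the summary does not say how the paper estimates its weights, so treat `covariate_shift_weights` and the scikit-learn wiring as illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def covariate_shift_weights(X_train, X_target):
    """Estimate p_target(x) / p_train(x) with a probabilistic domain classifier."""
    X = np.vstack([X_train, X_target])
    domain = np.r_[np.zeros(len(X_train)), np.ones(len(X_target))]
    clf = LogisticRegression(max_iter=1000).fit(X, domain)
    p = clf.predict_proba(X_train)[:, 1]
    # The odds ratio approximates the density ratio up to a constant factor.
    return p / np.clip(1.0 - p, 1e-6, None)

# Importance weights enter the CART fit through `sample_weight`:
#   w = covariate_shift_weights(X_train, X_target)
#   tree = DecisionTreeClassifier(max_depth=5).fit(X_train, y_train, sample_weight=w)
```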
arXiv Detail & Related papers (2024-10-28T12:53:23Z)
- Gradient Reweighting: Towards Imbalanced Class-Incremental Learning [8.438092346233054]
Class-Incremental Learning (CIL) trains a model to continually recognize new classes from non-stationary data.
A major challenge of CIL arises when it is applied to real-world data with non-uniform class distributions.
We show that this dual imbalance issue causes skewed gradient updates with biased weights in FC layers, thus inducing over/under-fitting and catastrophic forgetting in CIL.
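The summary describes correcting skewed gradient updates in the FC layers; as a simpler, related stand-in, the sketch below applies class-balanced weights to the loss (in the style of Cui et al.'s effective-number weighting, not necessarily the paper's scheme), which rescales each class's gradient contribution:

```python
import torch
import torch.nn.functional as F

def class_balanced_ce(logits, targets, class_counts, beta=0.999):
    """Cross-entropy with class-balanced weights (effective-number style).

    Rare classes get larger weights, so their gradient contribution in the
    final FC layer is no longer swamped by frequent classes.
    """
    counts = torch.as_tensor(class_counts, dtype=torch.float32).clamp(min=1.0)
    effective = (1.0 - beta ** counts) / (1.0 - beta)
    w = 1.0 / effective
    w = w / w.sum() * len(w)  # normalize so weights average to 1
    return F.cross_entropy(logits, targets, weight=w)
```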
arXiv Detail & Related papers (2024-02-28T18:08:03Z)
- Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences.
We formulate each task as a sequence-to-sequence problem and perform multi-task training.
We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
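A minimal sketch of the multi-task setup, with all interfaces assumed (a HuggingFace-style seq2seq model whose forward returns a `.loss` when `labels` are supplied, and a hypothetical `keep_tags` auxiliary target derived from the source/target alignment; the paper's exact auxiliary tasks may differ):

```python
def multi_task_loss(model, batch, aux_weight=0.5):
    """Combine the main correction loss with one auxiliary alignment loss.

    `keep_tags` marks which source tokens survive unchanged into the target,
    one plausible way to exploit the alignment between original and corrected
    sentences.
    """
    main = model(input_ids=batch["src"], labels=batch["tgt"]).loss
    aux = model(input_ids=batch["src"], labels=batch["keep_tags"]).loss
    return main + aux_weight * aux
```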
arXiv Detail & Related papers (2023-11-20T14:50:12Z)
- On the Validation of Gibbs Algorithms: Training Datasets, Test Datasets and their Aggregation [70.540936204654]
The dependence of the Gibbs algorithm (GA) on the training data is analytically characterized.
This description enables the development of explicit expressions involving the training errors and test errors of GAs trained with different datasets.
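For context, the Gibbs algorithm here refers to the usual Gibbs (tempered) posterior over model parameters; a standard form, in notation that may differ from the paper's:

```latex
% Gibbs measure: reference (prior) measure Q, inverse temperature \lambda > 0,
% empirical risk \hat{L}(\theta, z) of parameters \theta on dataset z.
P^{(\lambda)}_{\Theta \mid Z}(\mathrm{d}\theta \mid z)
  \propto \exp\!\bigl(-\lambda\,\hat{L}(\theta, z)\bigr)\, Q(\mathrm{d}\theta)
```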
arXiv Detail & Related papers (2023-06-21T16:51:50Z)
- On the Language Coverage Bias for Neural Machine Translation [81.81456880770762]
Language coverage bias is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.
By carefully designing experiments, we provide comprehensive analyses of the language coverage bias in the training data.
We propose two simple and effective approaches to alleviate the language coverage bias problem.
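The two approaches are not named in this summary; one simple remedy in this spirit is to tag each training pair by the origin of its target side, so the model can separate the two populations. A hypothetical sketch:

```python
def tag_by_origin(src, tgt, target_original):
    """Prepend an origin tag to the source so the model can condition on it.

    target_original: True if the target side is original text rather than a
    translation (translationese); <to>/<so> are hypothetical tag tokens.
    """
    tag = "<to>" if target_original else "<so>"
    return f"{tag} {src}", tgt
```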
arXiv Detail & Related papers (2021-06-07T01:55:34Z)
- Ensemble Distillation Approaches for Grammatical Error Correction [18.81579562876076]
Ensemble distillation (EnD) and ensemble distribution distillation (EnDD) have been proposed to compress an ensemble into a single model.
This paper examines the application of both distillation approaches to a sequence prediction task, grammatical error correction (GEC). GEC is, however, more challenging than the standard tasks investigated for distillation, as the prediction of any grammatical correction to a word depends strongly on both the input sequence and the generated output history for that word.
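A minimal sketch of sequence-level ensemble distillation as a per-step KL divergence between the student and the averaged teacher distributions; the temperature and the exact EnD/EnDD objectives in the paper may differ:

```python
import torch
import torch.nn.functional as F

def ensemble_distillation_loss(student_logits, teacher_logits_list, T=2.0):
    """Per-step KL between the student and the averaged teacher distributions.

    student_logits:      (batch, seq_len, vocab)
    teacher_logits_list: one (batch, seq_len, vocab) tensor per ensemble member
    """
    # Average the teachers' softened token distributions at every decoding
    # step, so the target stays conditioned on the same output history the
    # student sees.
    teacher_probs = torch.stack(
        [F.softmax(t / T, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    log_q = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_q, teacher_probs, reduction="batchmean") * (T * T)
```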
arXiv Detail & Related papers (2020-11-24T15:00:45Z)
- Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction [106.63733511672721]
We propose a novel language-independent approach to improve the efficiency of Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC).
ESD identifies grammatically incorrect text spans with an efficient sequence tagging model. ESC leverages a seq2seq model that takes the sentence with annotated erroneous spans as input and outputs corrected text only for those spans.
Experiments show our approach performs comparably to conventional seq2seq approaches on both English and Chinese GEC benchmarks at less than 50% of the inference time cost.
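The two-stage wiring can be sketched as follows; `span_detector` and `span_corrector` are stand-ins for the paper's sequence-tagging and seq2seq models, with only the pipeline logic shown:

```python
def correct_sentence(tokens, span_detector, span_corrector):
    """Two-stage GEC: detect erroneous spans, then rewrite only those spans."""
    # Stage 1 (ESD): tag tokens, then merge consecutive error tags into spans.
    tags = span_detector(tokens)  # e.g. [0, 0, 1, 1, 0] marks tokens 2-3
    spans, start = [], None
    for i, t in enumerate(tags + [0]):  # trailing 0 closes any open span
        if t and start is None:
            start = i
        elif not t and start is not None:
            spans.append((start, i))
            start = None
    # Stage 2 (ESC): rewrite the detected spans, copy everything else verbatim.
    out, prev = [], 0
    for s, e in spans:
        out += tokens[prev:s] + span_corrector(tokens, s, e)
        prev = e
    return out + tokens[prev:]
```

The efficiency gain comes from the copy step: the expensive seq2seq model only ever decodes the short erroneous spans, never the whole sentence.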
arXiv Detail & Related papers (2020-10-07T08:29:11Z)
- A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
These datasets, however, contain a non-negligible amount of "noise", with errors that were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
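A minimal sketch of consistency-based denoising, with `model` and the `agree` criterion left as assumptions since the summary does not fix them:

```python
def denoise_dataset(pairs, model, agree):
    """Keep an annotated correction only when the model reproduces it.

    `model(src)` returns the model's own correction and `agree` compares two
    corrections (e.g., exact match or edit overlap); where they disagree, the
    model output replaces the possibly-noisy gold target. Both callables are
    stand-ins for whatever the paper's consistency check actually is.
    """
    refined = []
    for src, tgt in pairs:
        pred = model(src)
        refined.append((src, tgt if agree(pred, tgt) else pred))
    return refined
```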
arXiv Detail & Related papers (2020-10-07T04:45:09Z)