System Combination via Quality Estimation for Grammatical Error Correction
- URL: http://arxiv.org/abs/2310.14947v1
- Date: Mon, 23 Oct 2023 13:46:49 GMT
- Title: System Combination via Quality Estimation for Grammatical Error Correction
- Authors: Muhammad Reza Qorib and Hwee Tou Ng
- Abstract summary: We propose GRECO, a new state-of-the-art quality estimation model that gives a better estimate of the quality of a corrected sentence.
We also propose three methods for utilizing GEC quality estimation models for system combination with varying generality.
- Score: 29.91720235173108
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quality estimation models have been developed to assess the corrections made
by grammatical error correction (GEC) models when the reference or
gold-standard corrections are not available. An ideal quality estimator can be
utilized to combine the outputs of multiple GEC systems by choosing the best
subset of edits from the union of all edits proposed by the GEC base systems.
However, we found that existing GEC quality estimation models are not good
enough at differentiating good corrections from bad ones, resulting in a low
F0.5 score when used for system combination. In this paper, we propose GRECO, a
new state-of-the-art quality estimation model that gives a better estimate of
the quality of a corrected sentence, as indicated by its higher correlation
with the F0.5 score of a corrected sentence. It results in a combined
GEC system with a higher F0.5 score. We also propose three methods for
utilizing GEC quality estimation models for system combination with varying
generality: model-agnostic, model-agnostic with voting bias, and
model-dependent method. The combined GEC system outperforms the state of the
art on the CoNLL-2014 test set and the BEA-2019 test set, achieving the highest
F0.5 scores published to date.
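The model-agnostic combination idea described above (choose the best subset of edits from the union of all base systems' edits, as judged by a quality estimation model) can be sketched as follows. Everything here is an illustrative assumption rather than the paper's implementation: the `(start, end, replacement)` edit representation, the exhaustive subset search, and especially the toy `quality` function, which scores against a reference only for demonstration. A real quality estimator such as GRECO scores a hypothesis without any reference.

```python
from difflib import SequenceMatcher
from itertools import combinations

def apply_edits(tokens, edits):
    """Apply non-overlapping (start, end, replacement) edits to a token list."""
    out, i = [], 0
    for start, end, repl in sorted(edits):
        out.extend(tokens[i:start])
        out.extend(repl)
        i = end
    out.extend(tokens[i:])
    return out

def f_beta(tp, fp, fn, beta=0.5):
    """F-beta from edit counts; GEC evaluation conventionally uses beta = 0.5,
    weighting precision twice as much as recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def combine_systems(tokens, edit_sets, quality):
    """Model-agnostic combination: among all non-overlapping subsets of the
    union of the base systems' edits, keep the one whose corrected sentence
    the quality model scores highest. Exhaustive search for clarity only;
    it is exponential, so a real system would prune or search greedily."""
    union = sorted({edit for edits in edit_sets for edit in edits})
    best_subset, best_score = (), quality(apply_edits(tokens, []))
    for k in range(1, len(union) + 1):
        for subset in combinations(union, k):
            # Skip subsets whose edit spans overlap (e.g. two competing
            # replacements for the same token).
            if any(subset[i][1] > subset[i + 1][0] for i in range(len(subset) - 1)):
                continue
            score = quality(apply_edits(tokens, list(subset)))
            if score > best_score:
                best_subset, best_score = subset, score
    return list(best_subset), apply_edits(tokens, list(best_subset))

# Toy demonstration: two hypothetical base systems correct "He go to school yesterday".
source = ["He", "go", "to", "school", "yesterday"]
system_a = {(1, 2, ("goes",))}
system_b = {(1, 2, ("went",)), (4, 5, ())}  # second edit deletes "yesterday"

# Stand-in quality estimator: similarity to a reference sentence.
reference = ["He", "went", "to", "school"]
quality = lambda hyp: SequenceMatcher(None, hyp, reference).ratio()

chosen, corrected = combine_systems(source, [system_a, system_b], quality)
```

The combiner should select the `went` replacement and the deletion from system B (yielding the reference exactly) while rejecting the overlapping `goes` edit from system A, illustrating how a sufficiently discriminative quality model lets the combined system mix edits across base systems.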
Related papers
- Classifier Ensemble for Efficient Uncertainty Calibration of Deep Neural Networks for Image Classification [1.0649605625763086]
We evaluate both accuracy and calibration metrics, focusing on Expected Calibration Error (ECE) and Maximum Calibration Error (MCE).
Our work compares different methods for building simple yet efficient classifier ensembles, including majority voting and several metamodel-based approaches.
arXiv Detail & Related papers (2025-01-17T10:16:18Z)
- DSGram: Dynamic Weighting Sub-Metrics for Grammatical Error Correction in the Era of Large Language Models [39.493913608472404]
Large language model (LLM)-based Grammatical Error Correction (GEC) models often produce corrections that diverge from provided gold references.
This discrepancy undermines the reliability of traditional reference-based evaluation metrics.
We propose a novel evaluation framework for GEC models, DSGram, integrating Semantic Coherence, Edit Level, and Fluency, and utilizing a dynamic weighting mechanism.
arXiv Detail & Related papers (2024-12-17T11:54:16Z)
- Efficient and Interpretable Grammatical Error Correction with Mixture of Experts [33.748193858033346]
We propose a mixture-of-experts model, MoECE, for grammatical error correction.
Our model successfully achieves the performance of T5-XL with three times fewer effective parameters.
arXiv Detail & Related papers (2024-10-30T23:27:54Z)
- LM-Combiner: A Contextual Rewriting Model for Chinese Grammatical Error Correction [49.0746090186582]
Over-correction is a critical problem in Chinese grammatical error correction (CGEC) task.
Recent work using model ensemble methods can effectively mitigate over-correction and improve the precision of the GEC system.
We propose the LM-Combiner, a rewriting model that can directly modify the over-correction of GEC system outputs without a model ensemble.
arXiv Detail & Related papers (2024-03-26T06:12:21Z)
- RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation [64.2568239429946]
We introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.
We reveal that state-of-the-art GEC systems still lack sufficient robustness against context perturbations.
arXiv Detail & Related papers (2023-10-11T08:33:23Z)
- Quality-Based Conditional Processing in Multi-Biometrics: Application to Sensor Interoperability [63.05238390013457]
We describe and evaluate the ATVS-UAM fusion approach submitted to the quality-based evaluation of the 2007 BioSecure Multimodal Evaluation Campaign.
Our approach is based on linear logistic regression, in which fused scores tend to be log-likelihood-ratios.
Results show that the proposed approach outperforms all the rule-based fusion schemes.
arXiv Detail & Related papers (2022-11-24T12:11:22Z)
- Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems [66.61691401921296]
This paper presents an investigation over several methods of score calibration for deep speaker embedding extractors.
An additional focus of this research is to estimate the impact of score normalization on the calibration performance of the system.
arXiv Detail & Related papers (2022-03-28T21:22:22Z)
- Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction [98.31440090585376]
Grammatical Error Correction (GEC) aims to correct writing errors and help language learners improve their writing skills.
Existing GEC models tend to produce spurious corrections or fail to detect many errors.
This paper presents the Neural Verification Network (VERNet) for GEC quality estimation with multiple hypotheses.
arXiv Detail & Related papers (2021-05-10T15:04:25Z)
- A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
These datasets contain a non-negligible amount of "noise" where errors were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
arXiv Detail & Related papers (2020-10-07T04:45:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.