System Combination via Quality Estimation for Grammatical Error Correction
- URL: http://arxiv.org/abs/2310.14947v1
- Date: Mon, 23 Oct 2023 13:46:49 GMT
- Title: System Combination via Quality Estimation for Grammatical Error Correction
- Authors: Muhammad Reza Qorib and Hwee Tou Ng
- Abstract summary: We propose GRECO, a new state-of-the-art quality estimation model that gives a better estimate of the quality of a corrected sentence.
We also propose three methods for utilizing GEC quality estimation models for system combination with varying generality.
- Score: 29.91720235173108
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quality estimation models have been developed to assess the corrections made
by grammatical error correction (GEC) models when the reference or
gold-standard corrections are not available. An ideal quality estimator can be
utilized to combine the outputs of multiple GEC systems by choosing the best
subset of edits from the union of all edits proposed by the GEC base systems.
However, we found that existing GEC quality estimation models are not good
enough at differentiating good corrections from bad ones, resulting in a low
F0.5 score when used for system combination. In this paper, we propose GRECO, a
new state-of-the-art quality estimation model that gives a better estimate of
the quality of a corrected sentence, as indicated by its higher correlation
with the F0.5 score of a corrected sentence. It results in a combined
GEC system with a higher F0.5 score. We also propose three methods for
utilizing GEC quality estimation models for system combination with varying
generality: model-agnostic, model-agnostic with voting bias, and
model-dependent method. The combined GEC system outperforms the state of the
art on the CoNLL-2014 test set and the BEA-2019 test set, achieving the highest
F0.5 scores published to date.
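The model-agnostic combination idea described above (choose the best subset of edits from the union of all base systems' edits, as judged by a quality estimation model) can be sketched as follows. Everything here is an illustrative assumption rather than the paper's implementation: the `(start, end, replacement)` edit representation, the exhaustive subset search, and especially the toy `quality` function, which scores against a reference only for demonstration. A real quality estimator such as GRECO scores a hypothesis without any reference.

```python
from difflib import SequenceMatcher
from itertools import combinations

def apply_edits(tokens, edits):
    """Apply non-overlapping (start, end, replacement) edits to a token list."""
    out, i = [], 0
    for start, end, repl in sorted(edits):
        out.extend(tokens[i:start])
        out.extend(repl)
        i = end
    out.extend(tokens[i:])
    return out

def f_beta(tp, fp, fn, beta=0.5):
    """F-beta from edit counts; GEC evaluation conventionally uses beta = 0.5,
    weighting precision twice as much as recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def combine_systems(tokens, edit_sets, quality):
    """Model-agnostic combination: among all non-overlapping subsets of the
    union of the base systems' edits, keep the one whose corrected sentence
    the quality model scores highest. Exhaustive search for clarity only;
    it is exponential, so a real system would prune or search greedily."""
    union = sorted({edit for edits in edit_sets for edit in edits})
    best_subset, best_score = (), quality(apply_edits(tokens, []))
    for k in range(1, len(union) + 1):
        for subset in combinations(union, k):
            # Skip subsets whose edit spans overlap (e.g. two competing
            # replacements for the same token).
            if any(subset[i][1] > subset[i + 1][0] for i in range(len(subset) - 1)):
                continue
            score = quality(apply_edits(tokens, list(subset)))
            if score > best_score:
                best_subset, best_score = subset, score
    return list(best_subset), apply_edits(tokens, list(best_subset))

# Toy demonstration: two hypothetical base systems correct "He go to school yesterday".
source = ["He", "go", "to", "school", "yesterday"]
system_a = {(1, 2, ("goes",))}
system_b = {(1, 2, ("went",)), (4, 5, ())}  # second edit deletes "yesterday"

# Stand-in quality estimator: similarity to a reference sentence.
reference = ["He", "went", "to", "school"]
quality = lambda hyp: SequenceMatcher(None, hyp, reference).ratio()

chosen, corrected = combine_systems(source, [system_a, system_b], quality)
```

The combiner should select the `went` replacement and the deletion from system B (yielding the reference exactly) while rejecting the overlapping `goes` edit from system A, illustrating how a sufficiently discriminative quality model lets the combined system mix edits across base systems.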
Related papers
- Classifier Ensemble for Efficient Uncertainty Calibration of Deep Neural Networks for Image Classification [1.0649605625763086]
We evaluate both accuracy and calibration metrics, focusing on Expected Calibration Error (ECE) and Maximum Calibration Error (MCE).
Our work compares different methods for building simple yet efficient classifier ensembles, including majority voting and several metamodel-based approaches.
arXiv Detail & Related papers (2025-01-17T10:16:18Z)
- DSGram: Dynamic Weighting Sub-Metrics for Grammatical Error Correction in the Era of Large Language Models [39.493913608472404]
Large language model (LLM)-based Grammatical Error Correction (GEC) models often produce corrections that diverge from provided gold references.
This discrepancy undermines the reliability of traditional reference-based evaluation metrics.
We propose a novel evaluation framework for GEC models, DSGram, integrating Semantic Coherence, Edit Level, and Fluency, and utilizing a dynamic weighting mechanism.
arXiv Detail & Related papers (2024-12-17T11:54:16Z)
- Efficient and Interpretable Grammatical Error Correction with Mixture of Experts [33.748193858033346]
We propose a mixture-of-experts model, MoECE, for grammatical error correction.
Our model successfully achieves the performance of T5-XL with three times fewer effective parameters.
arXiv Detail & Related papers (2024-10-30T23:27:54Z)
- LM-Combiner: A Contextual Rewriting Model for Chinese Grammatical Error Correction [49.0746090186582]
Over-correction is a critical problem in Chinese grammatical error correction (CGEC) task.
Recent work using model ensemble methods can effectively mitigate over-correction and improve the precision of the GEC system.
We propose the LM-Combiner, a rewriting model that can directly modify the over-correction of GEC system outputs without a model ensemble.
arXiv Detail & Related papers (2024-03-26T06:12:21Z)
- RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation [64.2568239429946]
We introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.
We reveal that state-of-the-art GEC systems still lack sufficient robustness against context perturbations.
arXiv Detail & Related papers (2023-10-11T08:33:23Z)
- Quality-Based Conditional Processing in Multi-Biometrics: Application to Sensor Interoperability [63.05238390013457]
We describe and evaluate the ATVS-UAM fusion approach submitted to the quality-based evaluation of the 2007 BioSecure Multimodal Evaluation Campaign.
Our approach is based on linear logistic regression, in which fused scores tend to be log-likelihood-ratios.
Results show that the proposed approach outperforms all the rule-based fusion schemes.
arXiv Detail & Related papers (2022-11-24T12:11:22Z)
- Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems [66.61691401921296]
This paper presents an investigation over several methods of score calibration for deep speaker embedding extractors.
An additional focus of this research is to estimate the impact of score normalization on the calibration performance of the system.
arXiv Detail & Related papers (2022-03-28T21:22:22Z)
- Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction [98.31440090585376]
Grammatical Error Correction (GEC) aims to correct writing errors and help language learners improve their writing skills.
Existing GEC models tend to produce spurious corrections or fail to detect many errors.
This paper presents the Neural Verification Network (VERNet) for GEC quality estimation with multiple hypotheses.
arXiv Detail & Related papers (2021-05-10T15:04:25Z)
- A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
These datasets contain a non-negligible amount of "noise" where errors were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
arXiv Detail & Related papers (2020-10-07T04:45:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.