EXCGEC: A Benchmark of Edit-wise Explainable Chinese Grammatical Error Correction
- URL: http://arxiv.org/abs/2407.00924v1
- Date: Mon, 1 Jul 2024 03:06:41 GMT
- Title: EXCGEC: A Benchmark of Edit-wise Explainable Chinese Grammatical Error Correction
- Authors: Jingheng Ye, Shang Qin, Yinghui Li, Xuxin Cheng, Libo Qin, Hai-Tao Zheng, Peng Xing, Zishan Xu, Guo Cheng, Zhao Wei
- Abstract summary: This paper introduces the task of EXplainable GEC (EXGEC), which focuses on the integral role of both correction and explanation tasks.
We propose EXCGEC, a tailored benchmark for Chinese EXGEC consisting of 8,216 explanation-augmented samples.
- Score: 21.869368698234247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing studies explore the explainability of Grammatical Error Correction (GEC) only in limited scenarios, ignoring the interaction between corrections and explanations. To bridge this gap, this paper introduces the task of EXplainable GEC (EXGEC), which focuses on the integral role of both correction and explanation tasks. To facilitate the task, we propose EXCGEC, a tailored benchmark for Chinese EXGEC consisting of 8,216 explanation-augmented samples featuring the design of hybrid edit-wise explanations. We benchmark several series of LLMs in multiple settings, covering post-explaining and pre-explaining. To promote the development of the task, we introduce a comprehensive suite of automatic metrics and conduct human evaluation experiments to demonstrate the human consistency of the automatic metrics for free-text explanations. All code and data will be released after review.
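The "edit-wise" design pairs each individual edit between a source sentence and its correction with its own explanation. As an illustration of what extracting such edits can look like, here is a minimal sketch using Python's standard-library `difflib`; the function name and edit representation are hypothetical, not the paper's actual implementation.

```python
import difflib

def extract_edits(source: str, corrected: str):
    """Extract character-level edits between a source sentence and its
    correction. Each edit is a ((start, end), src_text, tgt_text) tuple over
    the source string; in an edit-wise setup, each tuple would then be paired
    with a free-text explanation."""
    matcher = difflib.SequenceMatcher(a=source, b=corrected)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # 'replace', 'delete', or 'insert'
            edits.append(((i1, i2), source[i1:i2], corrected[j1:j2]))
    return edits
```

Applying the extracted edits left to right to the source string reproduces the corrected sentence, which makes the representation convenient for both evaluation and explanation alignment.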
Related papers
- Controlled Generation with Prompt Insertion for Natural Language Explanations in Grammatical Error Correction [50.66922361766939]
It is crucial to ensure the user's comprehension of the reason for a correction.
Existing studies present tokens, examples, and hints as the basis for a correction, but do not directly explain the reasons behind it.
Generating explanations for GEC corrections involves aligning input and output tokens, identifying correction points, and presenting corresponding explanations consistently.
This study introduces a method called controlled generation with Prompt Insertion (PI) so that LLMs can explain the reasons for corrections in natural language.
arXiv Detail & Related papers (2023-09-20T16:14:10Z)
- XATU: A Fine-grained Instruction-based Benchmark for Explainable Text Updates [7.660511135287692]
This paper introduces XATU, the first benchmark specifically designed for fine-grained instruction-based explainable text editing.
XATU considers finer-grained text editing tasks of varying difficulty, incorporating lexical, syntactic, semantic, and knowledge-intensive edit aspects.
We demonstrate the effectiveness of instruction tuning and the impact of underlying architecture across various editing tasks.
arXiv Detail & Related papers (2023-09-20T04:58:59Z)
- Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
- CLEME: Debiasing Multi-reference Evaluation for Grammatical Error Correction [32.44051877804761]
Chunk-LEvel Multi-reference Evaluation (CLEME) is designed to evaluate Grammatical Error Correction (GEC) systems in the multi-reference evaluation setting.
We conduct experiments on six English reference sets based on the CoNLL-2014 shared task.
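Multi-reference GEC evaluation conventionally scores a system's edits against each reference edit set and keeps the best F0.5. The following is a generic edit-level F0.5 sketch of that convention, not the chunk-level CLEME algorithm itself; edits are assumed to be hashable tuples such as `(start, end, replacement)`.

```python
def best_edit_f05(hyp_edits, ref_edit_sets):
    """Score hypothesis edits against multiple reference edit sets and
    return the best F0.5 (precision weighted over recall, beta = 0.5)."""
    hyp = set(hyp_edits)
    best = 0.0
    for ref in ref_edit_sets:
        ref = set(ref)
        tp = len(hyp & ref)                       # edits matching this reference
        p = tp / len(hyp) if hyp else 1.0          # precision
        r = tp / len(ref) if ref else 1.0          # recall
        # F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), with beta = 0.5
        f05 = (1.25 * p * r) / (0.25 * p + r) if (p + r) else 0.0
        best = max(best, f05)
    return best
```

Taking the maximum over references is what debiasing approaches such as CLEME refine, since naive per-reference scoring can penalize valid corrections that happen to match a different reference.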
arXiv Detail & Related papers (2023-05-18T08:57:17Z)
- Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z)
- FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction [6.116341682577877]
Grammatical Error Correction (GEC) has recently been broadly applied in automatic correction and proofreading systems.
We present FCGEC, a fine-grained corpus to detect, identify, and correct grammatical errors.
arXiv Detail & Related papers (2022-10-22T06:29:05Z)
- Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity [76.20568599642799]
Chinese spelling check (CSC) is a fundamental NLP task that detects and corrects spelling errors in Chinese texts.
In this paper, we consider introducing an auxiliary task of Chinese pronunciation prediction (CPP) to improve CSC.
We propose SCOPE, which builds two parallel decoders on top of a shared encoder: one for the primary CSC task and the other for a fine-grained auxiliary CPP task.
arXiv Detail & Related papers (2022-10-20T03:42:35Z)
- Improving Pre-trained Language Models with Syntactic Dependency Prediction Task for Chinese Semantic Error Recognition [52.55136323341319]
Existing Chinese text error detection mainly focuses on spelling and simple grammatical errors.
Chinese semantic errors are understudied and so complex that humans cannot easily recognize them.
arXiv Detail & Related papers (2022-04-15T13:55:32Z)
- Generating Fluent Fact Checking Explanations with Unsupervised Post-Editing [22.5444107755288]
We present an iterative edit-based algorithm that uses only phrase-level edits to perform unsupervised post-editing of ruling comments.
We show that our model generates explanations that are fluent, readable, non-redundant, and cover important information for the fact check.
arXiv Detail & Related papers (2021-12-13T15:31:07Z)
- Towards Minimal Supervision BERT-based Grammar Error Correction [81.90356787324481]
We try to incorporate contextual information from a pre-trained language model to leverage annotations and benefit multilingual scenarios.
Results show the strong potential of Bidirectional Encoder Representations from Transformers (BERT) in the grammatical error correction task.
arXiv Detail & Related papers (2020-01-10T15:45:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.