CL$^2$GEC: A Multi-Discipline Benchmark for Continual Learning in Chinese Literature Grammatical Error Correction
- URL: http://arxiv.org/abs/2509.13672v1
- Date: Wed, 17 Sep 2025 03:54:52 GMT
- Title: CL$^2$GEC: A Multi-Discipline Benchmark for Continual Learning in Chinese Literature Grammatical Error Correction
- Authors: Shang Qin, Jingheng Ye, Yinghui Li, Hai-Tao Zheng, Qi Li, Jinxiao Shan, Zhixing Li, Hong-Gee Kim,
- Abstract summary: CL$2$GEC is the first Continual Learning benchmark for Chinese Literature Grammatical Error Correction.<n>Our benchmark includes 10,000 human-annotated sentences spanning 10 disciplines.<n>We evaluate large language models under sequential tuning, parameter-efficient adaptation, and four representative CL algorithms.
- Score: 28.004597594108514
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing demand for automated writing assistance in diverse academic domains highlights the need for robust Chinese Grammatical Error Correction (CGEC) systems that can adapt across disciplines. However, existing CGEC research largely lacks dedicated benchmarks for multi-disciplinary academic writing, overlooking continual learning (CL) as a promising solution to handle domain-specific linguistic variation and prevent catastrophic forgetting. To fill this crucial gap, we introduce CL$^2$GEC, the first Continual Learning benchmark for Chinese Literature Grammatical Error Correction, designed to evaluate adaptive CGEC across multiple academic fields. Our benchmark includes 10,000 human-annotated sentences spanning 10 disciplines, each exhibiting distinct linguistic styles and error patterns. CL$^2$GEC focuses on evaluating grammatical error correction in a continual learning setting, simulating sequential exposure to diverse academic disciplines to reflect real-world editorial dynamics. We evaluate large language models under sequential tuning, parameter-efficient adaptation, and four representative CL algorithms, using both standard GEC metrics and continual learning metrics adapted to task-level variation. Experimental results reveal that regularization-based methods mitigate forgetting more effectively than replay-based or naive sequential approaches. Our benchmark provides a rigorous foundation for future research in adaptive grammatical error correction across diverse academic domains.
Related papers
- COLA-GEC: A Bidirectional Framework for Enhancing Grammatical Acceptability and Error Correction [2.631955426232593]
This paper introduces COLA-GEC, a novel bidirectional framework that enhances both tasks through mutual knowledge transfer.<n>First, we augment grammatical acceptability models using GEC datasets, significantly improving their performance across multiple languages.<n>Second, we integrate grammatical acceptability signals into GEC model training via a dynamic loss function, effectively guiding corrections toward grammatically acceptable outputs.
arXiv Detail & Related papers (2025-07-16T03:29:05Z) - Chinese Grammatical Error Correction: A Survey [2.6914312267666705]
Chinese Grammatical Error Correction (CGEC) is a critical task in Natural Language Processing.<n>CGEC addresses the growing demand for automated writing assistance in both second-language (L2) and native (L1) Chinese writing.<n>This survey provides a comprehensive review of CGEC research, covering datasets, annotation schemes, evaluation methodologies, and system advancements.
arXiv Detail & Related papers (2025-04-01T17:14:50Z) - C-LLM: Learn to Check Chinese Spelling Errors Character by Character [61.53865964535705]
We propose C-LLM, a Large Language Model-based Chinese Spell Checking method that learns to check errors Character by Character.
C-LLM achieves an average improvement of 10% over existing methods.
arXiv Detail & Related papers (2024-06-24T11:16:31Z) - Improving Seq2Seq Grammatical Error Correction via Decoding
Interventions [40.52259641181596]
We propose a unified decoding intervention framework that employs an external critic to assess the appropriateness of the token to be generated incrementally.
We discover and investigate two types of critics: a pre-trained left-to-right language model critic and an incremental target-side grammatical error detector critic.
Our framework consistently outperforms strong baselines and achieves results competitive with state-of-the-art methods.
arXiv Detail & Related papers (2023-10-23T03:36:37Z) - CLEME: Debiasing Multi-reference Evaluation for Grammatical Error
Correction [32.44051877804761]
Chunk-LEvel Multi-reference Evaluation (CLEME) is designed to evaluate Grammatical Error Correction (GEC) systems in the multi-reference evaluation setting.
We conduct experiments on six English reference sets based on the CoNLL-2014 shared task.
arXiv Detail & Related papers (2023-05-18T08:57:17Z) - Linguistic Rules-Based Corpus Generation for Native Chinese Grammatical
Error Correction [36.74272211767197]
We propose a linguistic rules-based approach to construct large-scale CGEC training corpora with automatically generated grammatical errors.
We present a challenging CGEC benchmark derived entirely from errors made by native Chinese speakers in real-world scenarios.
arXiv Detail & Related papers (2022-10-19T10:20:39Z) - MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese
Grammatical Error Correction [51.3754092853434]
MuCGEC is a multi-reference evaluation dataset for Chinese Grammatical Error Correction (CGEC)
It consists of 7,063 sentences collected from three different Chinese-as-a-Second-Language (CSL) learner sources.
Each sentence has been corrected by three annotators, and their corrections are meticulously reviewed by an expert, resulting in 2.3 references per sentence.
arXiv Detail & Related papers (2022-04-23T05:20:38Z) - A Unified Strategy for Multilingual Grammatical Error Correction with
Pre-trained Cross-Lingual Language Model [100.67378875773495]
We propose a generic and language-independent strategy for multilingual Grammatical Error Correction.
Our approach creates diverse parallel GEC data without any language-specific operations.
It achieves the state-of-the-art results on the NLPCC 2018 Task 2 dataset (Chinese) and obtains competitive performance on Falko-Merlin (German) and RULEC-GEC (Russian)
arXiv Detail & Related papers (2022-01-26T02:10:32Z) - On Cross-Lingual Retrieval with Multilingual Text Encoders [51.60862829942932]
We study the suitability of state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
We benchmark their performance in unsupervised ad-hoc sentence- and document-level CLIR experiments.
We evaluate multilingual encoders fine-tuned in a supervised fashion (i.e., we learn to rank) on English relevance data in a series of zero-shot language and domain transfer CLIR experiments.
arXiv Detail & Related papers (2021-12-21T08:10:27Z) - LM-Critic: Language Models for Unsupervised Grammatical Error Correction [128.9174409251852]
We show how to leverage a pretrained language model (LM) in defining an LM-Critic, which judges a sentence to be grammatical.
We apply this LM-Critic and BIFI along with a large set of unlabeled sentences to bootstrap realistic ungrammatical / grammatical pairs for training a corrector.
arXiv Detail & Related papers (2021-09-14T17:06:43Z) - Improving the Efficiency of Grammatical Error Correction with Erroneous
Span Detection and Correction [106.63733511672721]
We propose a novel language-independent approach to improve the efficiency for Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection ( ESD) and Erroneous Span Correction (ESC)
ESD identifies grammatically incorrect text spans with an efficient sequence tagging model. ESC leverages a seq2seq model to take the sentence with annotated erroneous spans as input and only outputs the corrected text for these spans.
Experiments show our approach performs comparably to conventional seq2seq approaches in both English and Chinese GEC benchmarks with less than 50% time cost for inference.
arXiv Detail & Related papers (2020-10-07T08:29:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.