Ungrammatical-syntax-based In-context Example Selection for Grammatical Error Correction
- URL: http://arxiv.org/abs/2403.19283v1
- Date: Thu, 28 Mar 2024 10:05:57 GMT
- Title: Ungrammatical-syntax-based In-context Example Selection for Grammatical Error Correction
- Authors: Chenming Tang, Fanyi Qu, Yunfang Wu
- Abstract summary: In this paper, we propose a novel ungrammatical-syntax-based in-context example selection strategy for grammatical error correction.
Specifically, we measure similarity of sentences based on their syntactic structures with diverse algorithms, and identify optimal ICL examples sharing the most similar ill-formed syntax to the test input.
- Score: 8.655807096424732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the era of large language models (LLMs), in-context learning (ICL) stands out as an effective prompting strategy that explores LLMs' potency across various tasks. However, applying LLMs to grammatical error correction (GEC) is still a challenging task. In this paper, we propose a novel ungrammatical-syntax-based in-context example selection strategy for GEC. Specifically, we measure similarity of sentences based on their syntactic structures with diverse algorithms, and identify optimal ICL examples sharing the most similar ill-formed syntax to the test input. Additionally, we carry out a two-stage process to further improve the quality of selection results. On benchmark English GEC datasets, empirical results show that our proposed ungrammatical-syntax-based strategies outperform commonly-used word-matching or semantics-based methods with multiple LLMs. This indicates that for a syntax-oriented task like GEC, paying more attention to syntactic information can effectively boost LLMs' performance. Our code will be publicly available after the publication of this paper.
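The selection strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual algorithm: it assumes dependency triples have already been extracted from the erroneous source sentences (e.g., by an off-the-shelf parser), and it uses a simple Jaccard-style overlap as a stand-in for the tree-matching algorithms the paper evaluates. All function names and data here are hypothetical.

```python
from collections import Counter

def syntax_similarity(triples_a, triples_b):
    """Jaccard-style overlap between two bags of dependency triples.
    A simplified stand-in for the paper's tree-similarity algorithms."""
    a, b = Counter(triples_a), Counter(triples_b)
    inter = sum((a & b).values())   # multiset intersection size
    union = sum((a | b).values())   # multiset union size
    return inter / union if union else 0.0

def select_icl_examples(test_triples, pool, k=2):
    """Rank candidate (sentence, triples) pairs by similarity of their
    ill-formed syntax to the test input; return the top-k sentences."""
    ranked = sorted(pool,
                    key=lambda ex: syntax_similarity(test_triples, ex[1]),
                    reverse=True)
    return [sentence for sentence, _ in ranked[:k]]

# Toy (head POS, relation, dependent POS) triples -- in practice these
# would come from parsing the *ungrammatical* source sentences.
test = [("VERB", "nsubj", "NOUN"), ("VERB", "obj", "NOUN"), ("NOUN", "det", "DET")]
pool = [
    ("He go to school yesterday.",
     [("VERB", "nsubj", "PRON"), ("VERB", "obj", "NOUN")]),
    ("She have two cat.",
     [("VERB", "nsubj", "PRON"), ("VERB", "obj", "NOUN"), ("NOUN", "det", "NUM")]),
    ("Blue the sky is.",
     [("ADJ", "amod", "NOUN")]),
]
print(select_icl_examples(test, pool, k=2))
```

The chosen examples share the most overlapping ungrammatical structure with the test input; the paper's two-stage process would then further re-rank or filter this candidate set.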
Related papers
- Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding [71.01099784480597]
Large language models (LLMs) excel at a range of tasks through in-context learning (ICL).
We introduce In-Context Contrastive Decoding (ICCD), a novel method that emphasizes input-label mapping.
ICCD emphasizes input-label mapping by contrasting the output distributions between positive and negative in-context examples.
arXiv Detail & Related papers (2025-02-19T14:04:46Z)
- Explanation based In-Context Demonstrations Retrieval for Multilingual Grammatical Error Correction [19.95974494301433]
Grammatical error correction (GEC) aims to correct grammatical, spelling, and semantic errors in natural language text.
We propose a novel retrieval method based on natural language grammatical error explanations (GEE).
Our method retrieves suitable few-shot demonstrations by matching the GEE of the test input with that of pre-constructed database samples.
arXiv Detail & Related papers (2025-02-12T15:41:43Z)
- LLMCL-GEC: Advancing Grammatical Error Correction with LLM-Driven Curriculum Learning [44.010834543396165]
Large-scale language models (LLMs) have demonstrated remarkable capabilities in specific natural language processing (NLP) tasks.
However, they may still lack proficiency compared to specialized models in certain domains, such as grammatical error correction (GEC).
arXiv Detail & Related papers (2024-12-17T05:09:07Z)
- PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from Related Example Banks [57.86928556668849]
Large Language Models (LLMs) have recently demonstrated impressive few-shot learning capabilities through in-context learning (ICL).
ICL performance is highly dependent on the choice of few-shot demonstrations, making the selection of optimal examples a persistent research challenge.
In this work, we propose PromptRefine, a novel Alternating Minimization approach for example selection that improves ICL performance on low-resource Indic languages.
arXiv Detail & Related papers (2024-12-07T17:51:31Z) - SCOI: Syntax-augmented Coverage-based In-context Example Selection for Machine Translation [13.87098305304058]
In this work, we introduce syntactic knowledge to select better in-context examples for machine translation (MT).
We propose a new strategy, namely Syntax-augmented COverage-based In-context example selection (SCOI).
Our proposed SCOI obtains the highest average COMET score among all learning-free methods.
arXiv Detail & Related papers (2024-08-09T05:25:17Z) - ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z) - Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation [13.87098305304058]
In-context learning (ICL) is the trending prompting strategy in the era of large language models (LLMs).
Previous works on in-context example selection for machine translation (MT) focus on superficial word-level features.
We propose a syntax-based in-context example selection method for MT, by computing the syntactic similarity between dependency trees.
arXiv Detail & Related papers (2024-03-28T10:13:34Z)
- Prompting open-source and commercial language models for grammatical error correction of English learner text [19.192210777082053]
Large language models (LLMs) can be prompted to produce text that is fluent and grammatical.
We evaluate how well LLMs can perform at grammatical error correction (GEC) by measuring their performance on established benchmark datasets.
We find that several open-source models outperform commercial ones on minimal edit benchmarks, and that in some settings zero-shot prompting is just as competitive as few-shot prompting.
arXiv Detail & Related papers (2024-01-15T14:19:47Z)
- Which Syntactic Capabilities Are Statistically Learned by Masked Language Models for Code? [51.29970742152668]
We highlight that relying on accuracy-based measurements may lead to an overestimation of models' capabilities.
To address these issues, we introduce a technique called SyntaxEval for assessing syntactic capabilities.
arXiv Detail & Related papers (2024-01-03T02:44:02Z)
- kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning [50.40636157214161]
Task-Oriented Parsing (TOP) enables conversational assistants to interpret user commands expressed in natural language.
LLMs have achieved impressive performance in generating computer programs from natural language prompts.
This paper focuses on harnessing the capabilities of LLMs for semantic parsing tasks.
arXiv Detail & Related papers (2023-12-17T17:26:50Z)
- A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model [100.67378875773495]
We propose a generic and language-independent strategy for multilingual Grammatical Error Correction.
Our approach creates diverse parallel GEC data without any language-specific operations.
It achieves state-of-the-art results on the NLPCC 2018 Task 2 dataset (Chinese) and obtains competitive performance on Falko-Merlin (German) and RULEC-GEC (Russian).
arXiv Detail & Related papers (2022-01-26T02:10:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.