GPT-3.5 for Grammatical Error Correction
- URL: http://arxiv.org/abs/2405.08469v1
- Date: Tue, 14 May 2024 09:51:09 GMT
- Title: GPT-3.5 for Grammatical Error Correction
- Authors: Anisia Katinskaia, Roman Yangarber
- Abstract summary: This paper investigates the application of GPT-3.5 for Grammatical Error Correction (GEC) in multiple languages.
We conduct automatic evaluations of the corrections proposed by GPT-3.5 using several methods.
For English, GPT-3.5 demonstrates high recall, generates fluent corrections, and generally preserves sentence semantics.
However, human evaluation for both English and Russian reveals that, despite its strong error-detection capabilities, GPT-3.5 struggles with several error types.
- Score: 0.4757470449749875
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper investigates the application of GPT-3.5 for Grammatical Error Correction (GEC) in multiple languages in several settings: zero-shot GEC, fine-tuning for GEC, and using GPT-3.5 to re-rank correction hypotheses generated by other GEC models. In the zero-shot setting, we conduct automatic evaluations of the corrections proposed by GPT-3.5 using several methods: estimating grammaticality with language models (LMs), the Scribendi test, and comparing the semantic embeddings of sentences. GPT-3.5 has a known tendency to over-correct erroneous sentences and propose alternative corrections. For several languages, such as Czech, German, Russian, Spanish, and Ukrainian, GPT-3.5 substantially alters the source sentences, including their semantics, which presents significant challenges for evaluation with reference-based metrics. For English, GPT-3.5 demonstrates high recall, generates fluent corrections, and generally preserves sentence semantics. However, human evaluation for both English and Russian reveals that, despite its strong error-detection capabilities, GPT-3.5 struggles with several error types, including punctuation mistakes, tense errors, syntactic dependencies between words, and lexical compatibility at the sentence level.
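As a rough illustration of the zero-shot setting and the embedding-based evaluation described in the abstract, the sketch below pairs a GPT-3.5 correction call with a sentence-similarity check. The prompt wording, model name (gpt-3.5-turbo), embedding model, and 0.9 threshold are illustrative assumptions, not the authors' exact setup.

```python
# Sketch: zero-shot GEC with GPT-3.5 plus a semantic-similarity check.
# Prompt, model names, and the 0.9 threshold are illustrative assumptions.
from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

client = OpenAI()  # reads OPENAI_API_KEY from the environment
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def correct(sentence: str) -> str:
    """Ask GPT-3.5 for a minimal grammatical correction."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Correct the grammatical errors in this sentence. "
                       f"Change as little as possible:\n{sentence}",
        }],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

def semantics_preserved(source: str, corrected: str,
                        threshold: float = 0.9) -> bool:
    """Flag corrections that drift too far from the source meaning."""
    emb = embedder.encode([source, corrected], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item() >= threshold

src = "She go to school every days."
hyp = correct(src)
print(hyp, semantics_preserved(src, hyp))
```

A low cosine similarity flags the over-correction behavior the abstract describes, where the model rewrites the sentence instead of minimally correcting it.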
Related papers
- Revisiting Meta-evaluation for Grammatical Error Correction [14.822205658480813]
SEEDA is a new dataset for GEC meta-evaluation.
It consists of corrections with human ratings at two different granularities.
The results suggest that edit-based metrics may have been underestimated in existing studies.
arXiv Detail & Related papers (2024-03-05T05:53:09Z)
- An Analysis of Language Frequency and Error Correction for Esperanto [0.0]
We conduct a comprehensive frequency analysis using the Eo-GP dataset.
We then introduce the Eo-GEC dataset, derived from authentic user cases.
Using GPT-3.5 and GPT-4, our experiments show that GPT-4 outperforms GPT-3.5 in both automated and human evaluations.
arXiv Detail & Related papers (2024-02-15T04:10:25Z)
- Enhancing conversational quality in language learning chatbots: An evaluation of GPT4 for ASR error correction [20.465220855548292]
This paper explores the use of GPT4 for ASR error correction in conversational settings.
We find that transcriptions corrected by GPT4 lead to higher conversation quality, despite an increase in WER.
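Since that finding hinges on word error rate, here is a minimal, hand-rolled WER computation for reference (an illustration, not the paper's evaluation code):

```python
# Minimal word error rate (WER): edit distance over word tokens,
# normalized by reference length. Illustrative, not the paper's code.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits needed to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1] / max(len(ref), 1)

print(wer("i want to book a flight", "i want to book the flight"))  # 1/6
```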
arXiv Detail & Related papers (2023-07-19T04:25:21Z)
- Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation [41.94480044074273]
ChatGPT is a large-scale language model based on the advanced GPT-3.5 architecture.
We design zero-shot chain-of-thought (CoT) and few-shot CoT settings using in-context learning for ChatGPT.
Our evaluation involves assessing ChatGPT's performance on five official test sets in three different languages, along with three document-level GEC test sets in English.
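A minimal sketch of how a few-shot chain-of-thought prompt for GEC might be assembled; the instruction wording and example pairs are illustrative assumptions, not the paper's prompts:

```python
# Sketch of a few-shot chain-of-thought (CoT) GEC prompt.
# Example pairs and instruction wording are illustrative assumptions.
FEW_SHOT_COT = [
    ("He go to school yesterday.",
     "The verb 'go' conflicts with the past-time adverb 'yesterday', "
     "so it should be 'went'.",
     "He went to school yesterday."),
    ("She is interesting in music.",
     "'interesting' describes an effect on others; the intended meaning "
     "needs 'interested'.",
     "She is interested in music."),
]

def build_prompt(sentence: str) -> str:
    parts = ["Correct the grammatical errors. Explain, then give the fix."]
    for src, reasoning, tgt in FEW_SHOT_COT:
        parts.append(f"Input: {src}\nReasoning: {reasoning}\nCorrection: {tgt}")
    parts.append(f"Input: {sentence}\nReasoning:")
    return "\n\n".join(parts)

print(build_prompt("They has finished the homeworks."))
```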
arXiv Detail & Related papers (2023-04-04T12:33:40Z)
- Analyzing the Performance of GPT-3.5 and GPT-4 in Grammatical Error Correction [28.58384091374763]
GPT-3 and GPT-4 models are powerful, achieving high performance on a variety of Natural Language Processing tasks.
We perform experiments testing the capabilities of a GPT-3.5 model (text-davinci-003) and a GPT-4 model (gpt-4-0314) on major GEC benchmarks.
We report the performance of our best prompt on the BEA-2019 and JFLEG datasets, finding that the GPT models can perform well in a sentence-level revision setting.
arXiv Detail & Related papers (2023-03-25T03:08:49Z)
- CLSE: Corpus of Linguistically Significant Entities [58.29901964387952]
We release a Corpus of Linguistically Significant Entities (CLSE) annotated by experts.
CLSE covers 74 different semantic types to support various applications from airline ticketing to video games.
We create a linguistically representative NLG evaluation benchmark in three languages: French, Marathi, and Russian.
arXiv Detail & Related papers (2022-11-04T12:56:12Z)
- Czech Grammar Error Correction with a Large and Diverse Corpus [64.94696028072698]
We introduce a large and diverse Czech corpus annotated for grammatical error correction (GEC).
The Grammar Error Correction Corpus for Czech (GECCC) covers four domains, with error distributions ranging from high-error-density essays written by non-native speakers to website texts.
We compare several Czech GEC systems, including several Transformer-based ones, setting a strong baseline for future research.
arXiv Detail & Related papers (2022-01-14T18:20:47Z)
- A Syntax-Guided Grammatical Error Correction Model with Dependency Tree Correction [83.14159143179269]
Grammatical Error Correction (GEC) is the task of detecting and correcting grammatical errors in sentences.
We propose a syntax-guided GEC model (SG-GEC) which adopts the graph attention mechanism to utilize the syntactic knowledge of dependency trees.
We evaluate our model on public GEC benchmarks, where it achieves competitive results.
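A minimal sketch of the general mechanism named above, attention masked to dependency-tree arcs; this illustrates graph attention over a parse and is not the authors' SG-GEC architecture (assumes PyTorch):

```python
# Minimal graph attention over a dependency tree: each token attends
# only to its syntactic neighbors. Not the authors' SG-GEC model.
import torch
import torch.nn.functional as F

class DependencyGraphAttention(torch.nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q = torch.nn.Linear(dim, dim)
        self.k = torch.nn.Linear(dim, dim)
        self.v = torch.nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (tokens, dim); adj: (tokens, tokens) dependency adjacency (0/1)
        scores = self.q(h) @ self.k(h).T / h.size(-1) ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))  # arcs only
        return F.softmax(scores, dim=-1) @ self.v(h)

# Toy tree for "She go home": go -> She, go -> home, plus self-loops.
adj = torch.tensor([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]], dtype=torch.float)
h = torch.randn(3, 16)
out = DependencyGraphAttention(16)(h, adj)
print(out.shape)  # torch.Size([3, 16])
```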
arXiv Detail & Related papers (2021-11-05T07:07:48Z)
- LM-Critic: Language Models for Unsupervised Grammatical Error Correction [128.9174409251852]
We show how to leverage a pretrained language model (LM) to define an LM-Critic, which judges a sentence to be grammatical if the LM assigns it a higher probability than its local perturbations.
We apply this LM-Critic and BIFI (Break-It-Fix-It), along with a large set of unlabeled sentences, to bootstrap realistic ungrammatical/grammatical pairs for training a corrector.
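A minimal sketch of the LM-Critic criterion described above: a sentence is accepted as grammatical if no sentence in a small local neighborhood scores higher under a pretrained LM. The single-character-deletion neighborhood here is a toy simplification of the paper's perturbation set:

```python
# Sketch of the LM-Critic local-optimum criterion with GPT-2 as the LM.
# The perturbation set (single-character deletions) is a simplification.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def log_prob(sentence: str) -> float:
    """Approximate total log-probability: -(mean loss) * token count."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        return -lm(ids, labels=ids).loss.item() * ids.size(1)

def perturbations(sentence: str):
    # Toy neighborhood: drop one character at a time.
    for i in range(len(sentence)):
        yield sentence[:i] + sentence[i + 1:]

def is_grammatical(sentence: str) -> bool:
    base = log_prob(sentence)
    return all(log_prob(p) <= base for p in perturbations(sentence))

print(is_grammatical("The cat sat on the mat."))
```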
arXiv Detail & Related papers (2021-09-14T17:06:43Z)
- Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction [98.31440090585376]
Grammatical Error Correction (GEC) aims to correct writing errors and help language learners improve their writing skills.
Existing GEC models tend to produce spurious corrections or fail to detect many errors.
This paper presents the Neural Verification Network (VERNet) for GEC quality estimation with multiple hypotheses.
arXiv Detail & Related papers (2021-05-10T15:04:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.