Related papers: GEE! Grammar Error Explanation with Large Language Models

URL: http://arxiv.org/abs/2311.09517v1
Date: Thu, 16 Nov 2023 02:45:47 GMT
Title: GEE! Grammar Error Explanation with Large Language Models
Authors: Yixiao Song, Kalpesh Krishna, Rajesh Bhatt, Kevin Gimpel, Mohit Iyyer
Abstract summary: We propose the task of grammar error explanation, where a system needs to provide one-sentence explanations for each grammatical error in a pair of erroneous and corrected sentences. We analyze the capability of GPT-4 in grammar error explanation, and find that it only produces explanations for 60.2% of the errors using one-shot prompting. We develop a two-step pipeline that leverages fine-tuned and prompted large language models to perform structured atomic token edit extraction.
Score: 64.16199533560017
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Grammatical error correction tools are effective at correcting grammatical errors in users' input sentences but do not provide users with natural language explanations about their errors. Such explanations are essential for helping users learn the language by gaining a deeper understanding of its grammatical rules (DeKeyser, 2003; Ellis et al., 2006). To address this gap, we propose the task of grammar error explanation, where a system needs to provide one-sentence explanations for each grammatical error in a pair of erroneous and corrected sentences. We analyze the capability of GPT-4 in grammar error explanation, and find that it only produces explanations for 60.2% of the errors using one-shot prompting. To improve upon this performance, we develop a two-step pipeline that leverages fine-tuned and prompted large language models to perform structured atomic token edit extraction, followed by prompting GPT-4 to generate explanations. We evaluate our pipeline on German and Chinese grammar error correction data sampled from language learners with a wide range of proficiency levels. Human evaluation reveals that our pipeline produces 93.9% and 98.0% correct explanations for German and Chinese data, respectively. To encourage further research in this area, we will open-source our data and code.

Related papers

Leveraging Prompt-Tuning for Bengali Grammatical Error Explanation Using Large Language Models [0.0]
We propose a novel three-step prompt-tuning method for Bengali Grammatical Error Explanation (BGEE). Our approach involves identifying and categorizing grammatical errors in Bengali sentences, generating corrected versions of the sentences, and providing natural language explanations for each identified error. We evaluate the performance of our BGEE system using both automated evaluation metrics and human evaluation conducted by experienced Bengali language experts.
arXiv Detail & Related papers (2025-04-08T03:38:01Z)
TGEA: An error-annotated dataset and benchmark tasks for text generation from pretrained language models [57.758735361535486]
TGEA is an error-annotated dataset for text generation from pretrained language models (PLMs). We create an error taxonomy to cover 24 types of errors occurring in PLM-generated sentences. This is the first dataset with comprehensive annotations for PLM-generated texts.
arXiv Detail & Related papers (2025-03-06T09:14:02Z)
How Ready Are Generative Pre-trained Large Language Models for Explaining Bengali Grammatical Errors? [0.4857223913212445]
Grammatical error correction (GEC) tools, powered by advanced generative artificial intelligence (AI), competently correct linguistic inaccuracies in user input. However, they often fall short in providing essential natural language explanations. In such languages, grammatical error explanation (GEE) systems should not only correct sentences but also provide explanations for errors.
arXiv Detail & Related papers (2024-05-27T15:56:45Z)
GrammarGPT: Exploring Open-Source LLMs for Native Chinese Grammatical Error Correction with Supervised Fine-Tuning [46.75740002185691]
We introduce GrammarGPT, an open-source Large Language Model, to explore its potential for native Chinese grammatical error correction. For grammatical errors with clues, we proposed a method to guide ChatGPT to generate ungrammatical sentences by providing those clues. For grammatical errors without clues, we collected ungrammatical sentences from publicly available websites and manually corrected them.
arXiv Detail & Related papers (2023-07-26T02:45:38Z)
Enhancing Grammatical Error Correction Systems with Explanations [45.69642286275681]
Grammatical error correction systems improve written communication by detecting and correcting language mistakes. We introduce EXPECT, a dataset annotated with evidence words and grammatical error types. Human evaluation verifies our explainable GEC system's explanations can assist second-language learners in determining whether to accept a correction suggestion.
arXiv Detail & Related papers (2023-05-25T03:00:49Z)
A Syntax-Guided Grammatical Error Correction Model with Dependency Tree Correction [83.14159143179269]
Grammatical Error Correction (GEC) is a task of detecting and correcting grammatical errors in sentences. We propose a syntax-guided GEC model (SG-GEC) which adopts the graph attention mechanism to utilize the syntactic knowledge of dependency trees. We evaluate our model on public benchmarks of GEC task and it achieves competitive results.
arXiv Detail & Related papers (2021-11-05T07:07:48Z)
Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors [3.55517579369797]
We show that 5 to 10% of training data are enough for a BERT-based error detection method to achieve performance equivalent to a non-language model-based method. We also show with pseudo error data that it actually exhibits such nice properties in learning rules for recognizing various types of error.
arXiv Detail & Related papers (2021-08-27T10:37:14Z)
Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction [106.63733511672721]
We propose a novel language-independent approach to improve the efficiency of Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC). ESD identifies grammatically incorrect text spans with an efficient sequence tagging model. ESC leverages a seq2seq model to take the sentence with annotated erroneous spans as input and only outputs the corrected text for these spans. Experiments show our approach performs comparably to conventional seq2seq approaches in both English and Chinese GEC benchmarks with less than 50% time cost for inference.
arXiv Detail & Related papers (2020-10-07T08:29:11Z)
On the Robustness of Language Encoders against Grammatical Errors [66.05648604987479]
We collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data. Results confirm that the performance of all tested models is affected but the degree of impact varies.
arXiv Detail & Related papers (2020-05-12T11:01:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.