Leveraging Denoised Abstract Meaning Representation for Grammatical
Error Correction
- URL: http://arxiv.org/abs/2307.02127v1
- Date: Wed, 5 Jul 2023 09:06:56 GMT
- Title: Leveraging Denoised Abstract Meaning Representation for Grammatical
Error Correction
- Authors: Hejing Cao and Dongyan Zhao
- Abstract summary: Grammatical Error Correction (GEC) is the task of correcting erroneous sentences into grammatically correct, semantically consistent, and coherent sentences.
We propose the AMR-GEC, a seq-to-seq model that incorporates denoised AMR as additional knowledge.
- Score: 53.55440811942249
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Grammatical Error Correction (GEC) is the task of correcting erroneous
sentences into grammatically correct, semantically consistent, and coherent
sentences. Popular GEC models either use large-scale synthetic corpora or rely on a
large number of human-designed rules. The former is costly to train, while the
latter requires considerable human expertise. In recent years, AMR, a
semantic representation framework, has been widely used by many natural
language tasks due to its completeness and flexibility. A non-negligible
concern is that AMRs of grammatically incorrect sentences may not be exactly
reliable. In this paper, we propose AMR-GEC, a seq-to-seq model that
incorporates denoised AMR as additional knowledge. Specifically, we design a
semantically aggregated GEC model and explore denoising methods to make the AMRs
more reliable. Experiments on the BEA-2019 shared task and the CoNLL-2014 shared
task show that AMR-GEC performs comparably to strong baselines trained
with large amounts of synthetic data. Compared with the T5 model trained on
synthetic data, AMR-GEC reduces training time by 32% with comparable inference
time. To the best of our knowledge, we are the first to
incorporate AMR for grammatical error correction.
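As a concrete illustration of the general idea, one simple way to expose AMR to a seq-to-seq model is to linearize the graph and append it to the source sentence. The sketch below uses a toy triple-based graph format and a hypothetical `<amr>` separator token; it is not the paper's actual semantic-aggregation architecture.

```python
# Toy sketch: linearize an AMR graph (given as (source, role, target) triples)
# depth-first and append it to the source sentence behind a separator token.
# The triple format, the "<amr>" marker, and the example graph are illustrative
# assumptions, not the encoding used by AMR-GEC.

def linearize_amr(node, triples, depth=0):
    """Depth-first linearization of an AMR graph rooted at `node`."""
    parts = [node]
    for src, role, tgt in triples:
        if src == node:
            parts.append(role)
            parts.extend(linearize_amr(tgt, triples, depth + 1))
    return parts if depth else ["(", *parts, ")"]

def build_input(sentence, root, triples, sep="<amr>"):
    """Concatenate the (possibly ungrammatical) sentence with its linearized AMR."""
    return f"{sentence} {sep} {' '.join(linearize_amr(root, triples))}"

# "He give a book ." roughly corresponds to (give-01 :ARG0 he :ARG1 book)
triples = [("give-01", ":ARG0", "he"), ("give-01", ":ARG1", "book")]
print(build_input("He give a book .", "give-01", triples))
# -> He give a book . <amr> ( give-01 :ARG0 he :ARG1 book )
```

A seq-to-seq model fed such concatenated inputs can attend to the semantic roles even when the surface form is ungrammatical, which is the intuition the paper builds on.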
Related papers
- Analyzing the Role of Semantic Representations in the Era of Large Language Models [104.18157036880287]
We investigate the role of semantic representations in the era of large language models (LLMs).
We propose an AMR-driven chain-of-thought prompting method, which we call AMRCoT.
We find it difficult to predict on which input examples AMR helps or hurts, but errors tend to arise with multi-word expressions.
arXiv Detail & Related papers (2024-05-02T17:32:59Z)
- RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation [64.2568239429946]
We introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.
We reveal that state-of-the-art GEC systems still lack sufficient robustness against context perturbations.
arXiv Detail & Related papers (2023-10-11T08:33:23Z)
- An AMR-based Link Prediction Approach for Document-level Event Argument Extraction [51.77733454436013]
Recent works have introduced Abstract Meaning Representation (AMR) for Document-level Event Argument Extraction (Doc-level EAE).
This work reformulates EAE as a link prediction problem on AMR graphs.
We propose a novel graph structure, Tailored AMR Graph (TAG), which compresses less informative subgraphs and edge types, integrates span information, and highlights surrounding events in the same document.
arXiv Detail & Related papers (2023-05-30T16:07:48Z)
- A Syntax-Guided Grammatical Error Correction Model with Dependency Tree Correction [83.14159143179269]
Grammatical Error Correction (GEC) is the task of detecting and correcting grammatical errors in sentences.
We propose a syntax-guided GEC model (SG-GEC) which adopts the graph attention mechanism to utilize the syntactic knowledge of dependency trees.
We evaluate our model on public GEC benchmarks, where it achieves competitive results.
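The core mechanism in such syntax-guided models can be illustrated simply: graph attention over a dependency tree amounts to ordinary attention whose weights are masked to tree neighbors. The sketch below is a minimal stand-in, not the SG-GEC implementation; the edge list and scores are made-up inputs.

```python
# Illustrative sketch (not the paper's implementation): restrict dot-product
# attention so each token attends only to itself and its dependency-tree
# neighbors, then renormalize with softmax over the allowed positions.

import math

def tree_masked_attention(scores, edges, n):
    """Softmax attention weights masked to dependency-tree neighbors."""
    allowed = {(i, i) for i in range(n)}
    for h, d in edges:                 # undirected head-dependent links
        allowed |= {(h, d), (d, h)}
    weights = []
    for i in range(n):
        row = [math.exp(scores[i][j]) if (i, j) in allowed else 0.0
               for j in range(n)]
        z = sum(row)
        weights.append([v / z for v in row])
    return weights

# 3 tokens, tree edges 0->1 and 1->2, uniform raw scores
w = tree_masked_attention([[0.0] * 3 for _ in range(3)], [(0, 1), (1, 2)], 3)
print(w[0])  # token 0 attends only to {0, 1}: [0.5, 0.5, 0.0]
```

In a real model the raw scores come from learned query/key projections; the mask is the only syntax-specific ingredient shown here.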
arXiv Detail & Related papers (2021-11-05T07:07:48Z)
- LM-Critic: Language Models for Unsupervised Grammatical Error Correction [128.9174409251852]
We show how to leverage a pretrained language model (LM) to define an LM-Critic, which judges whether a sentence is grammatical.
We apply this LM-Critic and BIFI (Break-It-Fix-It) along with a large set of unlabeled sentences to bootstrap realistic ungrammatical/grammatical pairs for training a corrector.
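The LM-Critic's local-optimum criterion can be sketched in a few lines: a sentence is judged grammatical iff no sentence in a small perturbation neighborhood scores better. The toy scorer below (penalizing hard-coded bad bigrams) is an assumption standing in for real LM log-probabilities, and the substitution table is invented for the example.

```python
# Hedged sketch of the LM-Critic idea: a sentence is "grammatical" iff it is a
# local optimum of a fluency score over its edit neighborhood. A real critic
# scores with a pretrained LM; this toy scorer only penalizes known-bad bigrams.

BAD_BIGRAMS = {("he", "give"), ("a", "books")}

def score(sentence):
    """Toy fluency score (higher is better); stand-in for LM log-probability."""
    toks = sentence.lower().split()
    return -sum(1 for bg in zip(toks, toks[1:]) if bg in BAD_BIGRAMS)

def neighborhood(sentence, swaps):
    """Perturbed variants produced by single-word substitutions."""
    toks = sentence.split()
    for i, tok in enumerate(toks):
        for alt in swaps.get(tok.lower(), []):
            yield " ".join(toks[:i] + [alt] + toks[i + 1:])

def lm_critic(sentence, swaps):
    """Grammatical iff no neighbor scores strictly higher than the sentence."""
    return all(score(n) <= score(sentence) for n in neighborhood(sentence, swaps))

swaps = {"give": ["gives"], "gives": ["give"]}
print(lm_critic("He give a book", swaps))   # neighbor "He gives a book" scores higher -> False
print(lm_critic("He gives a book", swaps))  # no neighbor improves -> True
```

BIFI then pairs such critic judgments with a fixer/breaker loop to manufacture realistic training pairs from unlabeled text.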
arXiv Detail & Related papers (2021-09-14T17:06:43Z)
- Probabilistic, Structure-Aware Algorithms for Improved Variety, Accuracy, and Coverage of AMR Alignments [9.74672460306765]
We present algorithms for aligning components of Abstract Meaning Representation (AMR) graphs to spans in English sentences.
We leverage unsupervised learning in combination with graphs, taking the best of both worlds from previous AMR aligners.
Our approach covers a wider variety of AMR substructures than previously considered, achieves higher coverage of nodes and edges, and does so with higher accuracy.
arXiv Detail & Related papers (2021-06-10T18:46:32Z)
- Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models [15.481446439370343]
We use error type tags from automatic annotation tools such as ERRANT to guide synthetic data generation.
We build a new, large synthetic pre-training data set with error tag frequency distributions matching a given development set.
Our approach is particularly effective in adapting a GEC system, trained on mixed native and non-native English, to a native English test set.
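One ingredient of this approach, matching error-tag frequencies to a development set, can be sketched directly: sample a tag for each clean sentence in proportion to its dev-set count. The tag names below are examples of ERRANT-style types; the corruption model that actually realizes each tagged error is out of scope here.

```python
# Hedged sketch: sample an ERRANT-style error tag per sentence so corpus-level
# tag frequencies match a target (development-set) distribution. Tag names and
# counts are illustrative placeholders.

import random

def sample_tags(dev_tag_counts, n, seed=0):
    """Sample n error tags proportionally to their development-set counts."""
    rng = random.Random(seed)
    tags = list(dev_tag_counts)
    weights = [dev_tag_counts[t] for t in tags]
    return rng.choices(tags, weights=weights, k=n)

dev_tag_counts = {"R:VERB:SVA": 50, "M:DET": 30, "R:PREP": 20}
tags = sample_tags(dev_tag_counts, 1000)
freq = {t: tags.count(t) / len(tags) for t in dev_tag_counts}
print(freq)  # frequencies close to 0.5, 0.3, 0.2
```

Each sampled tag would then condition a corruption model that injects that error type into a clean sentence, yielding pre-training pairs whose error profile matches the target domain.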
arXiv Detail & Related papers (2021-05-27T17:17:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.