Error syntax aware augmentation of feedback comment generation dataset
- URL: http://arxiv.org/abs/2212.14293v1
- Date: Thu, 29 Dec 2022 12:57:23 GMT
- Title: Error syntax aware augmentation of feedback comment generation dataset
- Authors: Nikolay Babakov, Maria Lysyuk, Alexander Shvets, Lilya Kazakova,
Alexander Panchenko
- Score: 116.73173348201341
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a solution to the GenChal 2022 shared task dedicated to
feedback comment generation for writing learning. In this task, given a
text with an error and the span of that error, a system generates an explanatory
note that helps the writer (a language learner) improve their writing skills.
Our solution is based on fine-tuning the T5 model on the initial dataset
augmented according to the syntactic dependencies of the words located within
the indicated error span. The solution of our team "nigula" took second place
in the manual evaluation by the organizers.
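The dependency-based augmentation described above hinges on relating words inside the error span to the rest of the sentence through the parse tree. The abstract does not spell out the exact procedure, so the heuristic below — expanding an error span along head and child links in a precomputed head array — is only an illustrative sketch; the function name, the expansion rule, and the toy parse are all assumptions:

```python
def expand_error_span(heads, span):
    """Expand a set of token indices (the error span) to also cover
    each span token's syntactic head and its direct children.
    This is a hypothetical heuristic, not the paper's exact method."""
    expanded = set(span)
    for i in span:
        expanded.add(heads[i])  # pull in the head of each span token
    # pull in direct children of span tokens
    expanded.update(j for j, h in enumerate(heads) if h in span)
    return sorted(expanded)

# Toy head array (heads[i] = index of token i's head; the root points
# to itself) for "He go to school yesterday" — not a real parser output.
tokens = ["He", "go", "to", "school", "yesterday"]
heads = [1, 1, 1, 2, 1]

print(expand_error_span(heads, {1}))  # error span covers "go" → [0, 1, 2, 4]
```

Note that token 3 ("school") stays outside the expanded span because its head is "to", not a span token — the expansion follows only direct dependency links. In practice the head array would come from a dependency parser rather than being written by hand.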
Related papers
- GEE! Grammar Error Explanation with Large Language Models [64.16199533560017]
We propose the task of grammar error explanation, where a system needs to provide one-sentence explanations for each grammatical error in a pair of erroneous and corrected sentences.
We analyze the capability of GPT-4 in grammar error explanation, and find that it only produces explanations for 60.2% of the errors using one-shot prompting.
We develop a two-step pipeline that leverages fine-tuned and prompted large language models to perform structured atomic token edit extraction.
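The atomic token edit extraction that the pipeline above relies on can be illustrated with a simple alignment between the erroneous and corrected sentences. The paper uses fine-tuned and prompted LLMs for this step; the `difflib`-based version below is only a rough stand-in to show what an atomic edit looks like, and the function name is an assumption:

```python
import difflib


def extract_token_edits(src_tokens, tgt_tokens):
    """Extract atomic token-level edits (insert/delete/replace) between
    an erroneous sentence and its correction. A difflib stand-in for the
    paper's learned edit-extraction step, not its actual implementation."""
    matcher = difflib.SequenceMatcher(a=src_tokens, b=tgt_tokens)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":  # keep only spans that actually changed
            edits.append((op, src_tokens[i1:i2], tgt_tokens[j1:j2]))
    return edits


print(extract_token_edits(["He", "go", "to", "school"],
                          ["He", "goes", "to", "school"]))
# → [('replace', ['go'], ['goes'])]
```

Each extracted edit pairs the offending tokens with their replacement, giving the explanation model a structured target ("go" → "goes") rather than two raw sentences.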
arXiv Detail & Related papers (2023-11-16T02:45:47Z)
- Conjunct Resolution in the Face of Verbal Omissions [51.220650412095665]
We propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure.
We curate a large dataset, containing over 10K examples of naturally-occurring verbal omissions with crowd-sourced annotations.
We train various neural baselines for this task, and show that while our best method obtains decent performance, it leaves ample space for improvement.
arXiv Detail & Related papers (2023-05-26T08:44:02Z)
- Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors [80.22825549235556]
Existing approaches cannot simultaneously consider error position and type.
We build an FG-TED model to predict both addition and omission errors.
Experiments show that our model can identify both error type and position concurrently, and gives state-of-the-art results.
arXiv Detail & Related papers (2023-02-17T16:20:33Z)
- Sentence-level Feedback Generation for English Language Learners: Does Data Augmentation Help? [18.30408619963336]
Given a sentence and an error span, the task is to generate a feedback comment explaining the error.
We experiment with LLMs and also create multiple pseudo datasets for the task, investigating how they affect the performance of our system.
We present our results for the task along with extensive analysis of the generated comments with the aim of aiding future studies in feedback comment generation for English language learners.
arXiv Detail & Related papers (2022-12-18T03:53:44Z)
- A Syntax-Guided Grammatical Error Correction Model with Dependency Tree Correction [83.14159143179269]
Grammatical Error Correction (GEC) is a task of detecting and correcting grammatical errors in sentences.
We propose a syntax-guided GEC model (SG-GEC) which adopts the graph attention mechanism to utilize the syntactic knowledge of dependency trees.
We evaluate our model on public GEC benchmarks, and it achieves competitive results.
arXiv Detail & Related papers (2021-11-05T07:07:48Z)
- Learning to Describe Solutions for Bug Reports Based on Developer Discussions [43.427873307255425]
We propose generating a concise natural language description of the solution by synthesizing relevant content within the discussion.
To support generating an informative description during an ongoing discussion, we propose a secondary task of determining when sufficient context about the solution emerges in real-time.
We construct a dataset for these tasks with a novel technique for obtaining noisy supervision from repository changes linked to bug reports.
arXiv Detail & Related papers (2021-10-08T19:39:55Z)
- User-Initiated Repetition-Based Recovery in Multi-Utterance Dialogue Systems [3.20350998499235]
We present a system that allows a user to correct speech recognition errors in a virtual assistant by repeating misunderstood words.
When a user repeats part of the phrase, the system rewrites the original query to incorporate the correction.
We show that rewriting the original query is an effective way to handle repetition-based recovery.
arXiv Detail & Related papers (2021-08-02T23:32:13Z)
- Civil Rephrases Of Toxic Texts With Self-Supervised Transformers [4.615338063719135]
This work focuses on models that can help suggest rephrasings of toxic comments in a more civil manner.
Inspired by recent progress in unpaired sequence-to-sequence tasks, a self-supervised learning model is introduced, called CAE-T5.
arXiv Detail & Related papers (2021-02-01T15:27:52Z)
- The Paradigm Discovery Problem [121.79963594279893]
We formalize the paradigm discovery problem and develop metrics for judging systems.
We report empirical results on five diverse languages.
Our code and data are available for public use.
arXiv Detail & Related papers (2020-05-04T16:38:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.