Neural machine translation for automated feedback on children's
early-stage writing
- URL: http://arxiv.org/abs/2311.09389v1
- Date: Wed, 15 Nov 2023 21:32:44 GMT
- Title: Neural machine translation for automated feedback on children's
early-stage writing
- Authors: Jonas Vestergaard Jensen, Mikkel Jordahn, Michael Riis Andersen
- Abstract summary: We address the problem of assessing and constructing feedback for early-stage writing automatically using machine learning.
We propose to use sequence-to-sequence models for "translating" early-stage writing by students into "conventional" writing.
- Score: 3.0695550123017514
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we address the problem of assessing and constructing feedback
for early-stage writing automatically using machine learning. Early-stage
writing is typically vastly different from conventional writing due to phonetic
spelling and the lack of proper grammar, punctuation, and spacing. Consequently,
early-stage writing is highly non-trivial to analyze using common linguistic
metrics. We propose to use sequence-to-sequence models for "translating"
early-stage writing by students into "conventional" writing, which allows the
translated text to be analyzed using linguistic metrics. Furthermore, we
propose a novel robust likelihood to mitigate the effect of noise in the
dataset. We investigate the proposed methods using a set of numerical
experiments and demonstrate that the conventional text can be predicted with
high accuracy.
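The robust likelihood lends itself to a brief illustration. Below is a minimal sketch, assuming a standard token-level decoder: each target token is modeled as a mixture of the seq2seq prediction and uniform noise over the vocabulary, so occasional mislabeled tokens in the parallel data incur a bounded penalty. The mixture form and the noise rate `eps` are illustrative assumptions, not the paper's published formulation.

```python
# Minimal sketch of a noise-robust token likelihood (an assumption, not the
# paper's exact formulation): mix the decoder's predictive distribution with
# uniform vocabulary noise so that mislabeled tokens in the parallel corpus
# contribute a bounded loss instead of an unbounded cross-entropy term.
import math
import torch
import torch.nn.functional as F

def robust_nll(logits: torch.Tensor, targets: torch.Tensor,
               eps: float = 0.05) -> torch.Tensor:
    """Negative log of (1 - eps) * p_model(y | x) + eps / vocab_size.

    logits:  (batch, seq_len, vocab_size) decoder outputs
    targets: (batch, seq_len) token ids of the "conventional" text
    eps:     assumed label-noise rate (a hyperparameter)
    """
    vocab_size = logits.size(-1)
    log_p = F.log_softmax(logits, dim=-1)
    log_p_target = log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # log of the two-component mixture, computed stably with logaddexp
    mixture = torch.logaddexp(
        log_p_target + math.log(1.0 - eps),
        torch.full_like(log_p_target, math.log(eps / vocab_size)),
    )
    return -mixture.mean()
```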
Related papers
- Historical German Text Normalization Using Type- and Token-Based Language Modeling [0.0]
This report proposes a normalization system for German literary texts from c. 1700-1900, trained on a parallel corpus.
The system combines Transformer language models: an encoder-decoder model normalizes individual word types, and a pre-trained causal language model adjusts these normalizations in context.
An extensive evaluation shows that the system achieves state-of-the-art accuracy, comparable with a much larger, fully end-to-end sentence-based normalization system built by fine-tuning a pre-trained Transformer large language model.
arXiv Detail & Related papers (2024-09-04T16:14:05Z)
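To make the two-stage design concrete, here is a hedged sketch of how type-level candidates and a contextual causal LM could be combined; `propose` and `lm_logprob` are hypothetical stand-ins for the trained components, not the paper's actual API.

```python
# Hedged sketch of the two-stage pipeline described above: a type-level
# encoder-decoder proposes normalization candidates for each historical
# word form, and a causal language model rescores them in sentence context.
# `propose` and `lm_logprob` are hypothetical stand-ins, not the paper's API.
from typing import Callable

def normalize_sentence(
    tokens: list[str],
    propose: Callable[[str], list[str]],       # candidates per word type
    lm_logprob: Callable[[list[str]], float],  # LM score of a token sequence
) -> list[str]:
    out: list[str] = []
    for i, tok in enumerate(tokens):
        candidates = propose(tok) or [tok]
        # pick the candidate the LM finds most plausible, given the already
        # normalized left context and the raw right context
        out.append(max(candidates,
                       key=lambda c: lm_logprob(out + [c] + tokens[i + 1:])))
    return out
```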
- Fine-grained Controllable Text Generation through In-context Learning with Feedback [57.396980277089135]
We present a method for rewriting an input sentence to match specific values of nontrivial linguistic features, such as dependency depth.
In contrast to earlier work, our method uses in-context learning rather than finetuning, making it applicable in use cases where data is sparse.
arXiv Detail & Related papers (2024-06-17T08:55:48Z)
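As a rough illustration of the in-context setup, the sketch below assembles a few-shot prompt pairing sentences with target values of a linguistic feature (dependency depth, as in the summary above); the prompt wording and exemplar format are invented, and the actual LLM call is left abstract.

```python
# Illustrative few-shot prompt construction for feature-controlled rewriting.
# The wording and exemplar format are invented for this sketch; the paper's
# actual prompts may differ.
def build_rewrite_prompt(sentence: str, target_depth: int,
                         exemplars: list[tuple[str, int, str]]) -> str:
    parts = ["Rewrite each sentence so that its dependency depth matches "
             "the target value."]
    for src, depth, rewritten in exemplars:
        parts.append(f"Sentence: {src}\nTarget depth: {depth}\nRewrite: {rewritten}")
    parts.append(f"Sentence: {sentence}\nTarget depth: {target_depth}\nRewrite:")
    return "\n\n".join(parts)
```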
- Take the Hint: Improving Arabic Diacritization with Partially-Diacritized Text [4.863310073296471]
We propose 2SDiac, a multi-source model that can effectively support optional diacritics in the input to inform all predictions.
We also introduce Guided Learning, a training scheme that leverages given diacritics in the input under different levels of random masking.
arXiv Detail & Related papers (2023-06-06T10:18:17Z)
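A minimal sketch of what random masking of given diacritics during training might look like; the placeholder token and the per-batch sampling of the masking level are assumptions, not the paper's exact recipe.

```python
# Hedged sketch of Guided Learning-style masking: keep each given diacritic
# in the input with some probability, so the model learns to exploit partial
# diacritization at inference time. Representations are simplified.
import random

def mask_diacritics(diacritics: list[str], keep_prob: float) -> list[str]:
    """Randomly replace input diacritic hints with a placeholder token."""
    return [d if random.random() < keep_prob else "<NO_HINT>" for d in diacritics]

# sample a masking level per batch, from no hints (0.0) to fully
# diacritized input (1.0)
keep_prob = random.random()
hints = mask_diacritics(["FATHA", "", "KASRA", "SHADDA+DAMMA"], keep_prob)
```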
- Discontinuous Grammar as a Foreign Language [0.7412445894287709]
We extend the framework of sequence-to-sequence models for constituent parsing.
We design several linearizations that can fully produce discontinuities.
For the first time, we test a sequence-to-sequence model on the main discontinuous benchmarks.
arXiv Detail & Related papers (2021-10-20T08:58:02Z)
- Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence [59.51720326054546]
We propose a long text generation model that represents the prefix sentences at both the sentence level and the discourse level during decoding.
Our model can generate more coherent texts than state-of-the-art baselines.
arXiv Detail & Related papers (2021-05-19T07:29:08Z)
- Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, the task is to decide whether there are any semantic discrepancies in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z)
- Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale [52.663117551150954]
A few popular metrics remain the de facto standard for evaluating tasks such as image captioning and machine translation.
This is partly due to ease of use, and partly because researchers expect to see them and know how to interpret them.
In this paper, we urge the community to more carefully consider how they automatically evaluate their models.
arXiv Detail & Related papers (2020-10-26T13:57:20Z)
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
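To give a flavor of the OT matching, the sketch below computes an entropic-regularized Sinkhorn distance between token embeddings of the teacher-forced (training-mode) and free-running (testing-mode) sequences; the generic solver and cosine cost are standard choices assumed here, not the authors' exact estimator.

```python
# Hedged sketch: entropic-regularized optimal transport between embeddings
# of the teacher-forced sequence x and the model-generated sequence y. This
# is a generic Sinkhorn solver, not the authors' exact estimator.
import torch

def sinkhorn_ot(x: torch.Tensor, y: torch.Tensor,
                eps: float = 0.1, iters: int = 50) -> torch.Tensor:
    """OT cost between token embeddings x: (n, d) and y: (m, d)."""
    cost = 1.0 - torch.nn.functional.cosine_similarity(
        x.unsqueeze(1), y.unsqueeze(0), dim=-1)      # (n, m) cosine cost
    k = torch.exp(-cost / eps)                       # Gibbs kernel
    a = torch.full((x.size(0),), 1.0 / x.size(0))    # uniform source weights
    b = torch.full((y.size(0),), 1.0 / y.size(0))    # uniform target weights
    u = torch.ones_like(a)
    for _ in range(iters):                           # Sinkhorn iterations
        v = b / (k.t() @ u)
        u = a / (k @ v)
    plan = u.unsqueeze(1) * k * v.unsqueeze(0)       # transport plan
    return (plan * cost).sum()                       # total transport cost
```

In training, a term like this would be added to the usual maximum-likelihood loss to pull the two decoding modes together.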
- Cross-Thought for Sentence Encoder Pre-training [89.32270059777025]
Cross-Thought is a novel approach to pre-training a sequence encoder.
We train a Transformer-based sequence encoder over a large set of short sequences.
Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders.
arXiv Detail & Related papers (2020-10-07T21:02:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.