GEC-DePenD: Non-Autoregressive Grammatical Error Correction with
  Decoupled Permutation and Decoding
- URL: http://arxiv.org/abs/2311.08191v1
- Date: Tue, 14 Nov 2023 14:24:36 GMT
- Title: GEC-DePenD: Non-Autoregressive Grammatical Error Correction with
  Decoupled Permutation and Decoding
- Authors: Konstantin Yakovlev, Alexander Podolskiy, Andrey Bout, Sergey
  Nikolenko, Irina Piontkovskaya
- Abstract summary: Grammatical error correction (GEC) is an important NLP task that is usually solved with autoregressive sequence-to-sequence models.
We propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network and a decoder network.
We show that the resulting network improves over previously known non-autoregressive methods for GEC.
- Score: 52.14832976759585
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Grammatical error correction (GEC) is an important NLP task that is
usually solved with autoregressive sequence-to-sequence models. However,
approaches of this class are inherently slow due to one-by-one token
generation, so non-autoregressive alternatives are needed. In this work, we
propose a novel non-autoregressive approach to GEC that decouples the
architecture into two parts: a permutation network, which outputs a
self-attention weight matrix used in beam search to find the best permutation
of input tokens (with auxiliary {ins} tokens), and a decoder network based on a
step-unrolled denoising autoencoder, which fills in specific tokens. This allows
us to find the token permutation after only one forward pass of the permutation
network, avoiding autoregressive constructions. We show that the resulting
network improves over previously known non-autoregressive methods for GEC and
reaches the level of autoregressive methods that do not use language-specific
synthetic data generation methods. Our results are supported by a comprehensive
experimental validation on the CoNLL-2014 and Write&Improve+LOCNESS datasets
and an extensive ablation study that supports our architectural and algorithmic
choices.
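
To make the decoupled design above more concrete, here is a minimal, self-contained PyTorch sketch (not the authors' implementation): a small permutation network encodes the input once and produces a pointer-style score matrix, a beam search over that matrix picks an ordering of the input positions (one of which stands in for an auxiliary {ins} slot), and a separate decoder such as a SUNDAE-style denoiser would then fill in the inserted tokens. The module sizes, the scoring scheme, the choice of start position, and the beam-search details are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of the decoupled idea:
# the permutation network is run once, beam search over its score matrix
# picks an ordering of the input positions, and a separate decoder would
# then fill in the auxiliary {ins} slots of the permuted sequence.
import torch
import torch.nn as nn

class PermutationNetwork(nn.Module):
    """Encodes the source once and returns a pointer-style score matrix
    scores[i, j] = how good it is to place token j right after token i."""
    def __init__(self, vocab_size, d_model=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2)
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)

    def forward(self, tokens):                           # tokens: (1, n)
        h = self.encoder(self.emb(tokens))               # (1, n, d_model)
        scores = self.q(h) @ self.k(h).transpose(1, 2)   # (1, n, n)
        return scores.squeeze(0)                         # (n, n)

def beam_search_permutation(scores, beam_size=4):
    """Searches for a high-scoring visiting order of positions 0..n-1.
    Only this cheap search is sequential; the network itself is never
    called again. Starting from position 0 is an assumption."""
    n = scores.size(0)
    beams = [([0], 0.0)]                                 # (visited order, total score)
    for _ in range(n - 1):
        candidates = []
        for order, total in beams:
            last = order[-1]
            for j in range(n):
                if j not in order:
                    candidates.append((order + [j], total + scores[last, j].item()))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0][0]

# Toy usage: reorder a 5-token input; token id 9 stands in for an {ins} slot
# whose content a SUNDAE-style decoder network would fill in afterwards.
net = PermutationNetwork(vocab_size=100)
tokens = torch.tensor([[2, 7, 7, 3, 9]])
order = beam_search_permutation(net(tokens))
print(order)                                             # e.g. [0, 3, 1, 4, 2]
```

The point the sketch tries to convey is that the expensive model runs exactly once per sentence; the only sequential work left is a lightweight search over a precomputed score matrix.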
 
      
        Related papers
        - SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling   with Backtracking [60.109453252858806]
 A maximum-likelihood estimation (MLE) objective does not match a downstream use case of autoregressively generating high-quality sequences.
We formulate sequence generation as an imitation learning (IL) problem.
This allows us to minimize a variety of divergences between the distribution of sequences generated by an autoregressive model and sequences from a dataset.
Our resulting method, SequenceMatch, can be implemented without adversarial training or architectural changes.
 arXiv  Detail & Related papers  (2023-06-08T17:59:58Z)
- Permutation-Invariant Set Autoencoders with Fixed-Size Embeddings for
  Multi-Agent Learning [7.22614468437919]
 We introduce a Permutation-Invariant Set Autoencoder (PISA).
PISA produces encodings with significantly lower reconstruction error than existing baselines.
We demonstrate its usefulness in a multi-agent application.
 arXiv  Detail & Related papers  (2023-02-24T18:59:13Z)
- Autoregressive Search Engines: Generating Substrings as Document
  Identifiers [53.0729058170278]
 Autoregressive language models are emerging as the de facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that doesn't force any structure in the search space: using all n-grams in a passage as its possible identifiers.
 arXiv  Detail & Related papers  (2022-04-22T10:45:01Z)
- Step-unrolled Denoising Autoencoders for Text Generation [17.015573262373742]
 We propose a new generative model of text, the Step-unrolled Denoising Autoencoder (SUNDAE).
SUNDAE is repeatedly applied to a sequence of tokens, starting from random inputs and improving them each time until convergence (a minimal sketch of this loop follows this entry).
We present a simple new improvement operator that converges in fewer iterations than diffusion methods.
 arXiv  Detail & Related papers  (2021-12-13T16:00:33Z)
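
As a rough illustration of the SUNDAE loop summarized in the entry above, here is a toy PyTorch sketch: random tokens are repeatedly passed through the same denoiser until the sequence stops changing. The tiny Transformer denoiser, the argmax update rule, and the convergence test are assumptions made for brevity, not the paper's setup.

```python
# Hedged sketch of a SUNDAE-style improvement loop: start from random tokens
# and repeatedly apply the same denoiser until the sequence stops changing.
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    """One improvement step: predicts a (hopefully cleaner) token per position."""
    def __init__(self, vocab_size, d_model=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d_model)
        self.body = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                        # (batch, length)
        return self.out(self.body(self.emb(tokens)))  # (batch, length, vocab)

@torch.no_grad()
def sample(denoiser, vocab_size, length=8, max_steps=10):
    x = torch.randint(vocab_size, (1, length))        # start from random tokens
    for _ in range(max_steps):
        x_new = denoiser(x).argmax(dim=-1)            # apply the improvement operator
        if torch.equal(x_new, x):                     # stop once the sequence converged
            break
        x = x_new
    return x

print(sample(Denoiser(vocab_size=50), vocab_size=50))
```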
- Discovering Non-monotonic Autoregressive Orderings with Variational
  Inference [67.27561153666211]
 We develop an unsupervised parallelizable learner that discovers high-quality generation orders purely from training data.
We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass.
 Empirical results in language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
 arXiv  Detail & Related papers  (2021-10-27T16:08:09Z)
- Highly Parallel Autoregressive Entity Linking with Discriminative
  Correction [51.947280241185]
 We propose a very efficient approach that parallelizes autoregressive linking across all potential mentions.
Our model is >70 times faster and more accurate than the previous generative method.
 arXiv  Detail & Related papers  (2021-09-08T17:28:26Z)
- Don't Take It Literally: An Edit-Invariant Sequence Loss for Text
  Generation [109.46348908829697]
 We propose a novel Edit-Invariant Sequence Loss (EISL), which computes the matching loss of a target n-gram against all n-grams in the generated sequence (a toy illustration of this idea follows this entry).
We conduct experiments on three tasks: machine translation with noisy target sequences, unsupervised text style transfer, and non-autoregressive machine translation.
 arXiv  Detail & Related papers  (2021-06-29T03:59:21Z)
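
As a toy, discrete-token illustration of the n-gram matching idea behind EISL, the short Python sketch below credits each target n-gram against every n-gram in the generated sequence, so a correct phrase at a shifted position still scores well. The actual loss is differentiable over token probabilities; this counting version only mirrors the intuition and is not the paper's formulation.

```python
# Toy illustration of the EISL intuition: count how many target n-grams
# appear anywhere in the generated sequence, instead of demanding a
# position-by-position match.
from collections import Counter

def ngram_match_score(generated, target, n=2):
    gen = Counter(tuple(generated[i:i + n]) for i in range(len(generated) - n + 1))
    tgt = Counter(tuple(target[i:i + n]) for i in range(len(target) - n + 1))
    overlap = sum((gen & tgt).values())               # clipped n-gram overlap
    return overlap / max(1, sum(tgt.values()))

# The generated sentence differs only in the first token, yet most target
# bigrams are still matched (score 0.8 here).
print(ngram_match_score("the cat sat on the mat".split(),
                        "a cat sat on the mat".split()))
```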
- SparseGAN: Sparse Generative Adversarial Network for Text Generation [8.634962333084724]
 We propose SparseGAN, which generates semantically interpretable but sparse sentence representations as inputs to the discriminator.
With such semantic-rich representations, we not only reduce unnecessary noise for efficient adversarial training, but also make the entire training process fully differentiable.
 arXiv  Detail & Related papers  (2021-03-22T04:44:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     