Felix: Flexible Text Editing Through Tagging and Insertion
- URL: http://arxiv.org/abs/2003.10687v1
- Date: Tue, 24 Mar 2020 07:01:09 GMT
- Title: Felix: Flexible Text Editing Through Tagging and Insertion
- Authors: Jonathan Mallinson, Aliaksei Severyn, Eric Malmi, Guillermo Garrido
- Abstract summary: Felix is a flexible text-editing approach for generation, designed to derive the maximum benefit from the ideas of decoding with bi-directional contexts and self-supervised pre-training.
We achieve this by decomposing the text-editing task into two sub-tasks: tagging, to decide on the subset of input tokens and their order in the output text, and insertion, to in-fill the output tokens that are not present in the input.
- Score: 21.55417495142206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Felix, a flexible text-editing approach for generation,
designed to derive the maximum benefit from the ideas of decoding with
bi-directional contexts and self-supervised pre-training. In contrast to
conventional sequence-to-sequence (seq2seq) models, Felix is efficient in
low-resource settings and fast at inference time, while being capable of
modeling flexible input-output transformations. We achieve this by decomposing
the text-editing task into two sub-tasks: tagging, to decide on the subset of
input tokens and their order in the output text, and insertion, to in-fill the
output tokens that are not present in the input. The tagging model
employs a novel Pointer mechanism, while the insertion model is based on a
Masked Language Model. Both of these models are chosen to be non-autoregressive
to guarantee faster inference. Felix performs favourably when compared to
recent text-editing methods and strong seq2seq baselines when evaluated on four
NLG tasks: Sentence Fusion, Machine Translation Automatic Post-Editing,
Summarization, and Text Simplification.
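The sketch below illustrates the tag-then-insert decomposition described in the abstract on a toy sentence-fusion example. It is not the authors' implementation: the KEEP/DELETE tags with an appended mask count, the pointer-based reordering, and the mask in-filling are simplified, hypothetical stand-ins for the learned non-autoregressive tagger and the masked-language-model inserter.

```python
# Minimal sketch of a Felix-style tag-then-insert edit (illustrative only;
# in the paper both steps are learned, non-autoregressive Transformer models).

from typing import List

MASK = "[MASK]"

def apply_tags(tokens: List[str], tags: List[str], pointer_order: List[int]) -> List[str]:
    """Tagging stage: build the intermediate sequence from editing decisions.

    tags[i] is "DELETE", "KEEP", or "KEEP|k" (keep the token and open k mask
    slots after it); pointer_order lists source positions in their output
    order, playing the reordering role of the Pointer mechanism.
    """
    out = []
    for idx in pointer_order:
        tag = tags[idx]
        if tag == "DELETE":
            continue
        out.append(tokens[idx])
        if "|" in tag:                       # e.g. "KEEP|2" opens two mask slots
            out.extend([MASK] * int(tag.split("|")[1]))
    return out

def fill_masks(tokens: List[str], predictions: List[str]) -> List[str]:
    """Insertion stage: in-fill mask slots. A real system queries a masked
    language model; `predictions` stands in for its (parallel) outputs."""
    filled, preds = [], iter(predictions)
    for tok in tokens:
        filled.append(next(preds) if tok == MASK else tok)
    return filled

# Toy sentence fusion: "Felix is fast . It is flexible ." -> one sentence.
source = ["Felix", "is", "fast", ".", "It", "is", "flexible", "."]
tags = ["KEEP", "KEEP", "KEEP|1", "DELETE", "DELETE", "DELETE", "KEEP", "KEEP"]
order = list(range(len(source)))             # identity order: no reordering here

draft = apply_tags(source, tags, order)      # ['Felix', 'is', 'fast', '[MASK]', 'flexible', '.']
print(fill_masks(draft, ["and"]))            # ['Felix', 'is', 'fast', 'and', 'flexible', '.']
```

Because the tag, pointer, and mask predictions can each be made in a single pass, neither stage has to generate tokens left-to-right, which is the source of the faster inference claimed in the abstract.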
Related papers
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- Text Editing as Imitation Game [33.418628166176234]
We reformulate text editing as an imitation game using behavioral cloning.
We introduce a dual-decoder structure that parallelizes decoding while retaining the dependencies between action tokens.
Our model consistently outperforms the autoregressive baselines in terms of performance, efficiency, and robustness.
arXiv Detail & Related papers (2022-10-21T22:07:04Z)
- Collocation2Text: Controllable Text Generation from Guide Phrases in Russian [0.0]
Collocation2Text is a plug-and-play method for automatic controllable text generation in Russian.
The method is based on two interacting models: the autoregressive language ruGPT-3 model and the autoencoding language ruRoBERTa model.
Experiments on generating news articles with the proposed method showed its effectiveness at automatically producing fluent texts.
arXiv Detail & Related papers (2022-06-18T17:10:08Z)
- Text Generation with Text-Editing Models [78.03750739936956]
This tutorial provides a comprehensive overview of text-editing models and current state-of-the-art approaches.
We discuss challenges related to productionization and how these models can be used to mitigate hallucination and bias.
arXiv Detail & Related papers (2022-06-14T17:58:17Z)
- Text Revision by On-the-Fly Representation Optimization [76.11035270753757]
Current state-of-the-art methods formulate these tasks as sequence-to-sequence learning problems.
We present an iterative in-place editing approach for text revision, which requires no parallel data.
It achieves performance competitive with, and in some cases better than, state-of-the-art supervised methods on text simplification.
arXiv Detail & Related papers (2022-04-15T07:38:08Z)
- Data-to-Text Generation with Iterative Text Editing [3.42658286826597]
We present a novel approach to data-to-text generation based on iterative text editing.
We first transform data items to text using trivial templates, and then iteratively improve the resulting text with a neural model trained for the sentence fusion task.
The output of the model is filtered by a simple heuristic and reranked with an off-the-shelf pre-trained language model.
arXiv Detail & Related papers (2020-11-03T13:32:38Z)
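As a rough illustration of the iterative editing pipeline in the entry above, the toy sketch below templates data triples into sentences, repeatedly fuses adjacent sentences, and keeps a fusion only if a scoring function prefers it. The `fuse` and `lm_score` functions are hypothetical stubs standing in for the trained sentence-fusion model and the off-the-shelf pre-trained language model used for reranking.

```python
# Illustrative data-to-text sketch: template, then iteratively edit.
# `fuse` and `lm_score` are toy stand-ins for the neural sentence-fusion
# model and the pre-trained LM reranker described in the entry above.

from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

def template(triple: Triple) -> str:
    """Step 1: verbalise each data item with a trivial template."""
    subj, pred, obj = triple
    return f"{subj} {pred} {obj}."

def fuse(a: str, b: str) -> str:
    """Toy fusion: join with 'and', dropping a repeated subject."""
    b_words = b.split()
    if b_words and b_words[0] == a.split()[0]:
        b_words = b_words[1:]
    return a.rstrip(".") + " and " + " ".join(b_words)

def lm_score(text: str) -> float:
    """Toy reranking score: prefer shorter text (a real LM would score fluency)."""
    return -len(text.split())

def generate(triples: List[Triple], max_rounds: int = 5) -> List[str]:
    sents = [template(t) for t in triples]
    for _ in range(max_rounds):
        if len(sents) < 2:
            break
        candidate = fuse(sents[0], sents[1])
        # Filter/rerank: accept the fused sentence only if it scores at least
        # as well as leaving the two sentences unfused.
        if lm_score(candidate) >= lm_score(sents[0] + " " + sents[1]):
            sents = [candidate] + sents[2:]
        else:
            break
    return sents

print(generate([("Felix", "is developed by", "Google"),
                ("Felix", "edits", "text")]))
# ['Felix is developed by Google and edits text.']
```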
- Text Editing by Command [82.50904226312451]
A prevailing paradigm in neural text generation is one-shot generation, where text is produced in a single step.
We address this limitation with an interactive text generation setting in which the user interacts with the system by issuing commands to edit existing text.
We show that our Interactive Editor, a transformer-based model trained on a dataset of editing commands, outperforms baselines and obtains positive results in both automatic and human evaluations.
arXiv Detail & Related papers (2020-10-24T08:00:30Z)
- Cascaded Text Generation with Markov Transformers [122.76100449018061]
Two dominant approaches to neural text generation are fully autoregressive models, using serial beam search decoding, and non-autoregressive models, using parallel decoding with no output dependencies.
This work proposes an autoregressive model with sub-linear parallel time generation. Noting that conditional random fields with bounded context can be decoded in parallel, we propose an efficient cascaded decoding approach for generating high-quality output.
This approach requires only a small modification from standard autoregressive training, while showing competitive accuracy/speed tradeoff compared to existing methods on five machine translation datasets.
arXiv Detail & Related papers (2020-06-01T17:52:15Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
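To make the progressive insertion idea in the POINTER entry above concrete, here is a toy sketch: generation starts from a few constraint keywords, and each round fills the gaps between adjacent tokens in parallel, stopping once no gap receives a new token. The `propose_token` lookup is a hypothetical stand-in for the learned insertion model.

```python
# Toy sketch of POINTER-style progressive, coarse-to-fine insertion.
# `propose_token` is a hard-coded stand-in for the learned model, which
# would predict a token (or nothing) for every gap in parallel.

from typing import List, Optional

def propose_token(left: str, right: str) -> Optional[str]:
    """Placeholder insertion predictor; None means leave the gap empty."""
    proposals = {
        ("Felix", "text"): "edits",
        ("edits", "text"): "input",
        ("text", "quickly"): "very",
    }
    return proposals.get((left, right))

def progressive_generate(keywords: List[str], max_rounds: int = 4) -> List[str]:
    tokens = list(keywords)                  # coarsest stage: constraints only
    for _ in range(max_rounds):
        expanded, inserted = [tokens[0]], False
        for left, right in zip(tokens, tokens[1:]):
            filler = propose_token(left, right)
            if filler is not None:           # every gap is considered each round
                expanded.append(filler)
                inserted = True
            expanded.append(right)
        tokens = expanded
        if not inserted:                     # converged: no gap was filled
            break
    return tokens

print(progressive_generate(["Felix", "text", "quickly"]))
# Round 1: Felix edits text very quickly
# Round 2: Felix edits input text very quickly (then no further insertions)
```

Each round refines the previous one, which is the coarse-to-fine hierarchy the entry refers to.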