Generative Pre-training for Paraphrase Generation by Representing and
Predicting Spans in Exemplars
- URL: http://arxiv.org/abs/2011.14344v1
- Date: Sun, 29 Nov 2020 11:36:13 GMT
- Title: Generative Pre-training for Paraphrase Generation by Representing and
Predicting Spans in Exemplars
- Authors: Tien-Cuong Bui, Van-Duc Le, Hai-Thien To and Sang Kyun Cha
- Abstract summary: This paper presents a novel approach to paraphrasing sentences, extended from the GPT-2 model.
We develop a template masking technique, named first-order masking, to mask out irrelevant words in exemplars using POS taggers.
Our proposed approach outperforms competitive baselines, especially in the semantic preservation aspect.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Paraphrase generation is a long-standing problem and serves an essential role
in many natural language processing problems. Despite some encouraging results,
recent methods either tend to favor generic utterances or
need to retrain the model from scratch for each new dataset. This paper
presents a novel approach to paraphrasing sentences, extended from the GPT-2
model. We develop a template masking technique, named first-order masking, to
mask out irrelevant words in exemplars using POS taggers, so that the
paraphrasing task becomes predicting spans in masked templates. Our
proposed approach outperforms competitive baselines, especially in the semantic
preservation aspect. To prevent the model from being biased towards a given
template, we introduce a technique, referred to as second-order masking, which
utilizes Bernoulli distribution to control the visibility of the
first-order-masked template's tokens. Moreover, this technique allows the model
to provide various paraphrased sentences in testing by adjusting the
second-order-masking level. For scale-up objectives, we compare the performance
of two alternative template-selection methods and find that they are
equivalent in preserving semantic information.
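The abstract does not include code, but the two masking steps can be illustrated with a minimal sketch. The sketch below assumes Penn-Treebank-style POS tags are supplied alongside each exemplar token and assumes a particular set of function-word tags is kept as the template skeleton; the paper's actual tag set, masking probability, and tokenization are not specified here, so every constant is a placeholder rather than the authors' implementation.
```python
import random

MASK = "[MASK]"

# Assumed set of POS tags kept as the template skeleton (function words and
# punctuation); the paper's actual choice of "irrelevant" words may differ.
KEEP_TAGS = {"IN", "DT", "CC", "TO", "WDT", "WP", "WRB", "MD", ",", "."}


def first_order_mask(tagged_exemplar):
    """First-order masking: hide content words in the exemplar so the
    paraphrasing task becomes predicting spans inside the masked template.

    tagged_exemplar: list of (token, pos_tag) pairs from any POS tagger.
    """
    return [tok if tag in KEEP_TAGS else MASK for tok, tag in tagged_exemplar]


def second_order_mask(template, p=0.3, rng=random):
    """Second-order masking: hide each still-visible template token with
    Bernoulli probability p, so the model is not biased toward copying the
    template; raising p at test time yields more varied paraphrases."""
    return [MASK if tok != MASK and rng.random() < p else tok for tok in template]


if __name__ == "__main__":
    # Hypothetical exemplar with Penn Treebank tags.
    exemplar = [("what", "WP"), ("is", "VBZ"), ("the", "DT"), ("fastest", "JJS"),
                ("way", "NN"), ("to", "TO"), ("learn", "VB"), ("python", "NNP"),
                ("?", ".")]
    template = first_order_mask(exemplar)
    print(" ".join(template))                    # what [MASK] the [MASK] [MASK] to [MASK] [MASK] ?
    print(" ".join(second_order_mask(template))) # some of the remaining tokens masked at random
```
Under this framing, a GPT-2-style model would be conditioned on the source sentence together with the doubly masked template and trained to fill in the masked spans.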
Related papers
- Translate First Reorder Later: Leveraging Monotonicity in Semantic
Parsing [4.396860522241306]
TPol is a two-step approach that translates input sentences monotonically and then reorders them to obtain the correct output.
We test our approach on two popular semantic parsing datasets.
arXiv Detail & Related papers (2022-10-10T17:50:42Z)
- Retrieve-and-Fill for Scenario-based Task-Oriented Semantic Parsing [110.4684789199555]
We introduce scenario-based semantic parsing: a variant of the original task which first requires disambiguating an utterance's "scenario".
This formulation enables us to isolate coarse-grained and fine-grained aspects of the task, each of which we solve with off-the-shelf neural modules.
Our model is modular, differentiable, interpretable, and allows us to garner extra supervision from scenarios.
arXiv Detail & Related papers (2022-02-02T08:00:21Z)
- Paraphrase Generation as Unsupervised Machine Translation [30.99150547499427]
We propose a new paradigm for paraphrase generation by treating the task as unsupervised machine translation (UMT).
The proposed paradigm first splits a large unlabeled corpus into multiple clusters, and trains multiple UMT models using pairs of these clusters.
Then based on the paraphrase pairs produced by these UMT models, a unified surrogate model can be trained to serve as the final Seq2Seq model to generate paraphrases.
arXiv Detail & Related papers (2021-09-07T09:08:58Z)
- Mask-Align: Self-Supervised Neural Word Alignment [47.016975106231875]
Mask-Align is a self-supervised model specifically designed for the word alignment task.
Our model masks and predicts each target token in parallel, and extracts high-quality alignments without any supervised loss.
arXiv Detail & Related papers (2020-12-13T21:44:29Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation [63.195935452646815]
We propose a method to automatically generate domain- and task-adaptive maskings of the given text for self-supervised pre-training.
We present a novel reinforcement learning-based framework which learns the masking policy.
We validate our Neural Mask Generator (NMG) on several question answering and text classification datasets.
arXiv Detail & Related papers (2020-10-06T13:27:01Z)
- Masking as an Efficient Alternative to Finetuning for Pretrained Language Models [49.64561153284428]
We learn selective binary masks for pretrained weights in lieu of modifying them through finetuning.
In intrinsic evaluations, we show that representations computed by masked language models encode information necessary for solving downstream tasks.
arXiv Detail & Related papers (2020-04-26T15:03:47Z)
- UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training [152.63467944568094]
We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks.
Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks.
arXiv Detail & Related papers (2020-02-28T15:28:49Z)
- Semi-Autoregressive Training Improves Mask-Predict Decoding [119.8412758943192]
We introduce a new training method for conditional masked language models, SMART, which mimics the semi-autoregressive behavior of mask-predict.
Models trained with SMART produce higher-quality translations when using mask-predict decoding, effectively closing the remaining performance gap with fully autoregressive models.
arXiv Detail & Related papers (2020-01-23T19:56:35Z)