Neural Syntactic Preordering for Controlled Paraphrase Generation
- URL: http://arxiv.org/abs/2005.02013v1
- Date: Tue, 5 May 2020 09:02:25 GMT
- Title: Neural Syntactic Preordering for Controlled Paraphrase Generation
- Authors: Tanya Goyal and Greg Durrett
- Abstract summary: Our work uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model.
First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model.
Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order.
- Score: 57.5316011554622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Paraphrasing natural language sentences is a multifaceted process: it might
involve replacing individual words or short phrases, local rearrangement of
content, or high-level restructuring like topicalization or passivization. Past
approaches struggle to cover this space of paraphrase possibilities in an
interpretable manner. Our work, inspired by pre-ordering literature in machine
translation, uses syntactic transformations to softly "reorder" the source
sentence and guide our neural paraphrasing model. First, given an input
sentence, we derive a set of feasible syntactic rearrangements using an
encoder-decoder model. This model operates over a partially lexical, partially
syntactic view of the sentence and can reorder big chunks. Next, we use each
proposed rearrangement to produce a sequence of position embeddings, which
encourages our final encoder-decoder paraphrase model to attend to the source
words in a particular order. Our evaluation, both automatic and human, shows
that the proposed system retains the quality of the baseline approaches while
giving a substantial increase in the diversity of the generated paraphrases.
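
As an illustration of the second step, here is a minimal sketch, assuming a PyTorch-style Transformer encoder, of how position embeddings derived from a proposed reordering could be added to the source word embeddings so that attention is encouraged to follow the rearranged order. The helper `reorder_position_ids` and the module `GuidedEncoderInput` are hypothetical names introduced only for illustration; this is not the authors' released implementation.

```python
import torch
import torch.nn as nn

def reorder_position_ids(permutation):
    # permutation[k] = index of the source token that should appear k-th
    # in the proposed rearrangement; the returned tensor gives, for each
    # source token (in original order), its position after reordering.
    pos = [0] * len(permutation)
    for target_slot, source_index in enumerate(permutation):
        pos[source_index] = target_slot
    return torch.tensor(pos)

class GuidedEncoderInput(nn.Module):
    """Word embeddings plus position embeddings taken from the proposed
    reordering rather than from the original left-to-right positions."""
    def __init__(self, vocab_size, d_model, max_len=512):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)

    def forward(self, token_ids, permutation):
        # token_ids: LongTensor of shape (seq_len,) in the original order
        pos_ids = reorder_position_ids(permutation).to(token_ids.device)
        return self.word_emb(token_ids) + self.pos_emb(pos_ids)

# Example: "the cat sat" with the proposed rearrangement "sat the cat"
embedder = GuidedEncoderInput(vocab_size=100, d_model=16)
token_ids = torch.tensor([7, 12, 25])          # hypothetical token ids
guided = embedder(token_ids, permutation=[2, 0, 1])
print(guided.shape)  # torch.Size([3, 16])
```

In the example, the permutation [2, 0, 1] treats the third source token as if it came first, nudging the paraphrase model to attend to the source words in that rearranged order.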
Related papers
- Neural paraphrasing by automatically crawled and aligned sentence pairs [11.95795974003684]
The main obstacle to neural-network-based paraphrasing is the lack of large datasets with aligned pairs of sentences and paraphrases.
We present a method for the automatic generation of large aligned corpora, based on the assumption that news and blog websites describe the same events in different narrative styles.
We propose a similarity search procedure with linguistic constraints that, given a reference sentence, can locate the most similar candidate paraphrases among millions of indexed sentences.
arXiv Detail & Related papers (2024-02-16T10:40:38Z)
- ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation [59.91139600152296]
ParaAMR is a large-scale syntactically diverse paraphrase dataset created by abstract meaning representation back-translation.
We show that ParaAMR can be used to improve three NLP tasks: learning sentence embeddings, syntactically controlled paraphrase generation, and data augmentation for few-shot learning.
arXiv Detail & Related papers (2023-05-26T02:27:33Z)
- Hierarchical Sketch Induction for Paraphrase Generation [79.87892048285819]
We introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings.
We use HRQ-VAE to encode the syntactic form of an input sentence as a path through the hierarchy, allowing us to more easily predict syntactic sketches at test time.
arXiv Detail & Related papers (2022-03-07T15:28:36Z)
- Towards Document-Level Paraphrase Generation with Sentence Rewriting and Reordering [88.08581016329398]
We propose CoRPG (Coherence Relationship guided Paraphrase Generation) for document-level paraphrase generation.
We use a graph GRU to encode the coherence relationship graph and obtain a coherence-aware representation for each sentence.
Our model can generate document-level paraphrases with greater diversity and better semantic preservation.
arXiv Detail & Related papers (2021-09-15T05:53:40Z)
- Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
arXiv Detail & Related papers (2020-10-21T13:44:40Z)
- Cross-Thought for Sentence Encoder Pre-training [89.32270059777025]
Cross-Thought is a novel approach to pre-training a sequence encoder.
We train a Transformer-based sequence encoder over a large set of short sequences.
Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders.
arXiv Detail & Related papers (2020-10-07T21:02:41Z)
- Vector Quantized Contrastive Predictive Coding for Template-based Music Generation [0.0]
We propose a flexible method for generating variations of discrete sequences in which tokens can be grouped into basic units.
We show how these compressed representations can be used to generate variations of a template sequence by using an appropriate attention pattern in the Transformer architecture.
arXiv Detail & Related papers (2020-04-21T15:58:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.