Generating Syntactically Controlled Paraphrases without Using Annotated
Parallel Pairs
- URL: http://arxiv.org/abs/2101.10579v1
- Date: Tue, 26 Jan 2021 06:13:52 GMT
- Title: Generating Syntactically Controlled Paraphrases without Using Annotated
Parallel Pairs
- Authors: Kuan-Hao Huang, Kai-Wei Chang
- Abstract summary: We show that it is possible to generate syntactically diverse paraphrases without the need for annotated paraphrase pairs.
We propose Syntactically controlled Paraphrase Generator (SynPG), an encoder-decoder based model that learns to disentangle the semantics and the syntax of a sentence.
- Score: 37.808235216195484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Paraphrase generation plays an essential role in natural language
processing (NLP) and has many downstream applications. However, training supervised
paraphrase models requires many annotated paraphrase pairs, which are usually
costly to obtain. On the other hand, the paraphrases generated by existing
unsupervised approaches are usually syntactically similar to the source
sentences and are limited in diversity. In this paper, we demonstrate that it
is possible to generate syntactically diverse paraphrases without the need for
annotated paraphrase pairs. We propose Syntactically controlled Paraphrase
Generator (SynPG), an encoder-decoder based model that learns to disentangle
the semantics and the syntax of a sentence from a collection of unannotated
texts. The disentanglement enables SynPG to control the syntax of output
paraphrases by manipulating the embedding in the syntactic space. Extensive
experiments using automatic metrics and human evaluation show that SynPG
performs better syntactic control than unsupervised baselines, while the
quality of the generated paraphrases is competitive. We also demonstrate that
the performance of SynPG is competitive or even better than supervised models
when the amount of unannotated data is large. Finally, we show that the syntactically
controlled paraphrases generated by SynPG can be utilized for data augmentation
to improve the robustness of NLP models.
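The disentanglement idea in the abstract can be illustrated with a toy sketch (this is not the actual SynPG model, which learns continuous semantic and syntactic embeddings; here the "semantic embedding" is an order-free bag of tagged words and the "syntactic embedding" is a POS-like template, both hypothetical stand-ins):

```python
# Toy sketch of semantics/syntax disentanglement for controlled paraphrasing.
# encode() splits a sentence into a semantic part (content words grouped by
# tag, order discarded) and a syntactic part (the tag template).
# decode() realizes the same semantics under a possibly different template,
# i.e. "manipulating the embedding in the syntactic space".
from collections import defaultdict

def encode(sentence, pos_of):
    """Return (semantics, template) for a whitespace-tokenized sentence."""
    words = sentence.split()
    template = [pos_of[w] for w in words]      # syntactic representation
    semantics = defaultdict(list)              # semantic representation
    for w, tag in zip(words, template):
        semantics[tag].append(w)
    return dict(semantics), template

def decode(semantics, template):
    """Realize the semantics under the given syntactic template."""
    pools = {tag: list(ws) for tag, ws in semantics.items()}
    out = []
    for tag in template:
        out.append(pools[tag].pop(0) if pools.get(tag) else f"<{tag}>")
    return " ".join(out)

# Hypothetical mini-lexicon for the demo
pos = {"the": "DET", "chef": "NOUN", "cooked": "VERB", "soup": "NOUN"}

sem, syn = encode("the chef cooked the soup", pos)
# Same semantics, a different target template -> a syntactically
# controlled "paraphrase" of the source sentence.
paraphrase = decode(sem, ["NOUN", "VERB", "DET", "NOUN"])
print(paraphrase)  # chef cooked the soup
```

Decoding with the original template reproduces the source sentence, while swapping in another template changes only the syntax; the real model realizes the same separation with learned neural encoders rather than lookup tables.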
Related papers
- A Quality-based Syntactic Template Retriever for
Syntactically-controlled Paraphrase Generation [67.98367574025797]
Existing syntactically-controlled paraphrase generation models perform promisingly when given human-annotated or well-chosen syntactic templates.
However, the prohibitive cost of manual design makes it infeasible to craft a decent template for every source sentence.
We propose a novel Quality-based Syntactic Template Retriever (QSTR) to retrieve templates based on the quality of the to-be-generated paraphrases.
arXiv Detail & Related papers (2023-10-20T03:55:39Z)
- ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR
Back-Translation [59.91139600152296]
ParaAMR is a large-scale syntactically diverse paraphrase dataset created by abstract meaning representation back-translation.
We show that ParaAMR can be used to improve on three NLP tasks: learning sentence embeddings, syntactically controlled paraphrase generation, and data augmentation for few-shot learning.
arXiv Detail & Related papers (2023-05-26T02:27:33Z)
- Unsupervised Syntactically Controlled Paraphrase Generation with
Abstract Meaning Representations [59.10748929158525]
Abstract Meaning Representations (AMR) can greatly improve the performance of unsupervised syntactically controlled paraphrase generation.
Our proposed model, AMR-enhanced Paraphrase Generator (AMRPG), encodes the AMR graph and the constituency parse of the input sentence into two disentangled semantic and syntactic embeddings.
Experiments show that AMRPG generates more accurate syntactically controlled paraphrases, both quantitatively and qualitatively, compared to the existing unsupervised approaches.
arXiv Detail & Related papers (2022-11-02T04:58:38Z)
- Hierarchical Sketch Induction for Paraphrase Generation [79.87892048285819]
We introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings.
We use HRQ-VAE to encode the syntactic form of an input sentence as a path through the hierarchy, allowing us to more easily predict syntactic sketches at test time.
arXiv Detail & Related papers (2022-03-07T15:28:36Z)
- Syntax-guided Controlled Generation of Paraphrases [3.4129083593356433]
We propose Syntax Guided Controlled Paraphraser (SGCP), an end-to-end framework for syntactic paraphrase generation.
We find that SGCP can generate syntax conforming sentences while not compromising on relevance.
To drive future research, we have made SGCP's source code available.
arXiv Detail & Related papers (2020-05-18T01:31:28Z)
- Neural Syntactic Preordering for Controlled Paraphrase Generation [57.5316011554622]
Our work uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model.
First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model.
Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order.
arXiv Detail & Related papers (2020-05-05T09:02:25Z)
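The preordering step in the last entry above can be sketched in a few lines (a toy illustration, not the paper's implementation: the rearrangement here is hand-picked, whereas the paper derives feasible rearrangements with a learned encoder-decoder model):

```python
# Sketch: turn a proposed syntactic rearrangement (a permutation of
# source positions) into position ids. Feeding these ids to a paraphrase
# model in place of the default 0..n-1 positions softly encourages it to
# attend to the source words in the rearranged order.

def reorder_position_ids(n_words, rearrangement):
    """Map each source index to its position under the rearrangement.

    rearrangement[k] = index of the source word that should come k-th.
    Returns pos_ids, where pos_ids[i] = new position of source word i.
    """
    pos_ids = [0] * n_words
    for new_pos, src_idx in enumerate(rearrangement):
        pos_ids[src_idx] = new_pos
    return pos_ids

words = ["the", "dog", "chased", "the", "cat"]
# Hypothetical rearrangement: bring "the cat" to the front.
rearrangement = [3, 4, 0, 1, 2]
pos_ids = reorder_position_ids(len(words), rearrangement)
print(pos_ids)  # [2, 3, 4, 0, 1]
```

Because only position ids change, the same source tokens are presented to the model; the syntactic guidance is "soft" in the sense that the model may still deviate from the proposed order.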
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.