Quality Controlled Paraphrase Generation
- URL: http://arxiv.org/abs/2203.10940v1
- Date: Mon, 21 Mar 2022 13:09:59 GMT
- Title: Quality Controlled Paraphrase Generation
- Authors: Elron Bandel, Ranit Aharonov, Michal Shmueli-Scheuer, Ilya
Shnayderman, Noam Slonim, Liat Ein-Dor
- Abstract summary: We propose a quality-guided controlled paraphrase generation model.
We show that our method is able to generate paraphrases which maintain the original meaning while achieving higher diversity than the uncontrolled baseline.
- Score: 13.796053459460207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Paraphrase generation has been widely used in various downstream tasks. Most
tasks benefit mainly from high quality paraphrases, namely those that are
semantically similar to, yet linguistically diverse from, the original
sentence. Generating high-quality paraphrases is challenging as it becomes
increasingly hard to preserve meaning as linguistic diversity increases. Recent
works achieve good results by controlling specific aspects of the paraphrase,
such as its syntactic tree. However, they do not allow direct control over the
quality of the generated paraphrase, and they suffer from low flexibility and
scalability. Here we propose QCPG, a quality-guided controlled paraphrase
generation model that allows direct control over the quality dimensions.
Furthermore, we suggest a method that, given a sentence, identifies points in
the quality control space that are expected to yield optimal generated
paraphrases. We show that our method is able to generate paraphrases which
maintain the original meaning while achieving higher diversity than the
uncontrolled baseline. The models, the code, and the data can be found at
https://github.com/IBM/quality-controlled-paraphrase-generation.
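To make the control mechanism concrete, here is a minimal sketch of quality-guided generation in which target values for the semantic, syntactic, and lexical quality dimensions are prepended to the encoder input. The control-token format, the 0-100 quantization, and the use of t5-base as a stand-in checkpoint are illustrative assumptions; the trained QCPG models and the exact input scheme are in the repository linked above.

```python
# A minimal sketch of quality-guided generation via control values prepended
# to the encoder input. The token format, the 0-100 quantization, and the
# t5-base stand-in checkpoint are assumptions for illustration; see the
# linked repository for the paper's actual models and scheme.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL = "t5-base"  # stand-in; QCPG fine-tunes a pretrained seq2seq model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

def paraphrase(sentence: str, semantic: int, syntactic: int, lexical: int) -> str:
    """Condition generation on target quality dimensions by prepending
    quantized control values to the input sentence."""
    control = (f"COND_SEMANTIC_SIM_{semantic} "
               f"COND_SYNTACTIC_DIV_{syntactic} "
               f"COND_LEXICAL_DIV_{lexical}")
    inputs = tokenizer(f"{control} {sentence}", return_tensors="pt")
    output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Ask for high semantic similarity with moderate syntactic/lexical diversity.
print(paraphrase("The weather today is nice.", semantic=90, syntactic=50, lexical=50))
```

Encoding the target quality as plain input tokens keeps the approach flexible: any pretrained seq2seq model can be fine-tuned with them, and the desired quality point can be changed per example at inference time.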
Related papers
- Enforcing Paraphrase Generation via Controllable Latent Diffusion [60.82512050963046]
We propose Latent Diffusion Paraphraser (LDP), a novel paraphrase generation method that models a controllable diffusion process.
Experiments show that LDP achieves improved and diverse paraphrase generation compared to baselines.
arXiv Detail & Related papers (2024-04-13T09:24:32Z)
- ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation [59.91139600152296]
ParaAMR is a large-scale syntactically diverse paraphrase dataset created by Abstract Meaning Representation (AMR) back-translation.
We show that ParaAMR can be used to improve three NLP tasks: learning sentence embeddings, syntactically controlled paraphrase generation, and data augmentation for few-shot learning. A minimal code sketch of the back-translation recipe follows the citation below.
arXiv Detail & Related papers (2023-05-26T02:27:33Z)
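As a rough illustration of AMR back-translation, the sketch below parses a sentence into an AMR graph and regenerates text from the graph alone; since AMR abstracts away surface syntax, the regenerated sentence can differ syntactically while preserving meaning. The use of the open-source amrlib library is an assumption for illustration, not the paper's exact pipeline.

```python
# A minimal sketch of AMR back-translation using the open-source amrlib
# library (an illustrative choice; the paper's pipeline and models may
# differ). amrlib's parse and generate models must be installed separately:
# https://github.com/bjascob/amrlib
import amrlib

stog = amrlib.load_stog_model()  # sentence -> AMR graph
gtos = amrlib.load_gtos_model()  # AMR graph -> sentence

sentence = "The chef who ran the kitchen prepared the meal."
graphs = stog.parse_sents([sentence])  # abstract away surface syntax
candidates, _ = gtos.generate(graphs)  # re-realize text from meaning alone
# The regenerated sentence preserves the AMR semantics but is free to pick
# a different syntactic realization, which is the source of diversity.
print(candidates[0])
```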
- Unsupervised Syntactically Controlled Paraphrase Generation with Abstract Meaning Representations [59.10748929158525]
Abstract Meaning Representations (AMR) can greatly improve the performance of unsupervised syntactically controlled paraphrase generation.
Our proposed model, AMR-enhanced Paraphrase Generator (AMRPG), encodes the AMR graph and the constituency parse of the input sentence into two disentangled semantic and syntactic embeddings.
Experiments show that AMRPG generates more accurate syntactically controlled paraphrases, both quantitatively and qualitatively, compared to existing unsupervised approaches. A schematic sketch of the two-encoder idea follows the citation below.
arXiv Detail & Related papers (2022-11-02T04:58:38Z)
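The following PyTorch sketch shows the disentangling idea schematically: one encoder embeds the (linearized) AMR graph, another embeds the (linearized) constituency parse, and their codes jointly condition a decoder. All module choices and sizes are illustrative assumptions, not AMRPG's actual architecture.

```python
# A schematic sketch of two disentangled encoders (semantics from AMR,
# syntax from the constituency parse) conditioning a decoder. Modules and
# sizes are illustrative assumptions, not AMRPG's actual architecture.
import torch
import torch.nn as nn

class DisentangledParaphraser(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.semantic_enc = nn.GRU(dim, dim, batch_first=True)   # AMR graph
        self.syntactic_enc = nn.GRU(dim, dim, batch_first=True)  # parse tree
        self.decoder = nn.GRU(dim, 2 * dim, batch_first=True)
        self.out = nn.Linear(2 * dim, vocab_size)

    def forward(self, amr_ids, parse_ids, target_ids):
        _, sem = self.semantic_enc(self.embed(amr_ids))     # (1, B, dim)
        _, syn = self.syntactic_enc(self.embed(parse_ids))  # (1, B, dim)
        # Concatenate the two disentangled codes to initialize the decoder.
        h0 = torch.cat([sem, syn], dim=-1)                  # (1, B, 2*dim)
        dec, _ = self.decoder(self.embed(target_ids), h0)
        return self.out(dec)  # per-token vocabulary logits
```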
- Learning to Selectively Learn for Weakly-supervised Paraphrase Generation [81.65399115750054]
We propose a novel approach to generate high-quality paraphrases with weak supervision data.
Specifically, we tackle the weakly-supervised paraphrase generation problem by obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo-paraphrase expansion.
We demonstrate that our approach achieves significant improvements over existing unsupervised approaches, and is even comparable in performance with supervised state-of-the-art methods. A minimal sketch of the retrieval step follows the citation below.
arXiv Detail & Related papers (2021-09-25T23:31:13Z)
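As an illustration of retrieval-based pseudo-paraphrase expansion, the sketch below treats a sentence's nearest neighbors in embedding space as weakly-labeled paraphrase pairs. The sentence-transformers library, the model name, and the similarity threshold are illustrative choices, not the paper's exact setup.

```python
# A minimal sketch of retrieval-based pseudo-paraphrase expansion: a query
# sentence's nearest neighbors in embedding space become weakly-labeled
# parallel pairs. Library, model, and threshold are illustrative choices.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "How can I reset my password?",
    "What is the procedure to change my password?",
    "Where is the nearest train station?",
    "Steps to recover a forgotten password.",
]
query = "I forgot my password, how do I reset it?"

corpus_emb = model.encode(corpus, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Keep only high-similarity neighbors as weakly-labeled paraphrase pairs.
hits = util.semantic_search(query_emb, corpus_emb, top_k=3)[0]
pseudo_pairs = [(query, corpus[h["corpus_id"]]) for h in hits if h["score"] > 0.6]
print(pseudo_pairs)
```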
- Towards Document-Level Paraphrase Generation with Sentence Rewriting and Reordering [88.08581016329398]
We propose CoRPG (Coherence Relationship guided Paraphrase Generation) for document-level paraphrase generation.
We use a graph GRU to encode the coherence relationship graph and obtain a coherence-aware representation for each sentence.
Our model can generate document-level paraphrases with greater diversity and better semantic preservation. A schematic sketch of a graph GRU step follows the citation below.
arXiv Detail & Related papers (2021-09-15T05:53:40Z)
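A generic gated graph-network step of the kind the summary describes might look like the sketch below: sentence representations exchange messages along coherence-graph edges, and a GRU cell fuses each sentence's aggregated context with its previous state. This is an illustrative stand-in, not CoRPG's exact layer.

```python
# A schematic graph GRU step: aggregate neighbor states along coherence
# edges, then fuse them with each sentence's own state via a GRU cell.
# This is a generic gated graph-network step, not CoRPG's exact layer.
import torch
import torch.nn as nn

class GraphGRULayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.cell = nn.GRUCell(dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h:   (num_sentences, dim) per-sentence representations
        # adj: (num_sentences, num_sentences) coherence-relationship graph
        messages = adj @ h            # aggregate neighbors' states
        return self.cell(messages, h)  # coherence-aware update

# Toy usage: 4 sentences, sentence 0 coherently linked to 1 and 2.
h = torch.randn(4, 128)
adj = torch.zeros(4, 4)
adj[0, 1] = adj[0, 2] = adj[1, 0] = adj[2, 0] = 1.0
layer = GraphGRULayer(128)
h_coherent = layer(h, adj)  # same shape, now context-aware
```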
- Pushing Paraphrase Away from Original Sentence: A Multi-Round Paraphrase Generation Approach [97.38622477085188]
We propose BTmPG (Back-Translation guided multi-round Paraphrase Generation) to improve the diversity of generated paraphrases.
We evaluate BTmPG on two benchmark datasets.
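As background, plain back-translation through a pivot language is itself a simple paraphrase generator, and running it for multiple rounds tends to push the output further from the source, which is the intuition BTmPG builds on. The pivot language and MarianMT checkpoints below are illustrative choices, not BTmPG's actual setup (the paper trains a dedicated multi-round paraphrase model and uses back-translation as guidance).

```python
# A minimal sketch of multi-round back-translation as a paraphrase
# generator. Pivot language and checkpoints are illustrative choices,
# not BTmPG's trained models.
from transformers import MarianMTModel, MarianTokenizer

def build(name):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

en_fr = build("Helsinki-NLP/opus-mt-en-fr")
fr_en = build("Helsinki-NLP/opus-mt-fr-en")

def translate(text, tok_model):
    tok, model = tok_model
    batch = tok([text], return_tensors="pt")
    out = model.generate(**batch, num_beams=4, max_new_tokens=64)
    return tok.decode(out[0], skip_special_tokens=True)

sentence = "The committee approved the proposal after a long debate."
for round_idx in range(2):  # each round tends to drift further from the source
    sentence = translate(translate(sentence, en_fr), fr_en)
    print(f"round {round_idx + 1}: {sentence}")
```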
arXiv Detail & Related papers (2021-09-04T13:12:01Z)
- SGG: Learning to Select, Guide, and Generate for Keyphrase Generation [38.351526320316786]
Keyphrases concisely summarize the high-level topics discussed in a document.
Most existing keyphrase generation approaches synchronously generate present and absent keyphrases.
We propose a Select-Guide-Generate (SGG) approach that deals with present and absent keyphrase generation separately. A minimal sketch of the present/absent distinction follows the citation below.
arXiv Detail & Related papers (2021-05-06T09:43:33Z)
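The present/absent distinction that motivates the separate treatment is easy to state in code: a keyphrase is "present" if it occurs verbatim in the document and "absent" otherwise. The sketch below shows this standard split; the SGG select and generate modules themselves are learned models and are not reproduced here.

```python
# A minimal sketch of the standard present/absent keyphrase split that
# motivates handling the two cases separately. SGG's learned select and
# generate modules are not reproduced here.
def split_keyphrases(document: str, keyphrases: list[str]):
    doc = document.lower()
    present = [k for k in keyphrases if k.lower() in doc]
    absent = [k for k in keyphrases if k.lower() not in doc]
    return present, absent

doc = "We study keyphrase generation for scientific documents."
print(split_keyphrases(doc, ["keyphrase generation", "information extraction"]))
# -> (['keyphrase generation'], ['information extraction'])
```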
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the generated summaries and is not responsible for any consequences of their use.