Factorising Meaning and Form for Intent-Preserving Paraphrasing
- URL: http://arxiv.org/abs/2105.15053v1
- Date: Mon, 31 May 2021 15:37:38 GMT
- Title: Factorising Meaning and Form for Intent-Preserving Paraphrasing
- Authors: Tom Hosking, Mirella Lapata
- Abstract summary: We propose a method for generating paraphrases of English questions that retain the original intent but use a different surface form.
Our model combines a careful choice of training objective with a principled information bottleneck.
We are able to generate paraphrases with a better tradeoff between semantic preservation and syntactic novelty compared to previous methods.
- Score: 59.13322531639124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a method for generating paraphrases of English questions that
retain the original intent but use a different surface form. Our model combines
a careful choice of training objective with a principled information
bottleneck, to induce a latent encoding space that disentangles meaning and
form. We train an encoder-decoder model to reconstruct a question from a
paraphrase with the same meaning and an exemplar with the same surface form,
leading to separated encoding spaces. We use a Vector-Quantized Variational
Autoencoder to represent the surface form as a set of discrete latent
variables, allowing us to use a classifier to select a different surface form
at test time. Crucially, our method does not require access to an external
source of target exemplars. Extensive experiments and a human evaluation show
that we are able to generate paraphrases with a better tradeoff between
semantic preservation and syntactic novelty compared to previous methods.
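As a rough illustration of the setup described in the abstract (a sketch only, not the authors' released code; the module names, sizes, and the GRU/VQ choices below are assumptions), the following shows separate meaning and form encoders, a vector-quantised form bottleneck, and a decoder conditioned on both:
```python
# Illustrative sketch of meaning/form factorisation with a VQ bottleneck.
# Not the authors' architecture: all sizes and module choices are assumptions.
import torch
import torch.nn as nn

class VQBottleneck(nn.Module):
    """Nearest-neighbour vector quantisation with a straight-through estimator."""
    def __init__(self, num_codes: int = 256, dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z_form: torch.Tensor) -> torch.Tensor:
        # z_form: (batch, dim) continuous encoding of the exemplar's surface form
        dists = torch.cdist(z_form, self.codebook.weight)   # (batch, num_codes)
        codes = dists.argmin(dim=-1)                         # discrete surface-form variables
        z_q = self.codebook(codes)
        # straight-through: gradients reach the form encoder as if z_q were z_form
        return z_form + (z_q - z_form).detach()

class MeaningFormParaphraser(nn.Module):
    """Illustrative encoder-decoder with separated meaning and form encodings."""
    def __init__(self, vocab: int = 10000, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.meaning_enc = nn.GRU(dim, dim, batch_first=True)  # reads the same-meaning paraphrase
        self.form_enc = nn.GRU(dim, dim, batch_first=True)     # reads the same-form exemplar
        self.vq = VQBottleneck(dim=dim)
        self.decoder = nn.GRU(2 * dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, paraphrase_ids, exemplar_ids, target_len: int):
        _, z_meaning = self.meaning_enc(self.embed(paraphrase_ids))
        _, z_form = self.form_enc(self.embed(exemplar_ids))
        z_form = self.vq(z_form.squeeze(0))                    # discrete surface-form code
        cond = torch.cat([z_meaning.squeeze(0), z_form], dim=-1)
        dec_in = cond.unsqueeze(1).repeat(1, target_len, 1)    # condition every decoding step
        hidden, _ = self.decoder(dec_in)
        return self.out(hidden)                                # logits for reconstructing the question

model = MeaningFormParaphraser()
logits = model(torch.randint(0, 10000, (2, 12)), torch.randint(0, 10000, (2, 12)), target_len=12)
```
At test time, the discrete form codes could be swapped for codes chosen by a classifier, so that the decoder keeps the meaning encoding but realises it in a different surface form, as the abstract describes.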
Related papers
- Investigating semantic subspaces of Transformer sentence embeddings
through linear structural probing [2.5002227227256864]
We present experiments with semantic structural probing, a method for studying sentence-level representations.
We apply our method to language models from different families (encoder-only, decoder-only, encoder-decoder) and of different sizes in the context of two tasks.
We find that model families differ substantially in their performance and layer dynamics, but that the results are largely model-size invariant.
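A generic linear-probe sketch of this kind of experiment (an assumption about the setup, not the paper's exact semantic structural probe; the embeddings and labels below are placeholder noise): freeze sentence embeddings from one model layer and fit a linear classifier to see how much task-relevant structure that layer exposes.
```python
# Generic linear probe over frozen layer-k sentence embeddings (illustrative data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
layer_embeddings = rng.normal(size=(1000, 768))   # placeholder for frozen layer-k sentence vectors
labels = rng.integers(0, 2, size=1000)            # placeholder task labels

probe = LogisticRegression(max_iter=1000)
probe.fit(layer_embeddings[:800], labels[:800])
print("layer-k probe accuracy:", probe.score(layer_embeddings[800:], labels[800:]))
```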
arXiv Detail & Related papers (2023-10-18T12:32:07Z) - A Sparsity-promoting Dictionary Model for Variational Autoencoders [16.61511959679188]
Structuring the latent space in deep generative models is important to yield more expressive models and interpretable representations.
We propose a simple yet effective methodology to structure the latent space via a sparsity-promoting dictionary model.
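A hedged sketch of the general idea, with illustrative sizes and weights that are not taken from the paper: latents are approximated as sparse combinations of learned dictionary atoms, with an L1 penalty encouraging few active atoms.
```python
# Sparsity-promoting dictionary over a VAE latent space (illustrative only).
import torch
import torch.nn as nn

latent_dim, num_atoms = 32, 64
dictionary = nn.Parameter(torch.randn(num_atoms, latent_dim))   # learned dictionary atoms
to_codes = nn.Linear(latent_dim, num_atoms)                      # amortised sparse-code inference

z = torch.randn(8, latent_dim)                                   # latents from a VAE encoder
alpha = to_codes(z)                                              # per-sample atom coefficients
z_hat = alpha @ dictionary                                       # latent rebuilt from atoms

recon = ((z - z_hat) ** 2).mean()
sparsity = alpha.abs().mean()                                    # L1 term promotes sparse codes
loss = recon + 0.1 * sparsity
```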
arXiv Detail & Related papers (2022-03-29T17:13:11Z) - Hierarchical Sketch Induction for Paraphrase Generation [79.87892048285819]
We introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings.
We use HRQ-VAE to encode the syntactic form of an input sentence as a path through the hierarchy, allowing us to more easily predict syntactic sketches at test time.
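A minimal residual-quantisation sketch of the hierarchical idea (illustrative only; the sizes, training details, and exact quantisation scheme of HRQ-VAE are not reproduced): each level quantises what earlier levels left unexplained, so a sentence's syntax is summarised by the path of chosen codes from coarse to fine.
```python
# Hierarchical (residual) quantisation sketch: the per-level code indices form a path.
import torch
import torch.nn as nn

class HierarchicalQuantiser(nn.Module):
    def __init__(self, levels: int = 3, codes_per_level: int = 16, dim: int = 64):
        super().__init__()
        self.codebooks = nn.ModuleList(
            nn.Embedding(codes_per_level, dim) for _ in range(levels)
        )

    def forward(self, z_syntax: torch.Tensor):
        residual, path, quantised = z_syntax, [], 0
        for codebook in self.codebooks:
            dists = torch.cdist(residual, codebook.weight)
            idx = dists.argmin(dim=-1)          # code chosen at this level of the hierarchy
            q = codebook(idx)
            quantised = quantised + q
            residual = residual - q             # the next level refines the remainder
            path.append(idx)
        return quantised, torch.stack(path, dim=-1)   # path = discrete syntactic sketch

quantised, path = HierarchicalQuantiser()(torch.randn(4, 64))
print(path.shape)   # (4, 3): one code index per level for each sentence
```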
arXiv Detail & Related papers (2022-03-07T15:28:36Z) - Sentence Bottleneck Autoencoders from Transformer Language Models [53.350633961266375]
We build a sentence-level autoencoder from a pretrained, frozen transformer language model.
We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder.
We demonstrate that the sentence representations discovered by our model achieve better quality than previous methods that extract representations from pretrained transformers, on text similarity, style transfer, and single-sentence classification tasks in the GLUE benchmark, while using fewer parameters than large pretrained models.
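A hedged sketch of the bottleneck setup, with torch's generic TransformerEncoder standing in for the pretrained, frozen language model (the paper's actual model, objective, and pooling choices are not reproduced here): token states are pooled into a single sentence vector, and reconstruction is forced to run through that vector alone.
```python
# Sentence-bottleneck denoising autoencoder sketch with a frozen stand-in encoder.
import torch
import torch.nn as nn

dim, vocab = 256, 10000
frozen_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True), num_layers=2
)
for p in frozen_encoder.parameters():
    p.requires_grad = False                        # the "pretrained" encoder stays fixed

embed = nn.Embedding(vocab, dim)                   # placeholder embeddings for this sketch
bottleneck = nn.Linear(dim, dim)                   # trainable sentence bottleneck
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=dim, nhead=4, batch_first=True), num_layers=1
)                                                  # single trainable decoder layer
to_vocab = nn.Linear(dim, vocab)

clean = torch.randint(1, vocab, (8, 20))           # target sentence
noised = clean.masked_fill(torch.rand(8, 20) < 0.15, 0)   # crude corruption (id 0 as a mask token)
hidden = frozen_encoder(embed(noised))
sentence_vec = bottleneck(hidden.mean(dim=1))      # pool all token states into one vector
memory = sentence_vec.unsqueeze(1)                 # the decoder can attend only to the bottleneck
logits = to_vocab(decoder(embed(clean), memory))   # teacher-forced denoising reconstruction
```
In this sketch only the bottleneck, the single decoder layer, the output head, and the placeholder embeddings carry gradients; the encoder weights stay frozen.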
arXiv Detail & Related papers (2021-08-31T19:39:55Z) - Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour, namely that a nominally trained VAE does not necessarily encode typical samples generated by its own decoder consistently, on the learned representations, and also the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
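One way to phrase such a self-consistency term, as a sketch (an assumption about the general idea, not the paper's objective): re-encoding a reconstruction should land near the code of the original input.
```python
# Self-consistency sketch with placeholder linear encoder/decoder (illustrative only).
import torch
import torch.nn as nn

encoder = nn.Linear(784, 32)        # placeholder encoder
decoder = nn.Linear(32, 784)        # placeholder decoder

x = torch.rand(16, 784)
z = encoder(x)
x_rec = decoder(z)
z_rec = encoder(x_rec)              # encode the model's own output

recon_loss = ((x - x_rec) ** 2).mean()
self_consistency = ((z - z_rec) ** 2).mean()   # keep the two encodings close
loss = recon_loss + 0.5 * self_consistency
```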
arXiv Detail & Related papers (2020-12-07T14:16:14Z) - Neural Syntactic Preordering for Controlled Paraphrase Generation [57.5316011554622]
Our work uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model.
First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model.
Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order.
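A hedged sketch of the guidance mechanism (the rearrangement here is hard-coded for illustration; in the paper it comes from a separate model): a proposed reordering of the source words is turned into position embeddings that are added to the source representations, nudging the paraphrase model to attend to the words in that order.
```python
# Position-embedding guidance sketch for a proposed source reordering (illustrative).
import torch
import torch.nn as nn

vocab, dim, max_len = 10000, 64, 32
tok_embed = nn.Embedding(vocab, dim)
pos_embed = nn.Embedding(max_len, dim)

source_ids = torch.randint(0, vocab, (1, 6))
reordering = torch.tensor([[3, 4, 5, 0, 1, 2]])   # e.g. move the last clause to the front

# tag each source token with the position it should occupy after reordering
guided_src = tok_embed(source_ids) + pos_embed(reordering)
# guided_src would then be fed to the encoder of the paraphrase model
print(guided_src.shape)   # (1, 6, 64)
```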
arXiv Detail & Related papers (2020-05-05T09:02:25Z) - Improve Variational Autoencoder for Text Generationwith Discrete Latent
Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore their latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
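A minimal sketch of the failure mode named here, using a standard Gaussian-VAE objective (the paper's remedy of matching latent features in a more compact, discrete space is not reproduced): when the decoder is powerful enough to model the data on its own, training can drive the KL term toward zero so that z carries no information.
```python
# Standard VAE objective: the setting in which posterior collapse can occur.
# The linear decoder below is only a stand-in; in practice the problem arises
# with strong auto-regressive decoders.
import torch
import torch.nn as nn

enc = nn.Linear(784, 2 * 16)                 # outputs mean and log-variance
dec = nn.Linear(16, 784)                     # stand-in decoder

x = torch.rand(8, 784)
mu, logvar = enc(x).chunk(2, dim=-1)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterisation trick

recon = ((dec(z) - x) ** 2).mean()
kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(dim=-1).mean()
elbo_loss = recon + kl   # collapse: q(z|x) matches the prior and z is ignored
```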
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences arising from its use.