A Probabilistic Formulation of Unsupervised Text Style Transfer
- URL: http://arxiv.org/abs/2002.03912v3
- Date: Wed, 29 Apr 2020 23:26:16 GMT
- Title: A Probabilistic Formulation of Unsupervised Text Style Transfer
- Authors: Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick
- Abstract summary: We present a deep generative model for unsupervised text style transfer that unifies previously proposed non-generative techniques.
By hypothesizing a parallel latent sequence that generates each observed sequence, our model learns to transform sequences from one domain to another in a completely unsupervised fashion.
- Score: 128.80213211598752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a deep generative model for unsupervised text style transfer that
unifies previously proposed non-generative techniques. Our probabilistic
approach models non-parallel data from two domains as a partially observed
parallel corpus. By hypothesizing a parallel latent sequence that generates
each observed sequence, our model learns to transform sequences from one domain
to another in a completely unsupervised fashion. In contrast with traditional
generative sequence models (e.g. the HMM), our model makes few assumptions
about the data it generates: it uses a recurrent language model as a prior and
an encoder-decoder as a transduction distribution. While computation of
marginal data likelihood is intractable in this model class, we show that
amortized variational inference admits a practical surrogate. Further, by
drawing connections between our variational objective and other recent
unsupervised style transfer and machine translation techniques, we show how our
probabilistic view can unify some known non-generative objectives such as
backtranslation and adversarial loss. Finally, we demonstrate the effectiveness
of our method on a wide range of unsupervised style transfer tasks, including
sentiment transfer, formality transfer, word decipherment, author imitation,
and related language translation. Across all style transfer tasks, our approach
yields substantial gains over state-of-the-art non-generative baselines,
including the state-of-the-art unsupervised machine translation techniques that
our approach generalizes. Further, we conduct experiments on a standard
unsupervised machine translation task and find that our unified approach
matches the current state-of-the-art.
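To make the formulation concrete, below is a minimal PyTorch sketch of one direction of the training signal the abstract describes: a recurrent language model as the prior over latent sequences and an encoder-decoder as the transduction distribution. This is a sketch under loud assumptions, not the authors' released code: module names and sizes are illustrative, greedy decoding stands in for sampling from the amortized posterior q(y|x) (the paper uses proper variational gradient estimators), and the prior is trained jointly here although the paper pretrains and freezes it.

```python
# Hedged sketch of the variational surrogate: reconstruct observed x from a
# hypothesized parallel latent y, scored under an LM prior. Illustrative only.
import torch
import torch.nn as nn

BOS, V, H = 1, 1000, 128  # toy vocabulary and hidden sizes

class LMPrior(nn.Module):
    """Recurrent language model p(y): the prior over latent sequences."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(V, H)
        self.rnn = nn.GRU(H, H, batch_first=True)
        self.out = nn.Linear(H, V)

    def log_prob(self, y):
        h, _ = self.rnn(self.emb(y[:, :-1]))            # predict y[t] from y[<t]
        logp = self.out(h).log_softmax(-1)
        return logp.gather(-1, y[:, 1:, None]).squeeze(-1).sum(-1)

class Seq2Seq(nn.Module):
    """Encoder-decoder transduction distribution p(x | y)."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(V, H)
        self.enc = nn.GRU(H, H, batch_first=True)
        self.dec = nn.GRU(H, H, batch_first=True)
        self.out = nn.Linear(H, V)

    def log_prob(self, x, y):
        _, h = self.enc(self.emb(y))                    # encode the latent y
        d, _ = self.dec(self.emb(x[:, :-1]), h)         # teacher-forced decode of x
        logp = self.out(d).log_softmax(-1)
        return logp.gather(-1, x[:, 1:, None]).squeeze(-1).sum(-1)

    @torch.no_grad()
    def greedy(self, x, steps=20):
        """Greedy point estimate of the latent, backtranslation-style."""
        _, h = self.enc(self.emb(x))
        ys = [torch.full((x.size(0), 1), BOS, dtype=torch.long)]
        for _ in range(steps):
            d, h = self.dec(self.emb(ys[-1]), h)
            ys.append(self.out(d).argmax(-1))
        return torch.cat(ys, 1)

# One direction of the objective: observe x in domain 1, hypothesize a latent
# parallel y in domain 2, then score reconstruction plus the prior on y. (The
# paper shares the inference network q(y|x) with the reverse transducer; the
# full loss adds the symmetric direction.)
prior_y, p_x_given_y, q_y_given_x = LMPrior(), Seq2Seq(), Seq2Seq()
x = torch.randint(2, V, (4, 12))                        # toy batch of domain-1 ids
y_hat = q_y_given_x.greedy(x)                           # inferred latent sequence
loss = -(p_x_given_y.log_prob(x, y_hat) + prior_y.log_prob(y_hat)).mean()
loss.backward()                                         # y_hat is detached, as in backtranslation
```

Dropping the prior term recovers a plain backtranslation objective; the prior's score on the hypothesized latent sequence is exactly what the probabilistic view adds on top of it.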
Related papers
- Unsupervised Representation Learning from Sparse Transformation Analysis [79.94858534887801]
We propose to learn representations from sequence data by factorizing the transformations of the latent variables into sparse components.
Input data are first encoded as distributions of latent activations and subsequently transformed using a probability flow model.
arXiv Detail & Related papers (2024-10-07T23:53:25Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, using digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- Learning Semantic Textual Similarity via Topic-informed Discrete Latent Variables [17.57873577962635]
We develop a topic-informed discrete latent variable model for semantic textual similarity.
Our model learns a shared latent space for sentence-pair representation via vector quantization.
We show that our model is able to surpass several strong neural baselines in semantic textual similarity tasks.
arXiv Detail & Related papers (2022-11-07T15:09:58Z)
- Principled Paraphrase Generation with Parallel Corpora [52.78059089341062]
We formalize the implicit similarity function induced by round-trip Machine Translation.
We show that it is susceptible to non-paraphrase pairs sharing a single ambiguous translation.
We design an alternative similarity metric that mitigates this issue.
arXiv Detail & Related papers (2022-05-24T17:22:42Z)
- Non-Parallel Text Style Transfer with Self-Parallel Supervision [19.441780035577352]
We propose LaMer, a novel text style transfer framework based on large-scale language models.
LaMer first mines the roughly parallel expressions in the non-parallel datasets with scene graphs, and then employs MLE training, followed by imitation learning refinement, to leverage the intrinsic parallelism within the data.
On two benchmark tasks (sentiment & formality transfer) and a newly proposed challenging task (political stance transfer), our model achieves qualitative advances in transfer accuracy, content preservation, and fluency.
arXiv Detail & Related papers (2022-04-18T01:38:35Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking (a hedged sketch of the blocking idea appears after this list).
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization [52.867459839641526]
Formality style transfer is the task of converting informal sentences to grammatically-correct formal sentences.
We propose a semi-supervised formality style transfer model that utilizes a language model-based discriminator to maximize the likelihood of the output sentence being formal.
Experiments showed that our model outperformed previous state-of-the-art baselines significantly in terms of both automated metrics and human judgement.
arXiv Detail & Related papers (2020-10-10T21:05:56Z)
- Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation [37.054709598792165]
The model is a convolutional neural network that operates directly on the raw waveform.
It is optimized to identify spectral changes in the signal using the Noise-Contrastive Estimation principle.
At test time, a peak detection algorithm is applied over the model outputs to produce the final boundaries (a minimal sketch appears after this list).
arXiv Detail & Related papers (2020-07-27T12:10:21Z)
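The phoneme segmentation entry directly above ends with a peak-detection step over per-frame model outputs. Below is a minimal sketch of that step, assuming the scores array is a toy stand-in for the trained CNN's per-frame dissimilarity scores and using a simple local-maximum rule rather than the paper's exact procedure:

```python
# Hedged sketch of boundary extraction via peak picking; the scores below are
# a toy stand-in for the model's per-frame outputs, not real data.
import numpy as np

def boundaries_from_scores(scores, threshold=0.5):
    """Return frame indices that are local maxima above `threshold`."""
    peaks = []
    for t in range(1, len(scores) - 1):
        if scores[t] > threshold and scores[t] >= scores[t - 1] and scores[t] > scores[t + 1]:
            peaks.append(t)
    return peaks

scores = np.array([0.1, 0.2, 0.9, 0.3, 0.1, 0.7, 0.8, 0.2])  # toy frame scores
print(boundaries_from_scores(scores))  # -> [2, 6]
```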
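The Unsupervised Paraphrasing entry above names Dynamic Blocking without describing it. As a hedged reading of that algorithm: when the decoder has just emitted a token that occurs in the source, the token immediately following it in the source is blocked at the next step, so the model cannot copy source bigrams verbatim. The sketch below is a simplified, deterministic variant (the paper samples blocking decisions per generation), and the function name and toy ids are illustrative:

```python
# Simplified Dynamic Blocking: mask out every source successor of the token
# that was just emitted, preventing verbatim bigram copies from the source.
import math

def dynamic_blocking(source_ids, prev_token, logits):
    """Set blocked vocabulary entries of `logits` to -inf."""
    for i, tok in enumerate(source_ids[:-1]):
        if tok == prev_token:                      # emitted token appears in source
            logits[source_ids[i + 1]] = -math.inf  # block its source successor
    return logits

# Toy usage: source "7 8 9"; having just emitted 8, token 9 is blocked.
logits = [0.0] * 12
print(dynamic_blocking([7, 8, 9], 8, logits)[9])  # -> -inf
```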
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.