Language as a Latent Sequence: deep latent variable models for
semi-supervised paraphrase generation
- URL: http://arxiv.org/abs/2301.02275v2
- Date: Fri, 8 Sep 2023 21:03:14 GMT
- Title: Language as a Latent Sequence: deep latent variable models for
semi-supervised paraphrase generation
- Authors: Jialin Yu, Alexandra I. Cristea, Anoushka Harit, Zhongtian Sun,
Olanrewaju Tahir Aduragba, Lei Shi, Noura Al Moubayed
- Abstract summary: We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text.
To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model.
Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data.
- Score: 47.33223015862104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores deep latent variable models for semi-supervised
paraphrase generation, where the missing target pair for unlabelled data is
modelled as a latent paraphrase sequence. We present a novel unsupervised model
named variational sequence auto-encoding reconstruction (VSAR), which performs
latent sequence inference given an observed text. To leverage information from
text pairs, we additionally introduce a novel supervised model we call dual
directional learning (DDL), which is designed to integrate with our proposed
VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct
semi-supervised learning. Still, the combined model suffers from a cold-start
problem. To further combat this issue, we propose an improved weight
initialisation solution, leading to a novel two-stage training scheme we call
knowledge-reinforced-learning (KRL). Our empirical evaluations suggest that the
combined model yields competitive performance against the state-of-the-art
supervised baselines on complete data. Furthermore, in scenarios where only a
fraction of the labelled pairs are available, our combined model consistently
outperforms the strong supervised model baseline (DDL) by a significant margin
(p < .05; Wilcoxon test). Our code is publicly available at
"https://github.com/jialin-yu/latent-sequence-paraphrase".
Related papers
- Deep Companion Learning: Enhancing Generalization Through Historical Consistency [35.5237083057451]
We propose a novel training method for Deep Neural Networks (DNNs) that enhances generalization by penalizing inconsistent model predictions.
We train a deep-companion model (DCM) by using previous versions of the model to provide forecasts on new inputs.
This companion model deciphers a meaningful latent semantic structure within the data, thereby providing targeted supervision.
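A minimal sketch of the consistency idea summarised above, under the assumption that the companion is simply a frozen earlier snapshot of the model (the paper's actual DCM training is more involved):

    import torch
    import torch.nn.functional as F

    def consistency_penalty(model, companion, inputs):
        # `companion` is assumed to be a frozen earlier snapshot of `model`,
        # e.g. a deep copy saved at the end of a previous epoch.
        with torch.no_grad():
            target = F.softmax(companion(inputs), dim=-1)
        log_probs = F.log_softmax(model(inputs), dim=-1)
        # Penalise disagreement with the companion's predicted distribution.
        return F.kl_div(log_probs, target, reduction="batchmean")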
arXiv Detail & Related papers (2024-07-26T15:31:13Z)
- RDR: the Recap, Deliberate, and Respond Method for Enhanced Language Understanding [6.738409533239947]
The Recap, Deliberate, and Respond (RDR) paradigm incorporates three distinct objectives within the neural network pipeline.
By cascading these three models, we mitigate the potential for gaming the benchmark and establish a robust method for capturing the underlying semantic patterns.
Our results demonstrate improved performance compared to competitive baselines, with an enhancement of up to 2% on standard metrics.
arXiv Detail & Related papers (2023-12-15T16:41:48Z)
- SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking [60.109453252858806]
A maximum-likelihood (MLE) objective does not match a downstream use-case of autoregressively generating high-quality sequences.
We formulate sequence generation as an imitation learning (IL) problem.
This allows us to minimize a variety of divergences between the distribution of sequences generated by an autoregressive model and sequences from a dataset.
Our resulting method, SequenceMatch, can be implemented without adversarial training or architectural changes.
arXiv Detail & Related papers (2023-06-08T17:59:58Z)
- Unsupervised Syntactically Controlled Paraphrase Generation with Abstract Meaning Representations [59.10748929158525]
Abstract Meaning Representations (AMRs) can greatly improve the performance of unsupervised syntactically controlled paraphrase generation.
Our proposed model, AMR-enhanced Paraphrase Generator (AMRPG), encodes the AMR graph and the constituency parse of the input sentence into two disentangled semantic and syntactic embeddings.
Experiments show that AMRPG generates more accurate syntactically controlled paraphrases, both quantitatively and qualitatively, compared to the existing unsupervised approaches.
arXiv Detail & Related papers (2022-11-02T04:58:38Z)
- Assemble Foundation Models for Automatic Code Summarization [9.53949558569201]
We propose a flexible and robust approach for automatic code summarization based on neural networks.
We assemble available foundation models, such as CodeBERT and GPT-2, into a single model named AdaMo.
We introduce two adaptive schemes from the perspective of knowledge transfer, namely continuous pretraining and intermediate finetuning.
arXiv Detail & Related papers (2022-01-13T21:38:33Z)
- A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings [50.524054820564395]
We propose a new unsupervised model for mapping a variable-duration speech segment to a fixed-dimensional representation.
The resulting acoustic word embeddings can form the basis of search, discovery, and indexing systems for low- and zero-resource languages.
arXiv Detail & Related papers (2020-12-03T19:24:42Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation [37.054709598792165]
The model is a convolutional neural network that operates directly on the raw waveform.
It is optimized to identify spectral changes in the signal using the Noise-Contrastive Estimation principle.
At test time, a peak detection algorithm is applied over the model outputs to produce the final boundaries.
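As a generic illustration of that test-time step (the threshold and the local-maximum rule here are assumptions, not the paper's exact procedure), boundaries can be read off as peaks in the per-frame scores:

    import numpy as np

    def pick_boundaries(scores, threshold=0.5):
        # `scores` is a 1-D array of per-frame change scores from the model;
        # a frame is a boundary if it is a local maximum above the threshold.
        scores = np.asarray(scores)
        is_peak = (scores[1:-1] > scores[:-2]) & (scores[1:-1] > scores[2:])
        peaks = np.where(is_peak & (scores[1:-1] > threshold))[0] + 1
        return peaks  # frame indices of predicted phoneme boundaries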
arXiv Detail & Related papers (2020-07-27T12:10:21Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs with a strong auto-regressive decoder tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.