Variational Autoencoder with Disentanglement Priors for Low-Resource
Task-Specific Natural Language Generation
- URL: http://arxiv.org/abs/2202.13363v1
- Date: Sun, 27 Feb 2022 13:34:24 GMT
- Title: Variational Autoencoder with Disentanglement Priors for Low-Resource
Task-Specific Natural Language Generation
- Authors: Zhuang Li, Lizhen Qu, Qiongkai Xu, Tongtong Wu, Tianyang Zhan,
Gholamreza Haffari
- Abstract summary: We propose a variational autoencoder with disentanglement priors, VAE-DPRIOR, for conditional natural language generation.
Our model performs disentangled representation learning by introducing a prior for the latent content space and another prior for the latent label space.
- Score: 48.09206838892326
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a variational autoencoder with disentanglement
priors, VAE-DPRIOR, for conditional natural language generation with none or a
handful of task-specific labeled examples. In order to improve compositional
generalization, our model performs disentangled representation learning by
introducing a prior for the latent content space and another prior for the
latent label space. We show both empirically and theoretically that the
conditional priors can already disentangle representations even without
specific regularizations as in the prior work. We can also sample diverse
content representations from the content space without accessing data of the
seen tasks, and fuse them with the representations of novel tasks for
generating diverse texts in the low-resource settings. Our extensive
experiments demonstrate the superior performance of our model over competitive
baselines in terms of i) data augmentation in continuous zero/few-shot
learning, and ii) text style transfer in both zero/few-shot settings.
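To make the architecture described in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of a conditional VAE with two latent spaces, a content latent and a label latent, each governed by its own conditional prior network, with the two samples fused for decoding. All module names, dimensions, and the choice of diagonal Gaussian posterior/prior heads are illustrative assumptions, not the authors' released implementation; in particular, the real VAE-DPRIOR decoder is a text generator, not a single linear layer.

```python
# Hypothetical sketch of a conditional VAE with two latent spaces:
# a content latent z_c and a label latent z_l, each with its own
# (conditional) prior network. Names and dimensions are illustrative only.
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Maps an input vector to the mean and log-variance of a diagonal Gaussian."""
    def __init__(self, d_in, d_z):
        super().__init__()
        self.mu = nn.Linear(d_in, d_z)
        self.logvar = nn.Linear(d_in, d_z)

    def forward(self, h):
        return self.mu(h), self.logvar(h)

def sample(mu, logvar):
    # Reparameterisation trick: z = mu + sigma * eps
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

def kl(mu_q, logvar_q, mu_p, logvar_p):
    # KL(q || p) between two diagonal Gaussians, summed over latent dimensions
    return 0.5 * torch.sum(
        logvar_p - logvar_q
        + (torch.exp(logvar_q) + (mu_q - mu_p) ** 2) / torch.exp(logvar_p)
        - 1.0,
        dim=-1,
    )

class DisentangledPriorVAE(nn.Module):
    def __init__(self, d_text=512, d_zc=64, d_zl=64, n_labels=10):
        super().__init__()
        # Posterior (encoder) heads over a pre-computed sentence encoding
        self.q_content = GaussianHead(d_text, d_zc)
        self.q_label = GaussianHead(d_text, d_zl)
        # Separate conditional priors: one for the content space, one for the label space
        self.p_content = GaussianHead(d_text, d_zc)   # assumption: conditioned on the input encoding
        self.p_label = GaussianHead(n_labels, d_zl)   # assumption: conditioned on a one-hot task label
        # Stand-in decoder: consumes the fused [z_c ; z_l] representation
        self.decoder = nn.Linear(d_zc + d_zl, d_text)

    def forward(self, text_enc, label_onehot):
        mu_qc, lv_qc = self.q_content(text_enc)
        mu_ql, lv_ql = self.q_label(text_enc)
        mu_pc, lv_pc = self.p_content(text_enc)
        mu_pl, lv_pl = self.p_label(label_onehot)
        z_c, z_l = sample(mu_qc, lv_qc), sample(mu_ql, lv_ql)
        recon = self.decoder(torch.cat([z_c, z_l], dim=-1))
        kl_total = kl(mu_qc, lv_qc, mu_pc, lv_pc) + kl(mu_ql, lv_ql, mu_pl, lv_pl)
        return recon, kl_total
```

Under this sketch, generation for a novel task would draw z_c from the content prior and pair it with the label-space representation of the new task before decoding, mirroring the data-augmentation use described in the abstract.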
Related papers
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder
for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z)
- Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions to after the input sentences (see the prompt-template sketch after this list).
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
- Leveraging Natural Supervision for Language Representation Learning and Generation [8.083109555490475]
We describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
We first investigate self-supervised training losses to help enhance the performance of pretrained language models for various NLP tasks.
We propose a framework that uses paraphrase pairs to disentangle semantics and syntax in sentence representations.
arXiv Detail & Related papers (2022-07-21T17:26:03Z)
- Multimodal Knowledge Alignment with Reinforcement Learning [103.68816413817372]
ESPER extends language-only zero-shot models to unseen multimodal tasks, like image and audio captioning.
Our key novelty is to use reinforcement learning to align multimodal inputs to language model generations without direct supervision.
Experiments demonstrate that ESPER outperforms baselines and prior work on a variety of zero-shot tasks.
arXiv Detail & Related papers (2022-05-25T10:12:17Z)
- Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation [24.488427641442694]
We propose a novel conditional neural process-based approach for few-shot text classification.
Our key idea is to represent each task using gradient information from a base model.
Our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta learning approaches.
arXiv Detail & Related papers (2022-01-27T15:29:30Z)
- DSGPT: Domain-Specific Generative Pre-Training of Transformers for Text Generation in E-commerce Title and Review Summarization [14.414693156937782]
We propose a novel domain-specific generative pre-training (DS-GPT) method for text generation.
We apply it to the product title and review summarization problems on E-commerce mobile display.
arXiv Detail & Related papers (2021-12-15T19:02:49Z)
- Continual Learning for Text Classification with Information Disentanglement Based Regularization [18.258948837964724]
We propose an information disentanglement based regularization method for continual learning on text classification.
Experiments conducted on large-scale benchmarks demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2021-04-12T14:17:43Z)
- KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation [100.79870384880333]
We propose a knowledge-grounded pre-training (KGPT) to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness.
Under the zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG while all other baselines fail.
arXiv Detail & Related papers (2020-10-05T19:59:05Z)
- Pre-training via Paraphrasing [96.79972492585112]
We introduce MARGE, a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual paraphrasing objective.
We show it is possible to jointly learn to do retrieval and reconstruction, given only a random initialization.
For example, with no additional task-specific training we achieve BLEU scores of up to 35.8 for document translation.
arXiv Detail & Related papers (2020-06-26T14:43:43Z)
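As referenced in the "Instruction Position Matters" entry above, the post-instruction idea can be pictured with plain prompt templates. The strings below are hypothetical illustrations, not the paper's actual prompts; they simply contrast placing the task instruction before versus after the input sentence.

```python
# Hypothetical prompt templates contrasting instruction placement.
# Wording is illustrative only; it is not taken from the paper.
source = "Der schnelle braune Fuchs springt über den faulen Hund."
instruction = "Translate the German sentence into English."

# Conventional layout: instruction first, then the input sentence.
pre_instruction_prompt = f"{instruction}\n{source}\nTranslation:"

# Post-instruction layout: input sentence first, instruction after it,
# so the instruction sits immediately before where decoding starts.
post_instruction_prompt = f"{source}\n{instruction}\nTranslation:"

print(pre_instruction_prompt)
print("---")
print(post_instruction_prompt)
```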