Text Generation with Deep Variational GAN
- URL: http://arxiv.org/abs/2104.13488v1
- Date: Tue, 27 Apr 2021 21:42:13 GMT
- Title: Text Generation with Deep Variational GAN
- Authors: Mahmoud Hossam, Trung Le, Michael Papasimeon, Viet Huynh, Dinh Phung
- Abstract summary: We propose a GAN-based generic framework that addresses the problem of mode collapse in a principled way.
We show that our model can generate realistic text with high diversity.
- Score: 16.3190206770276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating realistic sequences is a central task in many machine learning
applications. There has been considerable recent progress on building deep
generative models for sequence generation tasks. However, mode collapse
remains a major issue for current models. In this paper we propose a
generic GAN-based framework that addresses mode collapse in a principled
way. We change the standard GAN objective to maximize a variational lower
bound of the log-likelihood while minimizing the Jensen-Shannon divergence
between the data and model distributions. We evaluate our model on text
generation and show that it can generate realistic text with high
diversity.
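The abstract does not spell the objective out; a plausible rendering, under the usual VAE-plus-GAN reading and with all notation (q_phi, p_theta, p(z), lambda, D) assumed rather than taken from the paper, is:

```latex
% Hedged reconstruction, not the paper's stated formula: maximize the ELBO
% on the log-likelihood while minimizing the JS divergence, the latter
% expressed through the optimal GAN discriminator D.
\max_{\theta,\phi}\;
\underbrace{\mathbb{E}_{q_\phi(z\mid x)}\!\left[\log p_\theta(x\mid z)\right]
 - \mathrm{KL}\!\left(q_\phi(z\mid x)\,\|\,p(z)\right)}_{\text{variational lower bound}}
\;-\;\lambda\,\mathrm{JS}\!\left(p_{\mathrm{data}}\,\|\,p_\theta\right),
\quad\text{with}\quad
\mathrm{JS}\!\left(p_{\mathrm{data}}\,\|\,p_\theta\right)
 = \tfrac{1}{2}\Big(\max_{D}\,\mathbb{E}_{p_{\mathrm{data}}}[\log D(x)]
 + \mathbb{E}_{p_\theta}[\log(1-D(x))]\Big) + \log 2 .
```

The second identity is the standard result that the optimal-discriminator GAN value equals 2 JS(p_data || p_theta) - log 4, which is what lets a discriminator estimate the JS term.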
Related papers
- Generative Multi-modal Models are Good Class-Incremental Learners [51.5648732517187]
We propose a novel generative multi-modal model (GMM) framework for class-incremental learning.
Our approach directly generates labels for images using an adapted generative model.
Under the few-shot CIL setting, our method improves accuracy by at least 14% over all current state-of-the-art methods, with significantly less forgetting.
arXiv Detail & Related papers (2024-03-27T09:21:07Z)
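A toy illustration of the label-generation idea in the GMM entry above; the caption generator and string-similarity matcher below are hypothetical stand-ins, not the paper's components:

```python
# Toy sketch (assumed, not the paper's code): predict a class by generating a
# free-form text label for an image and matching it against class names, so
# no classifier head has to grow as new classes arrive.
from difflib import SequenceMatcher

def generate_label(image) -> str:
    # Stand-in for the adapted generative multi-modal model.
    return "a photo of a tabby cat"

def classify(image, class_names: list[str]) -> str:
    caption = generate_label(image)
    # Pick the class name most similar to the generated text label.
    return max(class_names,
               key=lambda n: SequenceMatcher(None, caption, n).ratio())

print(classify(None, ["cat", "dog", "airplane"]))  # -> "cat"
```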
- PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to produce fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z)
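A deliberately small sketch of the two-stage recipe in the PLANNER entry above; every module, shape, and the denoising update rule here are assumptions for illustration, not the paper's architecture:

```python
# Stage 1 refines a paragraph-level latent "plan" from noise; stage 2 decodes
# tokens autoregressively conditioned on that plan.
import torch
import torch.nn as nn

latent_dim, vocab_size = 64, 1000
denoiser = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                         nn.Linear(128, latent_dim))
decoder = nn.GRU(input_size=latent_dim, hidden_size=128, batch_first=True)
head = nn.Linear(128, vocab_size)

# Stage 1: iteratively refine the latent plan, starting from Gaussian noise.
z = torch.randn(1, latent_dim)
for _ in range(10):
    z = z + 0.1 * (denoiser(z) - z)      # crude denoising update (illustrative)

# Stage 2: decode tokens step by step, conditioned on the plan.
h, tokens = None, []
for _ in range(20):
    out, h = decoder(z.unsqueeze(1), h)  # plan fed as input at every step
    tokens.append(head(out[:, -1]).argmax(-1).item())
print(tokens)
```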
- Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimization method.
We develop practical bounds that make the total variation distance (TVD) applicable to language generation.
We introduce the TaiLr objective, which balances the tradeoff in estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z)
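The entry above names a token-level reweighting; one hedged rendering, which reflects my reading of the paper and should be checked against the original, scales the per-token NLL by a stop-gradient factor that interpolates between TVD-like behavior and plain MLE via gamma:

```python
# Hedged sketch of a TaiLr-style reweighted loss (exact form is an assumption).
import torch
import torch.nn.functional as F

def tailr_loss(logits: torch.Tensor, targets: torch.Tensor,
               gamma: float = 0.1) -> torch.Tensor:
    log_probs = F.log_softmax(logits, dim=-1)
    gold_logp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    p = gold_logp.exp().detach()                 # stop-gradient on the weight
    weight = p / (gamma + (1.0 - gamma) * p)     # downweights low-prob tokens
    return -(weight * gold_logp).mean()

loss = tailr_loss(torch.randn(2, 5, 100), torch.randint(0, 100, (2, 5)))
```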
- Speculative Decoding with Big Little Decoder [108.95187338417541]
Big Little Decoder (BiLD) is a framework that can improve inference efficiency and latency for a wide range of text generation applications.
On an NVIDIA T4 GPU, our framework achieves up to a 2.12x speedup with minimal degradation in generation quality.
Our framework is fully plug-and-play and can be applied without any modifications in the training process or model architecture.
arXiv Detail & Related papers (2023-02-15T18:55:29Z)
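A simplified sketch of the big-little idea in the BiLD entry above; the stand-in models, the single confidence threshold, and the absence of a rollback step are all simplifications of the paper's fallback/rollback policies:

```python
# A cheap model drafts tokens; the expensive model is consulted only when the
# cheap model's confidence drops below a threshold.
import random

def small_model(prefix):   # stand-in: returns (token, confidence)
    return random.randint(0, 99), random.random()

def big_model(prefix):     # stand-in for the expensive model
    return random.randint(0, 99)

def generate(n_tokens=20, fallback=0.5):
    seq = []
    for _ in range(n_tokens):
        tok, conf = small_model(seq)
        if conf < fallback:          # small model is unsure: fall back
            tok = big_model(seq)     # one expensive call instead of many
        seq.append(tok)
    return seq

print(generate())
```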
- GENIE: Large Scale Pre-training for Text Generation with Diffusion Model [86.2022500090247]
GENIE is a sequence-to-sequence text generation model that combines the Transformer architecture with diffusion.
We propose a novel pre-training method named continuous paragraph denoise, based on the characteristics of the diffusion model.
arXiv Detail & Related papers (2022-12-22T13:17:11Z)
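A toy rendering of the "continuous paragraph denoise" objective named above; the stand-in embeddings, the linear noise schedule, and the tiny denoiser are assumptions, not GENIE's actual setup:

```python
# Corrupt a continuous paragraph representation with noise and train a model
# to recover the clean version.
import torch
import torch.nn as nn

dim = 64
denoiser = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, dim))
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

for _ in range(100):
    clean = torch.randn(32, dim)          # stand-in paragraph embeddings
    t = torch.rand(32, 1)                 # random noise level per example
    noisy = (1 - t) * clean + t * torch.randn_like(clean)
    loss = ((denoiser(noisy) - clean) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```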
- DiffusER: Discrete Diffusion via Edit-based Reconstruction [88.62707047517914]
DiffusER is an edit-based generative model for text built on denoising diffusion models.
It can rival autoregressive models on several tasks spanning machine translation, summarization, and style transfer.
It can also perform other varieties of generation that standard autoregressive models are not well-suited for.
arXiv Detail & Related papers (2022-10-30T16:55:23Z)
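A toy illustration of edit-based corruption in the spirit of the DiffusER entry above; the edit set (keep/delete/replace) and the rates are assumptions, and the learned reverse model is only indicated in a comment:

```python
# Forward process: apply random edit operations to a token sequence over a
# few diffusion steps. The generative model would learn to reverse each step.
import random

VOCAB = list(range(100))

def corrupt(tokens, edit_rate=0.3):
    out = []
    for tok in tokens:
        op = random.random()
        if op < edit_rate / 2:
            continue                          # DELETE the token
        elif op < edit_rate:
            out.append(random.choice(VOCAB))  # REPLACE it
        else:
            out.append(tok)                   # KEEP it
    return out

x = list(range(10))
trajectory = [x]
for _ in range(3):                            # a few corruption steps
    trajectory.append(corrupt(trajectory[-1]))
# The reverse (generative) model learns trajectory[t] -> trajectory[t-1].
print(trajectory)
```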
- Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors [21.71928935339393]
We present a novel latent structured variable model for generating high-quality text.
Specifically, we introduce a function to map deterministic encoder hidden states into random context variables.
To address the learning challenge of Gaussian processes, we propose an efficient variational inference approach.
arXiv Detail & Related papers (2022-04-04T04:09:15Z)
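A minimal sketch of the deterministic-to-random mapping described above, via the usual reparameterization trick; the GP prior that correlates context variables across positions is omitted, and all sizes are assumptions:

```python
# Map deterministic encoder hidden states to random context variables through
# predicted mean and variance.
import torch
import torch.nn as nn

hidden, ctx = 128, 32
to_mu, to_logvar = nn.Linear(hidden, ctx), nn.Linear(hidden, ctx)

h = torch.randn(4, 10, hidden)            # deterministic encoder states
mu, logvar = to_mu(h), to_logvar(h)
z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # random context vars
# Different samples of z yield different decodings -> diverse generations.
```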
- Deep Latent-Variable Models for Text Generation [7.119436003155924]
Deep neural network-based end-to-end architectures have been widely adopted.
The end-to-end approach conflates all sub-modules, which were previously designed with complex handcrafted rules, into a holistic encoder-decoder architecture.
This dissertation presents how deep latent-variable models can improve over the standard encoder-decoder model for text generation.
arXiv Detail & Related papers (2022-03-03T23:06:39Z)
- Conditional Generative Modeling via Learning the Latent Space [54.620761775441046]
We propose a novel framework for conditional generation in multimodal spaces.
It uses latent variables to model generalizable learning patterns.
At inference, the latent variables are optimized to find optimal solutions corresponding to multiple output modes.
arXiv Detail & Related papers (2020-10-07T03:11:34Z)
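A small sketch of the inference-time behavior described above: with a trained conditional generator, latent variables are optimized by gradient descent from several restarts, each of which can land in a different output mode. The generator and the scoring loss here are stand-ins:

```python
# Optimize the latent z (not the model weights) at inference time.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8 + 4, 32), nn.Tanh(), nn.Linear(32, 8))

def search(cond, restarts=3, steps=50):
    outs = []
    for _ in range(restarts):                        # each restart can reach
        z = torch.randn(1, 4, requires_grad=True)    # a different output mode
        opt = torch.optim.Adam([z], lr=0.05)
        for _ in range(steps):
            out = G(torch.cat([cond, z], dim=-1))
            loss = ((out - cond) ** 2).mean()        # stand-in task loss
            opt.zero_grad(); loss.backward(); opt.step()
        outs.append(G(torch.cat([cond, z], dim=-1)).detach())
    return outs

print(len(search(torch.randn(1, 8))))
```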