Deep Latent-Variable Models for Text Generation
- URL: http://arxiv.org/abs/2203.02055v1
- Date: Thu, 3 Mar 2022 23:06:39 GMT
- Title: Deep Latent-Variable Models for Text Generation
- Authors: Xiaoyu Shen
- Abstract summary: Deep neural network-based end-to-end architectures have been widely adopted.
The end-to-end approach conflates all sub-modules, which used to be designed with complex handcrafted rules, into a holistic encoder-decoder architecture.
This dissertation presents how deep latent-variable models can improve over the standard encoder-decoder model for text generation.
- Score: 7.119436003155924
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text generation aims to produce human-like natural language output for
downstream tasks. It covers a wide range of applications such as machine
translation, document summarization, and dialogue generation. Recently, deep
neural network-based end-to-end architectures have been widely adopted. The
end-to-end approach conflates all sub-modules, which used to be designed with
complex handcrafted rules, into a holistic encoder-decoder architecture. Given
enough training data, it can achieve state-of-the-art performance while
avoiding the need for language- or domain-dependent knowledge. Nonetheless, deep
learning models are known to be extremely data-hungry, and text generated from
them usually suffers from low diversity, interpretability, and controllability.
As a result, it is difficult to trust the output from them in real-life
applications. Deep latent-variable models, by specifying the probabilistic
distribution over an intermediate latent process, provide a potential way of
addressing these problems while maintaining the expressive power of deep neural
networks. This dissertation presents how deep latent-variable models can
improve over the standard encoder-decoder model for text generation.
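To make the latent-variable idea concrete: such models typically place a latent code z between the encoder and the decoder and train by maximizing an evidence lower bound (ELBO) that trades a reconstruction term against a KL term. The sketch below is a minimal, hypothetical PyTorch illustration of this general recipe; the class, sizes, and hyperparameters are our own assumptions and are not taken from the dissertation.

```python
# Minimal, hypothetical sketch of a latent-variable sequence-to-sequence generator
# (VAE-style). All names and sizes are illustrative, not taken from the dissertation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentSeq2Seq(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, z_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Approximate posterior q(z | x): mean and log-variance from the encoder summary.
        self.to_mu = nn.Linear(hid_dim, z_dim)
        self.to_logvar = nn.Linear(hid_dim, z_dim)
        # The decoder is conditioned on z through its initial hidden state.
        self.z_to_hidden = nn.Linear(z_dim, hid_dim)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src, tgt_in, tgt_out):
        # Encode the source and parameterize q(z | x).
        _, h = self.encoder(self.embed(src))
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        # Reparameterization trick: z = mu + sigma * eps.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        # Decode with teacher forcing, conditioned on z.
        h0 = torch.tanh(self.z_to_hidden(z)).unsqueeze(0)
        dec, _ = self.decoder(self.embed(tgt_in), h0)
        logits = self.out(dec)
        # Negative ELBO = reconstruction loss + KL(q(z|x) || N(0, I)).
        rec = F.cross_entropy(logits.reshape(-1, logits.size(-1)), tgt_out.reshape(-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kl
```

Sampling z from the prior at generation time is what gives such models a handle on diversity and controllability beyond the plain encoder-decoder.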
Related papers
- Vector-Quantized Prompt Learning for Paraphrase Generation [18.40940464497253]
This paper proposes to generate diverse and high-quality paraphrases by exploiting pre-trained models with instance-dependent prompts.
Extensive experiments demonstrate that the proposed method achieves new state-of-the-art results on three benchmark datasets.
arXiv Detail & Related papers (2023-11-25T07:13:06Z) - Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents [111.15288256221764]
The Grounded Decoding project aims to solve complex, long-horizon tasks in a robotic setting by leveraging the knowledge of both a language model and grounded models.
We frame this as a problem similar to probabilistic filtering: decode a sequence that both has high probability under the language model and high probability under a set of grounded model objectives.
We demonstrate how such grounded models can be obtained across three simulation and real-world domains, and that the proposed decoding strategy is able to solve such tasks in each of them.
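The probabilistic-filtering framing can be pictured as combining per-token scores from the language model and the grounded objectives at each decoding step. The following is a generic, hypothetical sketch of that idea (greedy search over summed log-scores); the function names and the exact combination rule are assumptions, not details from the paper.

```python
# Generic, hypothetical sketch of decoding under a language model plus grounded
# objectives (greedy search over summed log-scores). Function names are assumptions.
from typing import Callable, Dict, List

def grounded_greedy_decode(
    lm_logprobs: Callable[[List[str]], Dict[str, float]],      # prefix -> {token: log p_LM}
    grounded_scores: List[Callable[[List[str], str], float]],  # each: (prefix, token) -> log score
    max_len: int = 20,
    eos: str = "<eos>",
) -> List[str]:
    prefix: List[str] = []
    for _ in range(max_len):
        candidates = lm_logprobs(prefix)
        # Prefer tokens that are likely under the LM *and* under every grounded
        # objective; adding log-scores corresponds to multiplying probabilities.
        best = max(candidates,
                   key=lambda tok: candidates[tok] + sum(g(prefix, tok) for g in grounded_scores))
        prefix.append(best)
        if best == eos:
            break
    return prefix
```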
arXiv Detail & Related papers (2023-03-01T22:58:50Z) - An Overview on Controllable Text Generation via Variational Auto-Encoders [15.97186478109836]
Recent advances in neural-based generative modeling have reignited the hopes of having computer systems capable of conversing with humans.
Latent variable models (LVMs) such as variational auto-encoders (VAEs) are designed to characterize the distributional pattern of textual data.
This overview gives an introduction to existing generation schemes and problems associated with text variational auto-encoders, and reviews several applications of controllable generation.
arXiv Detail & Related papers (2022-11-15T07:36:11Z) - Comparing Computational Architectures for Automated Journalism [0.0]
This study compares the most commonly employed methods for generating Brazilian Portuguese texts from structured data.
Results suggest that explicit intermediate steps in the generation process produce better texts than the ones generated by neural end-to-end architectures.
arXiv Detail & Related papers (2022-10-08T21:20:52Z) - The Whole Truth and Nothing But the Truth: Faithful and Controllable Dialogue Response Generation with Dataflow Transduction and Constrained Decoding [65.34601470417967]
We describe a hybrid architecture for dialogue response generation that combines the strengths of neural language modeling and rule-based generation.
Our experiments show that this system outperforms both rule-based and learned approaches in human evaluations of fluency, relevance, and truthfulness.
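Constrained decoding of this kind is commonly realized by masking out next-token candidates that the rule-based component disallows before the model commits to a token. The snippet below is a generic illustration of that mechanism, not the paper's actual dataflow-transduction system; allowed_next is a hypothetical stand-in for the rule-based check.

```python
# Generic illustration of constrained decoding: mask language-model logits with a
# rule-based validity check before picking the next token. allowed_next is a
# hypothetical stand-in for the rule-based component.
from typing import Callable, List
import torch

def constrained_step(logits: torch.Tensor, vocab: List[str],
                     allowed_next: Callable[[str], bool]) -> str:
    # logits: (vocab_size,) next-token scores from the language model.
    mask = torch.tensor([0.0 if allowed_next(tok) else float("-inf") for tok in vocab])
    return vocab[int(torch.argmax(logits + mask))]

# Example: the rule-based component only permits a yes/no answer at this point.
# next_token = constrained_step(logits, vocab, lambda t: t in {"yes", "no", "<eos>"})
```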
arXiv Detail & Related papers (2022-09-16T09:00:49Z) - Twist Decoding: Diverse Generators Guide Each Other [116.20780037268801]
We introduce Twist decoding, a simple and general inference algorithm that generates text while benefiting from diverse models.
Our method does not assume the vocabulary, tokenization or even generation order is shared.
arXiv Detail & Related papers (2022-05-19T01:27:53Z) - Distributionally Robust Recurrent Decoders with Random Network Distillation [93.10261573696788]
We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to disregard OOD context during inference.
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
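Random Network Distillation scores an input by how poorly a trained predictor matches a fixed, randomly initialized target network; a large prediction error signals out-of-distribution input. The sketch below illustrates that generic mechanism over context features; the module names and sizes are assumptions, and how the score is used to down-weight context inside the GRU decoder is left abstract.

```python
# Hypothetical sketch of Random Network Distillation (RND) as an OOD score over
# context features. Module names and sizes are assumptions; how the score gates the
# decoder's use of context is left abstract.
import torch
import torch.nn as nn

class RNDScorer(nn.Module):
    def __init__(self, feat_dim=256, hid_dim=128):
        super().__init__()
        # Fixed, randomly initialized target network (never trained).
        self.target = nn.Sequential(
            nn.Linear(feat_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, hid_dim))
        for p in self.target.parameters():
            p.requires_grad_(False)
        # Predictor network, trained on in-distribution data to match the target.
        self.predictor = nn.Sequential(
            nn.Linear(feat_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, hid_dim))

    def ood_score(self, features: torch.Tensor) -> torch.Tensor:
        # Large prediction error => the context looks out-of-distribution, so the
        # decoder can down-weight or disregard it.
        with torch.no_grad():
            target_out = self.target(features)
        return ((self.predictor(features) - target_out) ** 2).mean(dim=-1)

    def training_loss(self, features: torch.Tensor) -> torch.Tensor:
        # Minimized only on in-distribution context features.
        return self.ood_score(features).mean()
```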
arXiv Detail & Related papers (2021-10-25T19:26:29Z) - SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and can easily be adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z) - Neural Language Generation: Formulation, Methods, and Evaluation [13.62873478165553]
Recent advances in neural network-based generative modeling have reignited hopes of having computer systems capable of seamlessly conversing with humans.
High-capacity deep learning models trained on large-scale datasets demonstrate unparalleled abilities to learn patterns in the data, even in the absence of explicit supervision signals.
However, there is no standard way to assess the quality of text produced by these generative models, which constitutes a serious bottleneck for progress in the field.
arXiv Detail & Related papers (2020-07-31T00:08:28Z) - Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs with a strong auto-regressive decoder tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
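A common way to realize a discrete latent bottleneck is vector quantization in the VQ-VAE style, where continuous encoder outputs are snapped to the nearest entry of a learned codebook. The sketch below shows that generic mechanism for orientation only; it is not claimed to be the paper's exact formulation.

```python
# Generic vector-quantization bottleneck (VQ-VAE style), shown only to illustrate
# what a discrete latent bottleneck looks like; not the paper's exact method.
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=512, code_dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.beta = beta

    def forward(self, z_e: torch.Tensor):
        # z_e: (batch, code_dim) continuous encoder output.
        dists = torch.cdist(z_e, self.codebook.weight)  # distances to every code
        idx = dists.argmin(dim=-1)                      # nearest-code index (discrete latent)
        z_q = self.codebook(idx)
        # Codebook and commitment losses, plus a straight-through estimator so
        # gradients flow back to the encoder through the quantization step.
        loss = ((z_q - z_e.detach()) ** 2).mean() + self.beta * ((z_e - z_q.detach()) ** 2).mean()
        z_q = z_e + (z_q - z_e).detach()
        return z_q, idx, loss
```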