AriEL: volume coding for sentence generation
- URL: http://arxiv.org/abs/2003.13600v2
- Date: Tue, 21 Apr 2020 14:06:11 GMT
- Title: AriEL: volume coding for sentence generation
- Authors: Luca Celotti, Simon Brodeur, Jean Rouat
- Abstract summary: We improve on the performance of some of the standard deep learning methods for generating sentences by uniformly sampling a continuous space.
We do so by proposing AriEL, which constructs volumes in a continuous space without needing to encourage volume creation through the loss function.
Our results indicate that random access to the stored information is dramatically improved, and our method AriEL is able to generate a wider variety of correct language by randomly sampling the latent space.
- Score: 5.972927416266617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mapping sequences of discrete data to a point in a continuous space makes it
difficult to retrieve those sequences via random sampling. Mapping the input to
a volume would make it easier to retrieve at test time, and this is the strategy
followed by the family of approaches based on the Variational Autoencoder (VAE). However,
because these approaches optimize simultaneously for prediction and for
smoothness of representation, they are forced to trade off between the two. We
improve on the performance of some of the standard deep learning methods for
generating sentences by uniformly sampling a continuous space. We do so by
proposing AriEL, which constructs volumes in a continuous space without the
need to encourage the creation of volumes through the loss function. We first
benchmark on a toy grammar, which allows us to automatically evaluate the language
learned and generated by the models. Then, we benchmark on a real dataset of
human dialogues. Our results indicate that random access to the stored
information is dramatically improved, and our method AriEL is able to generate
a wider variety of correct language by randomly sampling the latent space. VAE
follows in performance on the toy dataset, while AE and Transformer follow on
the real dataset. This partially supports the hypothesis that encoding
information into volumes instead of into points can lead to improved retrieval
of learned information with random sampling. This can lead to better generators,
and we also discuss potential disadvantages.
Related papers
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder
for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z) - Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge
Selection [71.20871905457174]
Language models (LMs) have revolutionized the way we interact with information, but they often generate nonfactual text.
Previous methods use external knowledge as references for text generation to enhance factuality but often struggle with the knowledge mix-up of irrelevant references.
We present DKGen, which divides text generation into an iterative process.
arXiv Detail & Related papers (2023-08-30T02:22:40Z) - MomentDiff: Generative Video Moment Retrieval from Random to Real [71.40038773943638]
We provide a generative diffusion-based framework called MomentDiff.
MomentDiff simulates a typical human retrieval process from random browsing to gradual localization.
We show that MomentDiff consistently outperforms state-of-the-art methods on three public benchmarks.
arXiv Detail & Related papers (2023-07-06T09:12:13Z) - Improving the Robustness of Summarization Systems with Dual Augmentation [68.53139002203118]
A robust summarization system should be able to capture the gist of the document, regardless of the specific word choices or noise in the input.
We first explore the summarization models' robustness against perturbations including word-level synonym substitution and noise.
We propose a SummAttacker, which is an efficient approach to generating adversarial samples based on language models.
arXiv Detail & Related papers (2023-06-01T19:04:17Z) - Separating Grains from the Chaff: Using Data Filtering to Improve
Multilingual Translation for Low-Resourced African Languages [0.6947064688250465]
This work describes our approach, which is based on filtering the given noisy data using a sentence-pair classifier.
We empirically validate our approach by evaluating on two common datasets and show that data filtering generally improves overall translation quality.
arXiv Detail & Related papers (2022-10-19T16:12:27Z) - Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs [16.968490007064872]
Applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning.
We show theoretically that this removes pointwise mutual information provided by the decoder input, which is compensated for by utilizing the latent space.
Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence performance and the information captured in the latent space.
arXiv Detail & Related papers (2022-09-26T11:21:19Z) - Learning to Ask Conversational Questions by Optimizing Levenshtein
Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z) - Representation Learning for Sequence Data with Deep Autoencoding
Predictive Components [96.42805872177067]
We propose a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.
We encourage this latent structure by maximizing an estimate of predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step.
We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
arXiv Detail & Related papers (2020-10-07T03:34:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.