RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling
- URL: http://arxiv.org/abs/2310.10567v2
- Date: Mon, 23 Oct 2023 12:16:44 GMT
- Title: RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling
- Authors: Jingcheng Deng, Liang Pang, Huawei Shen, Xueqi Cheng
- Abstract summary: We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
- Score: 79.56442336234221
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-augmented language models show promise in addressing issues like
outdated information and hallucinations in language models (LMs). However,
current research faces two main problems: 1) determining what information to
retrieve, and 2) effectively combining retrieved information during generation.
We argue that valuable retrieved information should not only be related to the
current source text but also consider the future target text, given the nature
of LMs that model future tokens. Moreover, we propose that aggregation using
latent variables derived from a compact latent space is more efficient than
utilizing explicit raw text, which is limited by context length and susceptible
to noise. Therefore, we introduce RegaVAE, a retrieval-augmented language model
built upon the variational auto-encoder (VAE). It encodes the text corpus into
a latent space, capturing current and future information from both source and
target text. Additionally, we leverage the VAE to initialize the latent space
and adopt the probabilistic form of the retrieval generation paradigm by
expanding the Gaussian prior distribution into a Gaussian mixture distribution.
Theoretical analysis provides an optimizable upper bound for RegaVAE.
Experimental results on various datasets demonstrate significant improvements
in text generation quality and hallucination removal.
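As a rough illustration of the two ideas in the abstract (aggregating retrieved information through latent variables, and expanding the single Gaussian prior into a mixture built from retrieved posteriors), the following PyTorch sketch is a hypothetical toy, not the authors' implementation; the function names, the top-k cosine retrieval, and the softmax mixture weights are all assumptions made for illustration.

```python
import math
import torch
import torch.nn.functional as F


def retrieve_latents(query_mu, doc_mus, doc_logvars, k=4):
    """Return the k corpus posteriors closest to the query in latent space (assumed retrieval rule)."""
    sims = F.cosine_similarity(query_mu.unsqueeze(0), doc_mus, dim=-1)   # [N] similarity to each corpus latent
    weights, idx = sims.topk(k)
    pi = torch.softmax(weights, dim=-1)                                  # mixture weights from similarities
    return pi, doc_mus[idx], doc_logvars[idx]


def mixture_log_prob(z, pi, mus, logvars):
    """log p(z) under the retrieval-induced Gaussian mixture prior."""
    var = logvars.exp()
    log_comp = -0.5 * (((z - mus) ** 2) / var + logvars + math.log(2 * math.pi)).sum(-1)  # [k]
    return torch.logsumexp(torch.log(pi) + log_comp, dim=-1)


# Toy setup: 1000 corpus entries already encoded into 64-dim Gaussian posteriors,
# plus the posterior q(z|x) for the current source text.
doc_mus, doc_logvars = torch.randn(1000, 64), torch.zeros(1000, 64)
q_mu, q_logvar = torch.randn(64), torch.zeros(64)

pi, mus, logvars = retrieve_latents(q_mu, doc_mus, doc_logvars, k=4)
z = q_mu + torch.randn_like(q_mu) * (0.5 * q_logvar).exp()              # reparameterized sample from q(z|x)

# Single-sample estimate of KL[q(z|x) || p(z)] with the mixture prior replacing the usual N(0, I);
# a decoder would condition on z (and the retrieved latents) to generate the target text.
log_q = -0.5 * (((z - q_mu) ** 2) / q_logvar.exp() + q_logvar + math.log(2 * math.pi)).sum()
kl_estimate = log_q - mixture_log_prob(z, pi, mus, logvars)
print(float(kl_estimate))
```

Per the abstract, the actual model encodes both source and target text into the retrieval latent space and optimizes the resulting objective through the upper bound mentioned above; the sketch only shows the generic mixture-prior mechanics.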
Related papers
- QAEA-DR: A Unified Text Augmentation Framework for Dense Retrieval [12.225881591629815]
In dense retrieval, embedding long texts into dense vectors can result in information loss, leading to inaccurate query-text matching.
Recent studies mainly focus on improving the sentence embedding model or retrieval process.
We introduce a novel text augmentation framework for dense retrieval, which transforms raw documents into information-dense text formats.
arXiv Detail & Related papers (2024-07-29T17:39:08Z)
- Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge Selection [71.20871905457174]
Language models (LMs) have revolutionized the way we interact with information, but they often generate nonfactual text.
Previous methods use external knowledge as references for text generation to enhance factuality, but often struggle when irrelevant references are mixed in.
We present DKGen, which divides text generation into an iterative process.
arXiv Detail & Related papers (2023-08-30T02:22:40Z)
- Towards Computationally Verifiable Semantic Grounding for Language Models [18.887697890538455]
The paper conceptualizes the LM as a conditional model generating text given a desired semantic message formalized as a set of entity-relationship triples.
It embeds the LM in an auto-encoder by feeding its output to a semantic parser whose output is in the same representation domain as the input message.
We show that our proposed approaches significantly improve on the greedy search baseline.
arXiv Detail & Related papers (2022-11-16T17:35:52Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in Rouge F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- Discrete Auto-regressive Variational Attention Models for Text Modeling [53.38382932162732]
Variational autoencoders (VAEs) have been widely applied for text modeling.
However, they suffer from two challenges: information underrepresentation and posterior collapse.
We propose Discrete Auto-regressive Variational Attention Model (DAVAM) to address the challenges.
arXiv Detail & Related papers (2021-06-16T06:36:26Z)
- Topical Language Generation using Transformers [4.795530213347874]
This paper presents a novel approach for Topical Language Generation (TLG) by combining a pre-trained LM with topic modeling information.
We extend our model by introducing new parameters and functions to influence the quantity of the topical features presented in the generated text.
Our experimental results demonstrate that our model outperforms the state-of-the-art results on coherency, diversity, and fluency while being faster in decoding.
arXiv Detail & Related papers (2021-03-11T03:45:24Z)
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
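(A hypothetical sketch of OT-based sequence matching appears after this list.)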
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
- AriEL: volume coding for sentence generation [5.972927416266617]
We improve on the performance of standard deep learning methods for generating sentences by uniformly sampling a continuous space.
We do so by proposing AriEL, which constructs volumes in a continuous space without needing to encourage volume creation through the loss function.
Our results indicate that random access to the stored information is dramatically improved, and that AriEL generates a wider variety of correct language when the latent space is randomly sampled.
arXiv Detail & Related papers (2020-03-30T16:30:47Z)
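The Student-Forcing Optimal Transport entry above describes matching sequences from training-mode and free-running decoding with optimal transport. The snippet below is a generic, hypothetical illustration of one common way to compute such a matching (an entropy-regularized Sinkhorn cost between two embedding sequences); it is not taken from that paper, and the embedding dimensions and uniform token weights are assumptions.

```python
import math
import torch


def sinkhorn_cost(x, y, eps=0.1, n_iters=100):
    """Entropy-regularized OT cost between two embedding sequences.

    x: [m, d] embeddings of the freely generated sequence;
    y: [n, d] embeddings of the reference (teacher-forced) sequence.
    """
    cost = torch.cdist(x, y, p=2)                       # [m, n] pairwise distances
    m, n = cost.shape
    log_mu = torch.full((m,), -math.log(m))             # uniform mass on generated tokens
    log_nu = torch.full((n,), -math.log(n))             # uniform mass on reference tokens
    M = -cost / eps                                     # log-domain kernel
    log_u = torch.zeros(m)
    log_v = torch.zeros(n)
    for _ in range(n_iters):                            # log-domain Sinkhorn iterations
        log_u = log_mu - torch.logsumexp(M + log_v[None, :], dim=1)
        log_v = log_nu - torch.logsumexp(M + log_u[:, None], dim=0)
    plan = torch.exp(log_u[:, None] + M + log_v[None, :])  # [m, n] transport plan
    return (plan * cost).sum()                          # differentiable OT cost

# Toy usage: random 32-dim "token embeddings" for a 10-token generation and a
# 12-token reference; during training this cost would be added to the usual MLE loss.
generated = torch.randn(10, 32, requires_grad=True)
reference = torch.randn(12, 32)
loss = sinkhorn_cost(generated, reference)
loss.backward()
print(float(loss))
```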