Dynamic Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2312.08976v2
- Date: Tue, 20 Feb 2024 13:44:13 GMT
- Title: Dynamic Retrieval-Augmented Generation
- Authors: Anton Shapkin, Denis Litvinov, Yaroslav Zharov, Egor Bogomolov, Timur Galimzyanov, Timofey Bryksin
- Abstract summary: We propose Dynamic Retrieval-Augmented Generation (DRAG), a novel entity-augmented approach.
DRAG injects compressed embeddings of the retrieved entities into the generative model.
Our approach achieves several targets: (1) lifting the length limitations of the context window and saving on prompt size; (2) allowing a huge expansion of the number of retrieval entities available for the context; (3) alleviating the problem of misspelling or failing to find relevant entity names.
- Score: 4.741884506444161
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current state-of-the-art large language models are effective in generating
high-quality text and encapsulating a broad spectrum of world knowledge. These
models, however, often hallucinate and lack locally relevant factual data.
Retrieval-augmented approaches were introduced to overcome these problems and
provide more accurate responses. Typically, the retrieved information is simply
appended to the main request, so the amount of retrieved content is limited by
the model's context window size.
We propose Dynamic Retrieval-Augmented Generation (DRAG), a novel approach
based on entity-augmented generation that injects compressed embeddings of the
retrieved entities into the generative model. The proposed
pipeline was developed for code-generation tasks, yet can be transferred to
some domains of natural language processing. To train the model, we collect and
publish a new project-level code generation dataset, which we use for
evaluation along with publicly available datasets. Our approach achieves
several targets: (1) lifting the length limitations of the context window and
saving on prompt size; (2) allowing a huge expansion of the number of
retrieval entities available for the context; (3) alleviating the problem of
misspelling or failing to find relevant entity names. This allows the model to
outperform all baselines (except GPT-3.5) by a strong margin.
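To make the entity-injection idea concrete, here is a minimal PyTorch sketch. The `EntityCompressor` module and the prepend-as-soft-tokens interface are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class EntityCompressor(nn.Module):
    """Hypothetical module: maps a retrieved entity's embedding to a
    single soft token in the decoder's embedding space."""
    def __init__(self, entity_dim: int, model_dim: int):
        super().__init__()
        self.proj = nn.Linear(entity_dim, model_dim)

    def forward(self, entity_embs: torch.Tensor) -> torch.Tensor:
        # (num_entities, entity_dim) -> (num_entities, model_dim)
        return self.proj(entity_embs)

def inject_entities(token_embs, entity_embs, compressor):
    """Prepend compressed entity embeddings to the prompt's token
    embeddings: each entity costs one soft-token slot instead of its
    full textual description, saving prompt space."""
    soft = compressor(entity_embs)                       # (k, d_model)
    soft = soft.unsqueeze(0).expand(token_embs.size(0), -1, -1)
    return torch.cat([soft, token_embs], dim=1)          # (batch, k + seq, d)

# Toy usage: 3 retrieved entities, a batch of 2 prompts of length 5.
compressor = EntityCompressor(entity_dim=256, model_dim=512)
entities = torch.randn(3, 256)
prompt_embs = torch.randn(2, 5, 512)
print(inject_entities(prompt_embs, entities, compressor).shape)
# torch.Size([2, 8, 512])
```

Because entities enter as fixed-size soft tokens rather than raw text, the number of candidate entities can grow far beyond what a concatenated prompt could hold.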
Related papers
- Cross-Domain Content Generation with Domain-Specific Small Language Models [3.2772349789781616]
This study explores methods to enable a small language model to produce coherent and relevant outputs for two different domains.
We find that utilizing custom tokenizers tailored to each dataset significantly enhances generation quality.
Our findings demonstrate that knowledge expansion with frozen layers is an effective method for small language models to generate domain-specific content.
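A minimal sketch of the per-dataset tokenizer step, using the Hugging Face `tokenizers` library as an assumed tool choice (the study's actual tooling is not specified here):

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

def train_domain_tokenizer(corpus_files, vocab_size=8000):
    """Train one BPE vocabulary per domain corpus so that frequent
    domain-specific terms become single tokens."""
    tok = Tokenizer(models.BPE(unk_token="[UNK]"))
    tok.pre_tokenizer = pre_tokenizers.Whitespace()
    trainer = trainers.BpeTrainer(vocab_size=vocab_size,
                                  special_tokens=["[UNK]", "[PAD]"])
    tok.train(corpus_files, trainer)
    return tok

# One tokenizer per domain (hypothetical corpus paths):
# stories_tok = train_domain_tokenizer(["stories.txt"])
# python_tok = train_domain_tokenizer(["python_code.txt"])
```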
arXiv Detail & Related papers (2024-09-19T21:45:13Z)
- Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts [83.57864140378035]
This paper proposes a method to cover longer contexts in Open-Domain Question-Answering tasks.
It leverages a small encoder language model that encodes the retrieved contexts, and the encodings are fused with the original inputs via cross-attention.
After fine-tuning, the method improves performance across two held-in datasets, four held-out datasets, and two in-context learning settings.
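A minimal PyTorch sketch of this encode-then-cross-attend pattern; the dimensions, pooling, and module choices are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

d_model = 64
# Small encoder that compresses each retrieved passage into one vector.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=2)
# Cross-attention that lets question tokens read the context vectors.
cross_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

passages = torch.randn(4, 32, d_model)   # 4 passages x 32 token embeddings
memory = encoder(passages).mean(dim=1)   # mean-pool: (4, d_model)
memory = memory.unsqueeze(0)             # (1, 4, d_model): 4 vectors total

question = torch.randn(1, 10, d_model)   # 10 question-token embeddings
fused, _ = cross_attn(query=question, key=memory, value=memory)
print(fused.shape)                       # torch.Size([1, 10, 64])
```

Since each passage enters as a handful of vectors rather than raw tokens, far more passages fit than would in a plain concatenated prompt.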
arXiv Detail & Related papers (2024-04-02T15:10:11Z)
- EIGEN: Expert-Informed Joint Learning Aggregation for High-Fidelity Information Extraction from Document Images [27.36816896426097]
Information Extraction from document images is challenging due to the high variability of layout formats.
We propose a novel approach, EIGEN, which combines rule-based methods with deep learning models using data programming approaches.
We empirically show that our EIGEN framework can significantly improve the performance of state-of-the-art deep models even when very few labeled data instances are available.
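A toy sketch of the underlying data-programming recipe; the labeling functions and the sign-vote aggregation are illustrative assumptions, not EIGEN's expert rules or learned aggregator.

```python
import re

# Rule-based labelers vote on whether a token is a "date" field:
# +1 = date, -1 = not a date, 0 = abstain.
def lf_iso_date(token: str) -> int:
    return 1 if re.fullmatch(r"\d{4}-\d{2}-\d{2}", token) else 0

def lf_currency(token: str) -> int:
    return -1 if re.fullmatch(r"\$\d+(\.\d{2})?", token) else 0

def lf_two_dashes(token: str) -> int:
    return 1 if token.count("-") == 2 else 0

LABELING_FUNCTIONS = [lf_iso_date, lf_currency, lf_two_dashes]

def weak_label(token: str) -> int:
    """Aggregate votes by sign (a crude stand-in for learned
    aggregation); the result weakly supervises a deep extractor."""
    votes = sum(lf(token) for lf in LABELING_FUNCTIONS)
    return 0 if votes == 0 else (1 if votes > 0 else -1)

for tok in ["2023-11-23", "$19.99", "invoice"]:
    print(tok, weak_label(tok))  # -> 1, -1, 0
```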
arXiv Detail & Related papers (2023-11-23T13:20:42Z)
- Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy [164.83371924650294]
We show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.
A model output shows what might be needed to finish a task, and thus provides an informative context for retrieving more relevant knowledge.
Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints.
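The synergy reduces to a simple loop; `generate` and `retrieve` below are hypothetical stand-ins for an LLM call and a retriever, not the paper's implementation.

```python
def generate(question: str, passages: list) -> str:
    """Placeholder LLM call: draft an answer from the retrieved passages."""
    return f"draft answer to '{question}' using {len(passages)} passages"

def retrieve(query: str, k: int = 5) -> list:
    """Placeholder retriever: return the top-k passages for `query`."""
    return [f"passage {i} about {query!r}" for i in range(k)]

def iter_retgen(question: str, iterations: int = 3) -> str:
    answer = ""
    for _ in range(iterations):
        # The previous draft enriches the query: the model's own output
        # hints at which knowledge is still missing.
        query = question if not answer else f"{question} {answer}"
        answer = generate(question, retrieve(query))
    return answer

print(iter_retgen("Who introduced retrieval-augmented generation?"))
```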
arXiv Detail & Related papers (2023-05-24T16:17:36Z)
- Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning [153.98100182439165]
We introduce Re-ViLM, a Retrieval-augmented Visual Language Model built upon Flamingo.
By storing certain knowledge explicitly in an external database, our approach reduces the number of model parameters.
We demonstrate that Re-ViLM significantly boosts performance for image-to-text generation tasks.
arXiv Detail & Related papers (2023-02-09T18:57:56Z)
- Automatic Context Pattern Generation for Entity Set Expansion [40.535332689515656]
We develop a module that automatically generates high-quality context patterns for entities.
We also propose the GAPA framework that leverages the aforementioned GenerAted PAtterns to expand target entities.
arXiv Detail & Related papers (2022-07-17T06:50:35Z)
- KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation [100.79870384880333]
We propose a knowledge-grounded pre-training (KGPT) to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness.
Under the zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG, while all other baselines fail.
arXiv Detail & Related papers (2020-10-05T19:59:05Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models and verifies the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
- Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
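A toy sketch of such type-probability vectors; the type inventory and logits below are invented for illustration:

```python
import numpy as np

TYPES = ["person", "politician", "scientist", "city", "company"]

def entity_vector(type_logits: np.ndarray) -> dict:
    """Sigmoid each logit independently: fine-grained types are not
    mutually exclusive, so this is multi-label rather than a softmax."""
    probs = 1.0 / (1.0 + np.exp(-type_logits))
    return {t: round(float(p), 3) for t, p in zip(TYPES, probs)}

# Logits as if produced by a trained typing model (made-up values).
print(entity_vector(np.array([4.0, 2.5, -3.0, -5.0, -4.5])))
# {'person': 0.982, 'politician': 0.924, 'scientist': 0.047,
#  'city': 0.007, 'company': 0.011}
```

Each coordinate is directly readable (e.g., "probability this entity is a politician"), which is what makes the representation interpretable.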
arXiv Detail & Related papers (2020-04-30T23:58:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.