Dynamic Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2312.08976v2
- Date: Tue, 20 Feb 2024 13:44:13 GMT
- Title: Dynamic Retrieval-Augmented Generation
- Authors: Anton Shapkin, Denis Litvinov, Yaroslav Zharov, Egor Bogomolov, Timur Galimzyanov, Timofey Bryksin
- Abstract summary: We propose Dynamic Retrieval-Augmented Generation (DRAG), a novel entity-augmented approach.
DRAG injects compressed embeddings of the retrieved entities into the generative model.
Our approach achieves several targets: (1) lifting the length limitations of the context window and saving on prompt size; (2) allowing a huge expansion of the number of retrieval entities available for the context; (3) alleviating the problem of misspelling or failing to find relevant entity names.
- Score: 4.741884506444161
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current state-of-the-art large language models are effective in generating
high-quality text and encapsulating a broad spectrum of world knowledge. These
models, however, often hallucinate and lack locally relevant factual data.
Retrieval-augmented approaches were introduced to overcome these problems and
provide more accurate responses. Typically, the retrieved information is simply
appended to the main request, so the amount of retrieved content is limited by
the model's context window size.
We propose Dynamic Retrieval-Augmented Generation (DRAG), a novel approach
based on entity-augmented generation that injects compressed embeddings of the
retrieved entities into the generative model. The proposed
pipeline was developed for code-generation tasks, yet can be transferred to
some domains of natural language processing. To train the model, we collect and
publish a new project-level code generation dataset, which we use for
evaluation along with publicly available datasets. Our approach achieves
several targets: (1) lifting the length limitations of the context window and
saving on prompt size; (2) allowing a huge expansion of the number of
retrieval entities available for the context; (3) alleviating the problem of
misspelling or failing to find relevant entity names. This allows the model to
outperform all baselines (except GPT-3.5) by a strong margin.
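To make the entity-injection idea concrete, here is a minimal PyTorch sketch. The `EntityCompressor` module and the prepend-as-soft-tokens interface are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class EntityCompressor(nn.Module):
    """Hypothetical module: maps a retrieved entity's embedding to a
    single soft token in the decoder's embedding space."""
    def __init__(self, entity_dim: int, model_dim: int):
        super().__init__()
        self.proj = nn.Linear(entity_dim, model_dim)

    def forward(self, entity_embs: torch.Tensor) -> torch.Tensor:
        # (num_entities, entity_dim) -> (num_entities, model_dim)
        return self.proj(entity_embs)

def inject_entities(token_embs, entity_embs, compressor):
    """Prepend compressed entity embeddings to the prompt's token
    embeddings: each entity costs one soft-token slot instead of its
    full textual description, saving prompt space."""
    soft = compressor(entity_embs)                       # (k, d_model)
    soft = soft.unsqueeze(0).expand(token_embs.size(0), -1, -1)
    return torch.cat([soft, token_embs], dim=1)          # (batch, k + seq, d)

# Toy usage: 3 retrieved entities, a batch of 2 prompts of length 5.
compressor = EntityCompressor(entity_dim=256, model_dim=512)
entities = torch.randn(3, 256)
prompt_embs = torch.randn(2, 5, 512)
print(inject_entities(prompt_embs, entities, compressor).shape)
# torch.Size([2, 8, 512])
```

Because entities enter as fixed-size soft tokens rather than raw text, the number of candidate entities can grow far beyond what a concatenated prompt could hold.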
Related papers
- Cross-Domain Content Generation with Domain-Specific Small Language Models [3.2772349789781616]
This study explores methods to enable a small language model to produce coherent and relevant outputs for two different domains.
We find that utilizing custom tokenizers tailored to each dataset significantly enhances generation quality.
Our findings demonstrate that knowledge expansion with frozen layers is an effective method for small language models to generate domain-specific content.
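A minimal sketch of the per-dataset tokenizer step, using the Hugging Face `tokenizers` library as an assumed tool choice (the study's actual tooling is not specified here):

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

def train_domain_tokenizer(corpus_files, vocab_size=8000):
    """Train one BPE vocabulary per domain corpus so that frequent
    domain-specific terms become single tokens."""
    tok = Tokenizer(models.BPE(unk_token="[UNK]"))
    tok.pre_tokenizer = pre_tokenizers.Whitespace()
    trainer = trainers.BpeTrainer(vocab_size=vocab_size,
                                  special_tokens=["[UNK]", "[PAD]"])
    tok.train(corpus_files, trainer)
    return tok

# One tokenizer per domain (hypothetical corpus paths):
# stories_tok = train_domain_tokenizer(["stories.txt"])
# python_tok = train_domain_tokenizer(["python_code.txt"])
```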
arXiv Detail & Related papers (2024-09-19T21:45:13Z)
- Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts [83.57864140378035]
This paper proposes a method to cover longer contexts in Open-Domain Question-Answering tasks.
It leverages a small encoder language model that encodes the retrieved contexts, and the encodings are fused with the original inputs via cross-attention.
After fine-tuning, the method improves performance across two held-in datasets, four held-out datasets, and two in-context learning settings.
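A minimal PyTorch sketch of this encode-then-cross-attend pattern; the dimensions, pooling, and module choices are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

d_model = 64
# Small encoder that compresses each retrieved passage into one vector.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=2)
# Cross-attention that lets question tokens read the context vectors.
cross_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

passages = torch.randn(4, 32, d_model)   # 4 passages x 32 token embeddings
memory = encoder(passages).mean(dim=1)   # mean-pool: (4, d_model)
memory = memory.unsqueeze(0)             # (1, 4, d_model): 4 vectors total

question = torch.randn(1, 10, d_model)   # 10 question-token embeddings
fused, _ = cross_attn(query=question, key=memory, value=memory)
print(fused.shape)                       # torch.Size([1, 10, 64])
```

Since each passage enters as a handful of vectors rather than raw tokens, far more passages fit than would in a plain concatenated prompt.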
arXiv Detail & Related papers (2024-04-02T15:10:11Z)
- EIGEN: Expert-Informed Joint Learning Aggregation for High-Fidelity Information Extraction from Document Images [27.36816896426097]
Information Extraction from document images is challenging due to the high variability of layout formats.
We propose a novel approach, EIGEN, which combines rule-based methods with deep learning models using data programming approaches.
We empirically show that our EIGEN framework can significantly improve the performance of state-of-the-art deep models even when very few labeled data instances are available.
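A toy sketch of the underlying data-programming recipe; the labeling functions and the sign-vote aggregation are illustrative assumptions, not EIGEN's expert rules or learned aggregator.

```python
import re

# Rule-based labelers vote on whether a token is a "date" field:
# +1 = date, -1 = not a date, 0 = abstain.
def lf_iso_date(token: str) -> int:
    return 1 if re.fullmatch(r"\d{4}-\d{2}-\d{2}", token) else 0

def lf_currency(token: str) -> int:
    return -1 if re.fullmatch(r"\$\d+(\.\d{2})?", token) else 0

def lf_two_dashes(token: str) -> int:
    return 1 if token.count("-") == 2 else 0

LABELING_FUNCTIONS = [lf_iso_date, lf_currency, lf_two_dashes]

def weak_label(token: str) -> int:
    """Aggregate votes by sign (a crude stand-in for learned
    aggregation); the result weakly supervises a deep extractor."""
    votes = sum(lf(token) for lf in LABELING_FUNCTIONS)
    return 0 if votes == 0 else (1 if votes > 0 else -1)

for tok in ["2023-11-23", "$19.99", "invoice"]:
    print(tok, weak_label(tok))  # -> 1, -1, 0
```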
arXiv Detail & Related papers (2023-11-23T13:20:42Z)
- Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy [164.83371924650294]
We show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.
A model output shows what might be needed to finish a task, and thus provides an informative context for retrieving more relevant knowledge.
Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints.
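The synergy reduces to a simple loop; `generate` and `retrieve` below are hypothetical stand-ins for an LLM call and a retriever, not the paper's implementation.

```python
def generate(question: str, passages: list) -> str:
    """Placeholder LLM call: draft an answer from the retrieved passages."""
    return f"draft answer to '{question}' using {len(passages)} passages"

def retrieve(query: str, k: int = 5) -> list:
    """Placeholder retriever: return the top-k passages for `query`."""
    return [f"passage {i} about {query!r}" for i in range(k)]

def iter_retgen(question: str, iterations: int = 3) -> str:
    answer = ""
    for _ in range(iterations):
        # The previous draft enriches the query: the model's own output
        # hints at which knowledge is still missing.
        query = question if not answer else f"{question} {answer}"
        answer = generate(question, retrieve(query))
    return answer

print(iter_retgen("Who introduced retrieval-augmented generation?"))
```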
arXiv Detail & Related papers (2023-05-24T16:17:36Z)
- Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning [153.98100182439165]
We introduce Re-ViLM, a Retrieval-augmented Visual Language Model built upon Flamingo.
By storing certain knowledge explicitly in an external database, our approach reduces the number of model parameters.
We demonstrate that Re-ViLM significantly boosts performance for image-to-text generation tasks.
arXiv Detail & Related papers (2023-02-09T18:57:56Z)
- Automatic Context Pattern Generation for Entity Set Expansion [40.535332689515656]
We develop a module that automatically generates high-quality context patterns for entities.
We also propose the GAPA framework that leverages the aforementioned GenerAted PAtterns to expand target entities.
arXiv Detail & Related papers (2022-07-17T06:50:35Z)
- KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation [100.79870384880333]
We propose a knowledge-grounded pre-training (KGPT) to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness.
Under the zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG, while all other baselines fail.
arXiv Detail & Related papers (2020-10-05T19:59:05Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models and verifies the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
- Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
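A toy sketch of such type-probability vectors; the type inventory and logits below are invented for illustration:

```python
import numpy as np

TYPES = ["person", "politician", "scientist", "city", "company"]

def entity_vector(type_logits: np.ndarray) -> dict:
    """Sigmoid each logit independently: fine-grained types are not
    mutually exclusive, so this is multi-label rather than a softmax."""
    probs = 1.0 / (1.0 + np.exp(-type_logits))
    return {t: round(float(p), 3) for t, p in zip(TYPES, probs)}

# Logits as if produced by a trained typing model (made-up values).
print(entity_vector(np.array([4.0, 2.5, -3.0, -5.0, -4.5])))
# {'person': 0.982, 'politician': 0.924, 'scientist': 0.047,
#  'city': 0.007, 'company': 0.011}
```

Each coordinate is directly readable (e.g., "probability this entity is a politician"), which is what makes the representation interpretable.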
arXiv Detail & Related papers (2020-04-30T23:58:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.