Generative Dense Retrieval: Memory Can Be a Burden
- URL: http://arxiv.org/abs/2401.10487v1
- Date: Fri, 19 Jan 2024 04:24:07 GMT
- Title: Generative Dense Retrieval: Memory Can Be a Burden
- Authors: Peiwen Yuan, Xinglin Wang, Shaoxiong Feng, Boyuan Pan, Yiwei Li, Heda
Wang, Xupeng Miao, Kan Li
- Abstract summary: Generative Retrieval (GR) autoregressively decodes relevant document identifiers given a query.
Dense Retrieval (DR) is introduced to conduct fine-grained intra-cluster matching from clusters to relevant documents.
GDR obtains an average of 3.0 R@100 improvement on the NQ dataset under multiple settings.
- Score: 16.964086245755798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Retrieval (GR), autoregressively decoding relevant document
identifiers given a query, has been shown to perform well under the setting of
small-scale corpora. By memorizing the document corpus with model parameters,
GR implicitly achieves deep interaction between query and document. However,
such a memorizing mechanism faces three drawbacks: (1) Poor memory accuracy for
fine-grained features of documents; (2) Memory confusion gets worse as the
corpus size increases; (3) Huge memory update costs for new documents. To
alleviate these problems, we propose the Generative Dense Retrieval (GDR)
paradigm. Specifically, GDR first uses the limited memory volume to achieve
inter-cluster matching from query to relevant document clusters.
Memorizing-free matching mechanism from Dense Retrieval (DR) is then introduced
to conduct fine-grained intra-cluster matching from clusters to relevant
documents. The coarse-to-fine process maximizes the advantages of GR's deep
interaction and DR's scalability. Besides, we design a cluster identifier
constructing strategy to facilitate corpus memory and a cluster-adaptive
negative sampling strategy to enhance the intra-cluster mapping ability.
Empirical results show that GDR obtains an average of 3.0 R@100 improvement on
NQ dataset under multiple settings and has better scalability.
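The coarse-to-fine process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: GDR ranks clusters generatively by decoding cluster identifiers, whereas the sketch below substitutes a centroid dot product for that stage. The function names, the toy embeddings, and the recall definition (fraction of relevant documents found in the top k) are all assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def coarse_to_fine_retrieve(query, clusters, centroids, top_clusters=1, top_docs=2):
    """Two-stage retrieval over a clustered corpus.

    clusters:  dict mapping cluster id -> list of (doc_id, embedding)
    centroids: dict mapping cluster id -> centroid embedding
    """
    # Stage 1 (inter-cluster matching): pick the most relevant clusters.
    # ASSUMPTION: a centroid dot product stands in for GDR's generative
    # decoding of cluster identifiers.
    ranked = sorted(centroids, key=lambda cid: dot(query, centroids[cid]), reverse=True)
    selected = ranked[:top_clusters]

    # Stage 2 (intra-cluster matching): dense scoring of the documents
    # inside the selected clusters only.
    candidates = [
        (dot(query, emb), doc_id)
        for cid in selected
        for doc_id, emb in clusters[cid]
    ]
    candidates.sort(reverse=True)
    return [doc_id for _, doc_id in candidates[:top_docs]]


def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents that appear in the top-k results."""
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


# Toy corpus with two clusters (hypothetical data).
clusters = {
    "A": [("a1", [1.0, 0.0]), ("a2", [0.9, 0.1])],
    "B": [("b1", [0.0, 1.0]), ("b2", [0.1, 0.9])],
}
centroids = {"A": [0.95, 0.05], "B": [0.05, 0.95]}

results = coarse_to_fine_retrieve([1.0, 0.2], clusters, centroids)
print(results)
print(recall_at_k(results, ["a1"], k=1))
```

Because the second stage only scores documents inside the selected clusters, adding a new document in this sketch means appending it to a cluster rather than retraining anything, which mirrors the scalability argument the abstract makes for the dense stage.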
Related papers
- Information-Theoretic Generative Clustering of Documents [24.56214029342293]
We present generative clustering (GC) for clustering a set of documents, $\mathrm{X}$.
Because large language models (LLMs) provide probability distributions, the similarity between two documents can be rigorously defined.
We show GC achieves the state-of-the-art performance, outperforming any previous clustering method often by a large margin.
arXiv Detail & Related papers (2024-12-18T06:21:21Z)
- Generative Retrieval Meets Multi-Graded Relevance [104.75244721442756]
We introduce a framework called GRaded Generative Retrieval (GR$^2$).
GR$^2$ focuses on two key components: ensuring relevant and distinct identifiers, and implementing multi-graded constrained contrastive training.
Experiments on datasets with both multi-graded and binary relevance demonstrate the effectiveness of GR$^2$.
arXiv Detail & Related papers (2024-09-27T02:55:53Z)
- MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery [24.38640001674072]
Retrieval-Augmented Generation (RAG) leverages retrieval tools to access external databases.
Existing RAG systems are primarily effective for straightforward question-answering tasks.
We propose MemoRAG, a novel retrieval-augmented generation paradigm empowered by long-term memory.
arXiv Detail & Related papers (2024-09-09T13:20:31Z)
- ABCDE: Application-Based Cluster Diff Evals [49.1574468325115]
It aims to be practical: it allows items to have associated importance values that are application-specific, it is frugal in its use of human judgements when determining which clustering is better, and it can report metrics for arbitrary slices of items.
The approach to measuring the delta in the clustering quality is novel: instead of trying to construct an expensive ground truth up front and evaluating each clustering with respect to that, ABCDE samples questions for judgement on the basis of the actual diffs between the clusterings.
arXiv Detail & Related papers (2024-07-31T08:29:35Z)
- Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection [28.15184715270483]
Large language models (LLMs) augmented with retrieval exhibit robust performance and extensive versatility.
We propose a novel paradigm named Sparse RAG, which seeks to cut costs through sparsity.
Sparse RAG encodes retrieved documents in parallel, which eliminates latency introduced by long-range attention of retrieved documents.
arXiv Detail & Related papers (2024-05-25T11:10:04Z)
- Corrective Retrieval Augmented Generation [36.04062963574603]
Retrieval-augmented generation (RAG) relies heavily on relevance of retrieved documents, raising concerns about how the model behaves if retrieval goes wrong.
We propose the Corrective Retrieval Augmented Generation (CRAG) to improve the robustness of generation.
CRAG is plug-and-play and can be seamlessly coupled with various RAG-based approaches.
arXiv Detail & Related papers (2024-01-29T04:36:39Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory [72.36736686941671]
We propose a novel framework, selfmem, for improving retrieval-augmented generation models.
Selfmem iteratively employs a retrieval-augmented generator to create an unbounded memory pool and uses a memory selector to choose one output as memory for the subsequent generation round.
We evaluate the effectiveness of selfmem on three distinct text generation tasks.
arXiv Detail & Related papers (2023-05-03T21:40:54Z)
- Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval [25.402767809863946]
Inverted file structure is a common technique for accelerating dense retrieval.
In this work, we present the Hybrid Inverted Index (HI$^2$), where the embedding clusters and salient terms work to accelerate dense retrieval.
arXiv Detail & Related papers (2022-10-11T15:12:41Z)
- Hierarchical Memory Learning for Fine-Grained Scene Graph Generation [49.39355372599507]
This paper proposes a novel Hierarchical Memory Learning (HML) framework to learn the model from simple to complex.
After the autonomous partition of coarse and fine predicates, the model is first trained on the coarse predicates and then learns the fine predicates.
arXiv Detail & Related papers (2022-03-14T08:01:14Z)
- Dual Cluster Contrastive learning for Person Re-Identification [78.42770787790532]
We formulate a unified cluster contrastive framework, named Dual Cluster Contrastive learning (DCC).
DCC maintains two types of memory banks: individual and centroid cluster memory banks.
It can be easily applied for unsupervised or supervised person ReID.
arXiv Detail & Related papers (2021-12-09T02:43:25Z)