A Survey of Generative Information Retrieval
- URL: http://arxiv.org/abs/2406.01197v2
- Date: Tue, 4 Jun 2024 04:12:39 GMT
- Title: A Survey of Generative Information Retrieval
- Authors: Tzu-Lin Kuo, Tzu-Wei Chiu, Tzung-Sheng Lin, Sheng-Yang Wu, Chao-Wei Huang, Yun-Nung Chen
- Abstract summary: Generative Retrieval (GR) is an emerging paradigm in information retrieval that leverages generative models to map queries to relevant document identifiers (DocIDs) without the need for traditional query processing or document reranking.
This survey provides a comprehensive overview of GR, highlighting key developments, indexing and retrieval strategies, and challenges.
- Score: 25.1249210843116
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Generative Retrieval (GR) is an emerging paradigm in information retrieval that leverages generative models to directly map queries to relevant document identifiers (DocIDs) without the need for traditional query processing or document reranking. This survey provides a comprehensive overview of GR, highlighting key developments, indexing and retrieval strategies, and challenges. We discuss various document identifier strategies, including numerical and string-based identifiers, and explore different document representation methods. Our primary contribution lies in outlining future research directions that could profoundly impact the field: improving the quality of query generation, exploring learnable document identifiers, enhancing scalability, and integrating GR with multi-task learning frameworks. By examining state-of-the-art GR techniques and their applications, this survey aims to provide a foundational understanding of GR and inspire further innovations in this transformative approach to information retrieval. We also make the complementary materials such as paper collection publicly available at https://github.com/MiuLab/GenIR-Survey/
Related papers
- RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation [35.981443744108255]
We propose a novel RAG framework, namely RichRAG.
It includes a sub-aspect explorer to identify potential sub-aspects of input questions, a retriever to build a candidate pool of diverse external documents related to these sub-aspects, and a generative list-wise ranker.
Experimental results on two publicly available datasets prove that our framework effectively and efficiently provides comprehensive and satisfying responses to users.
arXiv Detail & Related papers (2024-06-18T12:52:51Z)
- A Survey of Generative Search and Recommendation in the Era of Large Language Models [125.26354486027408]
Generative search (retrieval) and recommendation aim to address the matching problem in a generative manner.
Superintelligent generative large language models have sparked a new paradigm in search and recommendation.
arXiv Detail & Related papers (2024-04-25T17:58:17Z)
- From Matching to Generation: A Survey on Generative Information Retrieval [21.56093567336119]
Generative information retrieval (GenIR) has emerged as a novel paradigm, gaining increasing attention in recent years.
This paper aims to systematically review the latest research progress in GenIR.
arXiv Detail & Related papers (2024-04-23T09:05:37Z)
- A Survey on Retrieval-Augmented Text Generation for Large Language Models [1.4579344926652844]
Retrieval-Augmented Generation (RAG) merges retrieval methods with deep learning advancements.
This paper organizes the RAG paradigm into four categories: pre-retrieval, retrieval, post-retrieval, and generation.
It outlines RAG's evolution and discusses the field's progression through the analysis of significant studies.
arXiv Detail & Related papers (2024-04-17T01:27:42Z)
- Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Information extraction aims to extract structural knowledge from plain natural language texts.
Generative Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
LLMs offer viable solutions for IE tasks based on a generative paradigm.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
- Retrieval-Augmented Generation for Large Language Models: A Survey [17.82361213043507]
Large Language Models (LLMs) showcase impressive capabilities but encounter challenges like hallucination.
Retrieval-Augmented Generation (RAG) has emerged as a promising solution by incorporating knowledge from external databases.
arXiv Detail & Related papers (2023-12-18T07:47:33Z)
- Evaluating Generative Ad Hoc Information Retrieval [58.800799175084286]
Generative retrieval systems often directly return a grounded generated text as a response to a query.
Quantifying the utility of the textual responses is essential for appropriately evaluating such generative ad hoc retrieval.
arXiv Detail & Related papers (2023-11-08T14:05:00Z)
- Generate rather than Retrieve: Large Language Models are Strong Context Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
arXiv Detail & Related papers (2022-09-21T01:30:59Z)
- Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration [75.47708732473586]
We propose a layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents.
LIE contains 62k annotations of three extraction tasks from 4,061 pages in product and official documents.
Empirical results show that layout is critical for VRD-based extraction, and system demonstration also verifies that the extracted knowledge can help locate the answers that users care about.
arXiv Detail & Related papers (2022-07-14T07:59:45Z)
- Retrieval-Enhanced Machine Learning [110.5237983180089]
We describe a generic retrieval-enhanced machine learning framework, which includes a number of existing models as special cases.
REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization.
The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
arXiv Detail & Related papers (2022-05-02T21:42:45Z)