Synergistic Interplay between Search and Large Language Models for
Information Retrieval
- URL: http://arxiv.org/abs/2305.07402v3
- Date: Tue, 12 Dec 2023 14:04:34 GMT
- Title: Synergistic Interplay between Search and Large Language Models for
Information Retrieval
- Authors: Jiazhan Feng, Chongyang Tao, Xiubo Geng, Tao Shen, Can Xu, Guodong
Long, Dongyan Zhao, Daxin Jiang
- Abstract summary: InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
- Score: 141.18083677333848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Information retrieval (IR) plays a crucial role in locating relevant
resources from vast amounts of data, and its applications have evolved from
traditional knowledge bases to modern retrieval models (RMs). The emergence of
large language models (LLMs) has further revolutionized the IR field by
enabling users to interact with search systems in natural languages. In this
paper, we explore the advantages and disadvantages of LLMs and RMs,
highlighting their respective strengths in understanding user-issued queries
and retrieving up-to-date information. To leverage the benefits of both
paradigms while circumventing their limitations, we propose InteR, a novel
framework that facilitates information refinement through synergy between RMs
and LLMs. InteR allows RMs to expand knowledge in queries using LLM-generated
knowledge collections and enables LLMs to enhance prompt formulation using
retrieved documents. This iterative refinement process augments the inputs of
RMs and LLMs, leading to more accurate retrieval. Experiments on large-scale
retrieval benchmarks involving web search and low-resource retrieval tasks
demonstrate that InteR achieves overall superior zero-shot retrieval
performance compared to state-of-the-art methods, even those using relevance
judgment. Source code is available at https://github.com/Cyril-JZ/InteR
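The abstract describes an iterative loop in which the RM expands queries with LLM-generated knowledge and the LLM enhances its prompt with retrieved documents. A minimal sketch of that loop, with `retrieve` and `llm_generate` as hypothetical placeholders (the real implementation is at the repository above):

```python
# Sketch of the InteR-style iterative refinement loop described in the
# abstract. `retrieve` and `llm_generate` are toy stand-ins for a
# retrieval model (RM) and a large language model (LLM).

def retrieve(query: str, k: int = 3) -> list[str]:
    # Placeholder RM: returns up to k documents matching the query.
    corpus = {
        "inter": "InteR refines queries and prompts iteratively.",
        "llm": "LLMs generate knowledge passages from prompts.",
        "rank": "Retrieval models rank documents against a query.",
    }
    hits = [doc for key, doc in corpus.items() if key in query.lower()]
    return hits[:k] or list(corpus.values())[:k]

def llm_generate(prompt: str) -> str:
    # Placeholder LLM: returns a pseudo knowledge passage.
    return f"Generated knowledge for: {prompt[:40]}"

def inter_loop(query: str, rounds: int = 2) -> list[str]:
    docs: list[str] = []
    knowledge = ""
    for _ in range(rounds):
        # RM step: expand the query with LLM-generated knowledge.
        expanded_query = f"{query} {knowledge}".strip()
        docs = retrieve(expanded_query)
        # LLM step: enhance the prompt with retrieved documents.
        prompt = f"Query: {query}\nContext: {' '.join(docs)}"
        knowledge = llm_generate(prompt)
    return docs  # final retrieval result after iterative refinement
```

Each round augments the input of the other component, which is the synergy the paper's experiments evaluate.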
Related papers
- DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router [57.28685457991806]
DeepSieve is an agentic RAG framework that incorporates information sieving via LLM-as-a-knowledge-router.
Our design emphasizes modularity, transparency, and adaptability, leveraging recent advances in agentic system design.
arXiv Detail & Related papers (2025-07-29T17:55:23Z)
- Context-Aware Scientific Knowledge Extraction on Linked Open Data using Large Language Models [0.0]
This paper introduces WISE (Workflow for Intelligent Scientific Knowledge Extraction), a system to extract, refine, and rank query-specific knowledge.
WISE delivers detailed, organized answers by systematically exploring and synthesizing knowledge from diverse sources.
arXiv Detail & Related papers (2025-06-21T04:22:34Z)
- Large Language Models are Good Relational Learners [55.40941576497973]
We introduce Rel-LLM, a novel architecture that utilizes a graph neural network (GNN)-based encoder to generate structured relational prompts for large language models (LLMs).
Unlike traditional text-based serialization approaches, our method preserves the inherent relational structure of databases while enabling LLMs to process and reason over complex entity relationships.
arXiv Detail & Related papers (2025-06-06T04:07:55Z)
- KnowTrace: Bootstrapping Iterative Retrieval-Augmented Generation with Structured Knowledge Tracing [64.38243807002878]
We present KnowTrace, an elegant RAG framework to mitigate the context overload in large language models.
KnowTrace autonomously traces out desired knowledge triplets to organize a specific knowledge graph relevant to the input question.
It consistently surpasses existing methods across three multi-hop question answering benchmarks.
arXiv Detail & Related papers (2025-05-26T17:22:20Z)
- Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers [74.17516978246152]
Large language models (LLMs) have been widely integrated into information retrieval to advance traditional techniques.
We propose EXSEARCH, an agentic search framework, where the LLM learns to retrieve useful information as the reasoning unfolds.
Experiments on four knowledge-intensive benchmarks show that EXSEARCH substantially outperforms baselines.
arXiv Detail & Related papers (2025-05-26T15:27:55Z)
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning [87.30285670315334]
R1-Searcher is a novel two-stage outcome-based RL approach designed to enhance the search capabilities of Large Language Models.
Our framework relies exclusively on RL, without requiring process rewards or distillation for a cold start.
Our experiments demonstrate that our method significantly outperforms previous strong RAG methods, even when compared to the closed-source GPT-4o-mini.
arXiv Detail & Related papers (2025-03-07T17:14:44Z)
- Invar-RAG: Invariant LLM-aligned Retrieval for Better Generation [43.630437906898635]
We propose a novel two-stage fine-tuning architecture called Invar-RAG.
In the retrieval stage, an LLM-based retriever is constructed by integrating LoRA-based representation learning.
In the generation stage, a refined fine-tuning method is employed to improve LLM accuracy in generating answers based on retrieved information.
arXiv Detail & Related papers (2024-11-11T14:25:37Z)
- MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs [78.5013630951288]
This paper introduces techniques for advancing information retrieval with multimodal large language models (MLLMs).
We first study fine-tuning an MLLM as a bi-encoder retriever on 10 datasets with 16 retrieval tasks.
We propose modality-aware hard negative mining to mitigate the modality bias exhibited by MLLM retrievers.
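The summary above names modality-aware hard negative mining as the technique for countering modality bias. A minimal sketch of what such mining could look like, assuming a candidate pool already scored by the retriever (the names, cap value, and scoring are illustrative, not the MM-Embed implementation): the idea is to cap how many hard negatives any single modality contributes, so negatives do not all come from the modality the retriever is already biased toward.

```python
# Toy modality-aware hard negative mining: pick the highest-scoring
# non-relevant candidates, but cap each modality's contribution.

from collections import Counter

def mine_hard_negatives(candidates, num_negatives=4, per_modality_cap=2):
    """candidates: list of (doc_id, modality, score) tuples, where score
    is the retriever's relevance score for a non-relevant document."""
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    taken = Counter()
    negatives = []
    for doc_id, modality, _ in ranked:
        if taken[modality] >= per_modality_cap:
            continue  # this modality already contributed enough negatives
        negatives.append(doc_id)
        taken[modality] += 1
        if len(negatives) == num_negatives:
            break
    return negatives
```

With a cap of 2, a pool dominated by high-scoring text candidates still yields a mix of text and image negatives, which is the bias-mitigation effect the summary describes.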
arXiv Detail & Related papers (2024-11-04T20:06:34Z)
- Towards Enhancing Linked Data Retrieval in Conversational UIs using Large Language Models [1.3980986259786221]
This paper examines the integration of Large Language Models (LLMs) within existing systems.
By leveraging the advanced natural language understanding capabilities of LLMs, our method improves RDF entity extraction within web systems.
The evaluation of this methodology shows a marked enhancement in system expressivity and the accuracy of responses to user queries.
arXiv Detail & Related papers (2024-09-24T16:31:33Z)
- Redefining Information Retrieval of Structured Database via Large Language Models [10.117751707641416]
This paper introduces a novel retrieval augmentation framework called ChatLR.
It primarily employs the powerful semantic understanding ability of Large Language Models (LLMs) as retrievers to achieve precise and concise information retrieval.
Experimental results demonstrate the effectiveness of ChatLR in addressing user queries, achieving an overall information retrieval accuracy exceeding 98.8%.
arXiv Detail & Related papers (2024-05-09T02:37:53Z)
- Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation [128.01050030936028]
We propose an information refinement training method named InFO-RAG.
InFO-RAG is low-cost and general across various tasks.
It improves the performance of LLaMA2 by an average of 9.39% relative points.
arXiv Detail & Related papers (2024-02-28T08:24:38Z)
- Self-Retrieval: End-to-End Information Retrieval with One Large Language Model [97.71181484082663]
We introduce Self-Retrieval, a novel end-to-end LLM-driven information retrieval architecture.
Self-Retrieval internalizes the retrieval corpus through self-supervised learning, transforms the retrieval process into sequential passage generation, and performs relevance assessment for reranking.
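The pipeline summarized above, generation in place of retrieval followed by relevance assessment for reranking, can be caricatured in a few lines. This is only a shape sketch under loose assumptions: here "generation" and "relevance" are both simulated with token overlap, whereas the actual system uses an LLM trained to decode corpus passages directly.

```python
# Toy sketch of the Self-Retrieval pipeline: (1) "generate" passages
# from an internalized corpus, (2) rerank them by a relevance score.

def generate_passages(query, corpus, k=2):
    # Stand-in for sequential passage generation: pick the k corpus
    # passages the "model" would most plausibly decode for this query.
    def overlap(p):
        return len(set(query.lower().split()) & set(p.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def rerank(query, passages):
    # Stand-in for the relevance-assessment step: score each generated
    # passage and reorder, preferring dense matches.
    def score(p):
        q = set(query.lower().split())
        return len(q & set(p.lower().split())) / (len(p.split()) + 1)
    return sorted(passages, key=score, reverse=True)
```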
arXiv Detail & Related papers (2024-02-23T18:45:35Z)
- ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search [35.44746116088232]
Federated search will become increasingly pivotal in the context of Retrieval-Augmented Generation pipelines.
Current SOTA resource selection methodologies rely on feature-based learning approaches.
We propose ReSLLM to drive the selection of resources in federated search in a zero-shot setting.
arXiv Detail & Related papers (2024-01-31T07:58:54Z)
- Large Language Models for Information Retrieval: A Survey [58.30439850203101]
Information retrieval has evolved from term-based methods to its integration with advanced neural models.
Recent research has sought to leverage large language models (LLMs) to improve IR systems.
We delve into the confluence of LLMs and IR systems, including crucial aspects such as query rewriters, retrievers, rerankers, and readers.
arXiv Detail & Related papers (2023-08-14T12:47:22Z)
- RRAML: Reinforced Retrieval Augmented Machine Learning [10.94680155282906]
We propose a novel framework called Reinforced Retrieval Augmented Machine Learning (RRAML).
RRAML integrates the reasoning capabilities of large language models with supporting information retrieved by a purpose-built retriever from a vast user-provided database.
We believe that the research agenda outlined in this paper has the potential to profoundly impact the field of AI.
arXiv Detail & Related papers (2023-07-24T13:51:19Z)
- Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks [121.74957524305283]
This paper proposes a novel framework named Search-in-the-Chain (SearChain) for the interaction between Information Retrieval (IR) and Large Language Models (LLMs).
Experiments show that SearChain outperforms state-of-the-art baselines on complex knowledge-intensive tasks.
arXiv Detail & Related papers (2023-04-28T10:15:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.