Redefining Information Retrieval of Structured Database via Large Language Models
- URL: http://arxiv.org/abs/2405.05508v1
- Date: Thu, 9 May 2024 02:37:53 GMT
- Title: Redefining Information Retrieval of Structured Database via Large Language Models
- Authors: Mingzhu Wang, Yuzhe Zhang, Qihang Zhao, Juanyi Yang, Hong Zhang,
- Abstract summary: This paper introduces a novel retrieval augmentation framework called ChatLR.
It primarily employs the powerful semantic understanding ability of Large Language Models (LLMs) as retrievers to achieve precise and concise information retrieval.
Experimental results demonstrate the effectiveness of ChatLR in addressing user queries, achieving an overall information retrieval accuracy exceeding 98.8%.
- Score: 9.65171883231521
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval augmentation is critical when Language Models (LMs) exploit non-parametric knowledge related to the query through external knowledge bases before reasoning. The retrieved information is incorporated into LMs as context alongside the query, enhancing the reliability of responses towards factual questions. Prior researches in retrieval augmentation typically follow a retriever-generator paradigm. In this context, traditional retrievers encounter challenges in precisely and seamlessly extracting query-relevant information from knowledge bases. To address this issue, this paper introduces a novel retrieval augmentation framework called ChatLR that primarily employs the powerful semantic understanding ability of Large Language Models (LLMs) as retrievers to achieve precise and concise information retrieval. Additionally, we construct an LLM-based search and question answering system tailored for the financial domain by fine-tuning LLM on two tasks including Text2API and API-ID recognition. Experimental results demonstrate the effectiveness of ChatLR in addressing user queries, achieving an overall information retrieval accuracy exceeding 98.8\%.
Related papers
- Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation [72.70046559930555]
We propose a generic RAG approach called Adaptive Note-Enhanced RAG (Adaptive-Note) for complex QA tasks.
Specifically, Adaptive-Note introduces an overarching view of knowledge growth, iteratively gathering new information in the form of notes.
In addition, we employ an adaptive, note-based stop-exploration strategy to decide "what to retrieve and when to stop" to encourage sufficient knowledge exploration.
arXiv Detail & Related papers (2024-10-11T14:03:29Z) - Corpus-Steered Query Expansion with Large Language Models [35.64662397095323]
We introduce Corpus-Steered Query Expansion (CSQE) to promote the incorporation of knowledge embedded within the corpus.
CSQE utilizes the relevance assessing capability of LLMs to systematically identify pivotal sentences in the initially-retrieved documents.
Extensive experiments reveal that CSQE exhibits strong performance without necessitating any training, especially with queries for which LLMs lack knowledge.
arXiv Detail & Related papers (2024-02-28T03:58:58Z) - Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context [4.1229332722825]
This paper introduces a novel framework that combines graph-driven context retrieval in conjunction to knowledge graphs based enhancement.
We conduct experiments on various Large Language Models (LLMs) with different parameter sizes to evaluate their ability to ground knowledge and determine factual accuracy in answers to open-ended questions.
Our methodology GraphContextGen consistently outperforms dominant text-based retrieval systems, demonstrating its robustness and adaptability to a larger number of use cases.
arXiv Detail & Related papers (2024-01-23T11:25:34Z) - DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain
Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z) - End-to-end Knowledge Retrieval with Multi-modal Queries [50.01264794081951]
ReMuQ requires a system to retrieve knowledge from a large corpus by integrating contents from both text and image queries.
We introduce a retriever model ReViz'' that can directly process input text and images to retrieve relevant knowledge in an end-to-end fashion.
We demonstrate superior performance in retrieval on two datasets under zero-shot settings.
arXiv Detail & Related papers (2023-06-01T08:04:12Z) - Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) play powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read instead of the previous retrieve-then-read for the retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z) - Multi-Grained Knowledge Retrieval for End-to-End Task-Oriented Dialog [42.088274728084265]
Retrieving proper domain knowledge from an external database lies at the heart of end-to-end task-oriented dialog systems.
Most existing systems blend knowledge retrieval with response generation and optimize them with direct supervision from reference responses.
We propose to decouple knowledge retrieval from response generation and introduce a multi-grained knowledge retriever.
arXiv Detail & Related papers (2023-05-17T12:12:46Z) - Synergistic Interplay between Search and Large Language Models for
Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z) - Query2doc: Query Expansion with Large Language Models [69.9707552694766]
The proposed method first generates pseudo- documents by few-shot prompting large language models (LLMs)
query2doc boosts the performance of BM25 by 3% to 15% on ad-hoc IR datasets.
Our method also benefits state-of-the-art dense retrievers in terms of both in-domain and out-of-domain results.
arXiv Detail & Related papers (2023-03-14T07:27:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.