StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
- URL: http://arxiv.org/abs/2410.08815v2
- Date: Fri, 25 Oct 2024 12:18:37 GMT
- Title: StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
- Authors: Zhuoqun Li, Xuanang Chen, Haiyang Yu, Hongyu Lin, Yaojie Lu, Qiaoyu Tang, Fei Huang, Xianpei Han, Le Sun, Yongbin Li
- Abstract summary: Retrieval-augmented generation (RAG) is a key means to effectively enhance large language models (LLMs).
We propose StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure.
Experiments show that StructRAG achieves state-of-the-art performance, particularly excelling in challenging scenarios.
- Score: 94.31508613367296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-augmented generation (RAG) is a key means to effectively enhance large language models (LLMs) in many knowledge-based tasks. However, existing RAG methods struggle with knowledge-intensive reasoning tasks, because the useful information required for these tasks is badly scattered. This characteristic makes it difficult for existing RAG methods to accurately identify key information and perform global reasoning with such noisy augmentation. In this paper, motivated by the cognitive theories that humans convert raw information into various structured knowledge when tackling knowledge-intensive reasoning, we propose a new framework, StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure. Extensive experiments across various knowledge-intensive tasks show that StructRAG achieves state-of-the-art performance, particularly excelling in challenging scenarios, demonstrating its potential as an effective solution for enhancing LLMs in complex real-world applications.
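The three stages named in the abstract (choose a structure type, reconstruct the documents into that structure, answer from the structure) can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' implementation: the structure names `table`, `graph`, and `chunk`, the keyword-based `choose_structure` router, the string-formatting `structurize` step, and the `llm` callable are all hypothetical stand-ins for components that a real system would most likely implement with model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StructuredKnowledge:
    structure_type: str  # hypothetical label, e.g. "table", "graph", "chunk"
    content: str         # documents rewritten into that structure


def choose_structure(question: str) -> str:
    """Toy keyword router; a stand-in for a learned structure selector."""
    q = question.lower()
    if "compare" in q:
        return "table"
    if "relationship" in q or "connected" in q:
        return "graph"
    return "chunk"


def structurize(docs: List[str], structure_type: str) -> StructuredKnowledge:
    """Placeholder formatting; a real system would rewrite docs with an LLM."""
    if structure_type == "table":
        content = "\n".join(f"| {i} | {d} |" for i, d in enumerate(docs))
    elif structure_type == "graph":
        content = "\n".join(f"doc{i} -> {d}" for i, d in enumerate(docs))
    else:
        content = "\n\n".join(docs)
    return StructuredKnowledge(structure_type, content)


def answer(question: str, docs: List[str], llm: Callable[[str], str]) -> str:
    """Run the three stages and hand the structured context to `llm`."""
    structure_type = choose_structure(question)
    knowledge = structurize(docs, structure_type)
    prompt = (
        f"Structure ({knowledge.structure_type}):\n{knowledge.content}\n\n"
        f"Question: {question}"
    )
    return llm(prompt)
```

Passing an identity function as `llm` (for example `answer("Compare A and B", ["a", "b"], llm=lambda p: p)`) returns the assembled prompt, which makes it easy to inspect what each stage contributed before wiring in a real model.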
Related papers
- An Adaptive Framework for Generating Systematic Explanatory Answer in Online Q&A Platforms [62.878616839799776]
We propose SynthRAG, an innovative framework designed to enhance Question Answering (QA) performance.
SynthRAG improves on conventional models by employing adaptive outlines for dynamic content structuring.
An online deployment on the Zhihu platform revealed that SynthRAG's answers achieved notable user engagement.
arXiv Detail & Related papers (2024-10-23T09:14:57Z) - GIVE: Structured Reasoning with Knowledge Graph Inspired Veracity Extrapolation [108.2008975785364]
Graph Inspired Veracity Extrapolation (GIVE) is a novel reasoning framework that integrates the parametric and non-parametric memories.
Our method facilitates a more logical and step-wise reasoning approach akin to experts' problem-solving, rather than gold answer retrieval.
arXiv Detail & Related papers (2024-10-11T03:05:06Z) - Reasoning Factual Knowledge in Structured Data with Large Language Models [26.00548862629018]
Large language models (LLMs) have made remarkable progress in various natural language processing tasks.
Structured data possesses unique characteristics that differ from the unstructured texts used for pretraining.
We propose a benchmark named StructFact to evaluate the structural reasoning capabilities of LLMs in inferring factual knowledge.
arXiv Detail & Related papers (2024-08-22T08:05:09Z) - BioRAG: A RAG-LLM Framework for Biological Question Reasoning [14.05505988436551]
We introduce BioRAG, a novel Retrieval-Augmented Generation (RAG) framework built on Large Language Models (LLMs).
Our approach starts with parsing, indexing, and segmenting an extensive collection of 22 million scientific papers as the basic knowledge, followed by training a specialized embedding model tailored to this domain.
For queries requiring the most current information, BioRAG deconstructs the question and employs an iterative retrieval process incorporated with the search engine for step-by-step reasoning.
arXiv Detail & Related papers (2024-08-02T08:37:03Z) - Enhancing Question Answering for Enterprise Knowledge Bases using Large Language Models [46.51659135636255]
EKRG is a novel Retrieval-Generation framework based on large language models (LLMs).
We introduce an instruction-tuning method using an LLM to generate sufficient document-question pairs for training a knowledge retriever.
We develop a relevance-aware teacher-student learning strategy to further enhance the efficiency of the training process.
arXiv Detail & Related papers (2024-04-10T10:38:17Z) - Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network [29.149367323751413]
We propose ReStruct, a meta-structure search framework that integrates reasoning into the evolutionary procedure.
We show that ReStruct achieves state-of-the-art performance in both recommendation and node classification tasks.
arXiv Detail & Related papers (2024-02-18T09:21:12Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z) - A Principled Framework for Knowledge-enhanced Large Language Model [58.1536118111993]
Large Language Models (LLMs) are versatile, yet they often falter in tasks requiring deep and reliable reasoning.
This paper introduces a rigorously designed framework for creating LLMs that effectively anchor knowledge and employ a closed-loop reasoning process.
arXiv Detail & Related papers (2023-11-18T18:10:02Z) - LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model [96.889634747943]
Universally modeling all typical information extraction tasks (UIE) with one generative language model (GLM) has revealed great potential.
We propose a novel structure-aware GLM, fully unleashing the power of syntactic knowledge for UIE.
Over 12 IE benchmarks across 7 tasks our system shows significant improvements over the baseline UIE system.
arXiv Detail & Related papers (2023-04-13T04:01:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.