Related papers: StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

URL: http://arxiv.org/abs/2410.08815v2
Date: Fri, 25 Oct 2024 12:18:37 GMT
Title: StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Authors: Zhuoqun Li, Xuanang Chen, Haiyang Yu, Hongyu Lin, Yaojie Lu, Qiaoyu Tang, Fei Huang, Xianpei Han, Le Sun, Yongbin Li,
Abstract summary: Retrieval-augmented generation (RAG) is a key means to effectively enhance large language models (LLMs) We propose StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure. Experiments show that StructRAG achieves state-of-the-art performance, particularly excelling in challenging scenarios.
Score: 94.31508613367296
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Retrieval-augmented generation (RAG) is a key means to effectively enhance large language models (LLMs) in many knowledge-based tasks. However, existing RAG methods struggle with knowledge-intensive reasoning tasks, because useful information required to these tasks are badly scattered. This characteristic makes it difficult for existing RAG methods to accurately identify key information and perform global reasoning with such noisy augmentation. In this paper, motivated by the cognitive theories that humans convert raw information into various structured knowledge when tackling knowledge-intensive reasoning, we proposes a new framework, StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure. Extensive experiments across various knowledge-intensive tasks show that StructRAG achieves state-of-the-art performance, particularly excelling in challenging scenarios, demonstrating its potential as an effective solution for enhancing LLMs in complex real-world applications.

Related papers

Context-Guided Dynamic Retrieval for Improving Generation Quality in RAG Models [2.9687381456164004]
It proposes a state-aware dynamic knowledge retrieval mechanism to enhance semantic understanding and knowledge scheduling efficiency. The proposed structure is thoroughly evaluated across different large models, including GPT-4, GPT-4o, and DeepSeek. The approach also demonstrates stronger robustness and generation consistency in tasks involving semantic ambiguity and multi-document fusion.
arXiv Detail & Related papers (2025-04-28T02:50:45Z)
Improving Multilingual Retrieval-Augmented Language Models through Dialectic Reasoning Argumentations [65.11348389219887]
We introduce Dialectic-RAG (DRAG), a modular approach that evaluates retrieved information by comparing, contrasting, and resolving conflicting perspectives. We show the impact of our framework both as an in-context learning strategy and for constructing demonstrations to instruct smaller models.
arXiv Detail & Related papers (2025-04-07T06:55:15Z)
RAG-KG-IL: A Multi-Agent Hybrid Framework for Reducing Hallucinations and Enhancing LLM Reasoning through RAG and Incremental Knowledge Graph Learning Integration [4.604003661048267]
RAG-KG-IL is a novel multi-agent hybrid framework designed to enhance the reasoning capabilities of Large Language Models. It integrates Retrieval-Augmented Generation (RAG) and Knowledge Graphs (KGs) with an Incremental Learning (IL) approach. We evaluate the framework using real-world case studies involving health-related queries.
arXiv Detail & Related papers (2025-03-14T11:50:16Z)
Enhancing Retrieval-Augmented Generation: A Study of Best Practices [16.246719783032436]
We develop advanced RAG system designs that incorporate query expansion, various novel retrieval strategies, and a novel Contrastive In-Context Learning RAG. Our study systematically investigates key factors, including language model size, prompt design, document chunk size, knowledge base size, retrieval stride, query expansion techniques, and Focus Mode retrieving relevant context at sentence-level. Our findings offer actionable insights for developing RAG systems, striking a balance between contextual richness and retrieval-generation efficiency.
arXiv Detail & Related papers (2025-01-13T15:07:55Z)
An Adaptive Framework for Generating Systematic Explanatory Answer in Online Q&A Platforms [62.878616839799776]
We propose SynthRAG, an innovative framework designed to enhance Question Answering (QA) performance. SynthRAG improves on conventional models by employing adaptive outlines for dynamic content structuring. An online deployment on the Zhihu platform revealed that SynthRAG's answers achieved notable user engagement.
arXiv Detail & Related papers (2024-10-23T09:14:57Z)
GIVE: Structured Reasoning with Knowledge Graph Inspired Veracity Extrapolation [108.2008975785364]
Graph Inspired Veracity Extrapolation (GIVE) is a novel reasoning framework that integrates the parametric and non-parametric memories. Our method facilitates a more logical and step-wise reasoning approach akin to experts' problem-solving, rather than gold answer retrieval.
arXiv Detail & Related papers (2024-10-11T03:05:06Z)
Reasoning Factual Knowledge in Structured Data with Large Language Models [26.00548862629018]
Large language models (LLMs) have made remarkable progress in various natural language processing tasks. Structured data possesses unique characteristics that differ from the unstructured texts used for pretraining. We propose a benchmark named StructFact to evaluate the structural reasoning capabilities of LLMs in inferring factual knowledge.
arXiv Detail & Related papers (2024-08-22T08:05:09Z)
BioRAG: A RAG-LLM Framework for Biological Question Reasoning [14.05505988436551]
We introduce BioRAG, a novel Retrieval-Augmented Generation (RAG) with the Large Language Models (LLMs) framework. Our approach starts with parsing, indexing, and segmenting an extensive collection of 22 million scientific papers as the basic knowledge, followed by training a specialized embedding model tailored to this domain. For queries requiring the most current information, BioRAGs deconstruct the question and employs an iterative retrieval process incorporated with the search engine for step-by-step reasoning.
arXiv Detail & Related papers (2024-08-02T08:37:03Z)
Mindful-RAG: A Study of Points of Failure in Retrieval Augmented Generation [11.471919529192048]
Large Language Models (LLMs) are proficient at generating coherent and contextually relevant text. Retrieval-augmented generation (RAG) systems mitigate this by incorporating external knowledge sources, such as structured knowledge graphs (KGs) Our study investigates this dilemma by analyzing error patterns in existing KG-based RAG methods and identifying eight critical failure points.
arXiv Detail & Related papers (2024-07-16T23:50:07Z)
Enhancing Question Answering for Enterprise Knowledge Bases using Large Language Models [46.51659135636255]
EKRG is a novel Retrieval-Generation framework based on large language models (LLMs) We introduce an instruction-tuning method using an LLM to generate sufficient document-question pairs for training a knowledge retriever. We develop a relevance-aware teacher-student learning strategy to further enhance the efficiency of the training process.
arXiv Detail & Related papers (2024-04-10T10:38:17Z)
Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network [29.149367323751413]
We propose ReStruct, a meta-structure search framework that integrates reasoning into the evolutionary procedure. We show that ReStruct achieves state-of-the-art performance in both recommendation and node classification tasks.
arXiv Detail & Related papers (2024-02-18T09:21:12Z)
A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches. We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
A Principled Framework for Knowledge-enhanced Large Language Model [58.1536118111993]
Large Language Models (LLMs) are versatile, yet they often falter in tasks requiring deep and reliable reasoning. This paper introduces a rigorously designed framework for creating LLMs that effectively anchor knowledge and employ a closed-loop reasoning process.
arXiv Detail & Related papers (2023-11-18T18:10:02Z)
LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model [96.889634747943]
Universally modeling all typical information extraction tasks (UIE) with one generative language model (GLM) has revealed great potential. We propose a novel structure-aware GLM, fully unleashing the power of syntactic knowledge for UIE. Over 12 IE benchmarks across 7 tasks our system shows significant improvements over the baseline UIE system.
arXiv Detail & Related papers (2023-04-13T04:01:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.