RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs
- URL: http://arxiv.org/abs/2410.13987v1
- Date: Thu, 17 Oct 2024 19:33:37 GMT
- Title: RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs
- Authors: Jiatan Huang, Mingchen Li, Zonghai Yao, Zhichao Yang, Yongkang Xiao, Feiyun Ouyang, Xiaohan Li, Shuo Han, Hong Yu,
- Abstract summary: We develop a dataset for LLMs Complex Reasoning over Textual Knowledge Graphs (RiTeK) with a broad topological structure coverage.
We synthesize realistic user queries that integrate diverse topological structures, annotated information, and complex textual descriptions.
We introduce an enhanced Monte Carlo Tree Search (CTS) method, which automatically extracts relational path information from textual graphs for specific queries.
- Score: 12.846097618151951
- License:
- Abstract: Answering complex real-world questions often requires accurate retrieval from textual knowledge graphs (TKGs). The scarcity of annotated data, along with intricate topological structures, makes this task particularly challenging. As the nature of relational path information could enhance the inference ability of Large Language Models (LLMs), efficiently retrieving more complex relational path information from TKGs presents another key challenge. To tackle these challenges, we first develop a Dataset for LLMs Complex Reasoning over Textual Knowledge Graphs (RiTeK) with a broad topological structure coverage.We synthesize realistic user queries that integrate diverse topological structures, relational information, and complex textual descriptions. We conduct rigorous expert evaluation to validate the quality of our synthesized queries. And then, we introduce an enhanced Monte Carlo Tree Search (MCTS) method, Relational MCTS, to automatically extract relational path information from textual graphs for specific queries. Our dataset mainly covers the medical domain as the relation types and entity are complex and publicly available. Experimental results indicate that RiTeK poses significant challenges for current retrieval and LLM systems, while the proposed Relational MCTS method enhances LLM inference ability and achieves state-of-the-art performance on RiTeK.
Related papers
- ConTReGen: Context-driven Tree-structured Retrieval for Open-domain Long-form Text Generation [26.4086456393314]
Long-form text generation requires coherent, comprehensive responses that address complex queries with both breadth and depth.
Existing iterative retrieval-augmented generation approaches often struggle to delve deeply into each facet of complex queries.
This paper introduces ConTReGen, a novel framework that employs a context-driven, tree-structured retrieval approach.
arXiv Detail & Related papers (2024-10-20T21:17:05Z) - STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [93.96463520716759]
We develop STARK, a large-scale Semi-structure retrieval benchmark on Textual and Knowledge Bases.
Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine.
We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties.
arXiv Detail & Related papers (2024-04-19T22:54:54Z) - Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA)
We propose a novel adaptive QA framework, that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity.
We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
arXiv Detail & Related papers (2024-03-21T13:52:30Z) - Meta Operator for Complex Query Answering on Knowledge Graphs [58.340159346749964]
We argue that different logical operator types, rather than the different complex query types, are the key to improving generalizability.
We propose a meta-learning algorithm to learn the meta-operators with limited data and adapt them to different instances of operators under various complex queries.
Empirical results show that learning meta-operators is more effective than learning original CQA or meta-CQA models.
arXiv Detail & Related papers (2024-03-15T08:54:25Z) - Type-based Neural Link Prediction Adapter for Complex Query Answering [2.1098688291287475]
We propose TypE-based Neural Link Prediction Adapter (TENLPA), a novel model that constructs type-based entity-relation graphs.
In order to effectively combine type information with complex logical queries, an adaptive learning mechanism is introduced.
Experiments on 3 standard datasets show that TENLPA model achieves state-of-the-art performance on complex query answering.
arXiv Detail & Related papers (2024-01-29T10:54:28Z) - DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain
Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z) - Can Large Language Models Understand Real-World Complex Instructions? [54.86632921036983]
Large language models (LLMs) can understand human instructions, but struggle with complex instructions.
Existing benchmarks are insufficient to assess LLMs' ability to understand complex instructions.
We propose CELLO, a benchmark for evaluating LLMs' ability to follow complex instructions systematically.
arXiv Detail & Related papers (2023-09-17T04:18:39Z) - Synergistic Interplay between Search and Large Language Models for
Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z) - Complex Logical Reasoning over Knowledge Graphs using Large Language Models [13.594992599230277]
Reasoning over knowledge graphs (KGs) is a challenging task that requires a deep understanding of the relationships between entities.
Current approaches rely on learning geometries to embed entities in vector space for logical query operations.
We propose a novel decoupled approach, Language-guided Abstract Reasoning over Knowledge graphs (LARK), that formulates complex KG reasoning as a combination of contextual KG search and logical query reasoning.
arXiv Detail & Related papers (2023-05-02T02:21:49Z) - Query-Specific Knowledge Graphs for Complex Finance Topics [6.599344783327053]
We focus on the CODEC dataset, where domain experts create challenging questions.
We show that state-of-the-art ranking systems have headroom for improvement.
We demonstrate that entity and document relevance are positively correlated.
arXiv Detail & Related papers (2022-11-08T10:21:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.