CE-GOCD: Central Entity-Guided Graph Optimization for Community Detection to Augment LLM Scientific Question Answering
- URL: http://arxiv.org/abs/2601.21733v1
- Date: Thu, 29 Jan 2026 13:53:44 GMT
- Title: CE-GOCD: Central Entity-Guided Graph Optimization for Community Detection to Augment LLM Scientific Question Answering
- Authors: Jiayin Lan, Jiaqi Li, Baoxin Wang, Ming Liu, Dayong Wu, Shijin Wang, Bing Qin, Guoping Hu
- Abstract summary: Large Language Models (LLMs) are increasingly used for question answering over scientific research papers. Existing retrieval augmentation methods often rely on isolated text chunks or concepts, but overlook deeper semantic connections between papers. We propose a method that augments LLMs' scientific question answering by explicitly modeling and leveraging semantic substructures within academic knowledge graphs.
- Score: 36.76110608580489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) are increasingly used for question answering over scientific research papers. Existing retrieval augmentation methods often rely on isolated text chunks or concepts, but overlook deeper semantic connections between papers. This impairs the LLM's comprehension of scientific literature, hindering the comprehensiveness and specificity of its responses. To address this, we propose Central Entity-Guided Graph Optimization for Community Detection (CE-GOCD), a method that augments LLMs' scientific question answering by explicitly modeling and leveraging semantic substructures within academic knowledge graphs. Our approach operates by: (1) leveraging paper titles as central entities for targeted subgraph retrieval, (2) enhancing implicit semantic discovery via subgraph pruning and completion, and (3) applying community detection to distill coherent paper groups with shared themes. We evaluated the proposed method on three NLP literature-based question-answering datasets, and the results demonstrate its superiority over other retrieval-augmented baseline approaches, confirming the effectiveness of our framework.
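The three steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy graph, the edge weights, the `is_paper` helper, the k-hop retrieval, the weight-threshold pruning, the shared-concept completion rule, and the use of connected components as a stand-in for a real community detection algorithm are all assumptions made for illustration only.

```python
from collections import defaultdict, deque

# Hypothetical toy knowledge graph: (node_a, node_b, weight).
# Names and weights are invented for illustration, not taken from the paper.
EDGES = [
    ("Paper A", "graph retrieval", 0.9),
    ("Paper B", "graph retrieval", 0.8),
    ("Paper A", "Paper B", 0.7),
    ("Paper C", "summarization", 0.9),
    ("Paper D", "summarization", 0.8),
    ("Paper B", "Paper C", 0.1),   # weak cross-topic link
    ("Paper E", "vision", 0.9),    # unrelated to the central entities
]


def is_paper(node):
    """Hypothetical convention: paper-title nodes start with 'Paper'."""
    return node.startswith("Paper")


def build_graph(edges):
    graph = defaultdict(dict)
    for a, b, w in edges:
        graph[a][b] = w
        graph[b][a] = w
    return graph


def retrieve_subgraph(graph, central_entities, hops=2):
    """Step 1 (assumed form): k-hop BFS around central paper titles."""
    seen = set(central_entities)
    frontier = deque((node, 0) for node in central_entities)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return {n: {m: w for m, w in graph[n].items() if m in seen} for n in seen}


def prune_and_complete(sub, min_weight=0.3):
    """Step 2 (assumed form): drop weak edges, then link papers sharing a concept."""
    pruned = {n: {m: w for m, w in nbrs.items() if w >= min_weight}
              for n, nbrs in sub.items()}
    for node, nbrs in list(pruned.items()):
        if is_paper(node):
            continue
        papers = sorted(p for p in nbrs if is_paper(p))
        for i, p in enumerate(papers):
            for q in papers[i + 1:]:
                pruned[p].setdefault(q, min_weight)
                pruned[q].setdefault(p, min_weight)
    return pruned


def detect_communities(sub):
    """Step 3 (stand-in): connected components, reported as paper groups."""
    seen, groups = set(), []
    for start in sorted(sub):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(sub[node])
        seen |= component
        papers = sorted(n for n in component if is_paper(n))
        if papers:
            groups.append(papers)
    return sorted(groups)


graph = build_graph(EDGES)
subgraph = retrieve_subgraph(graph, ["Paper A", "Paper C"], hops=2)
optimized = prune_and_complete(subgraph)
communities = detect_communities(optimized)
print(communities)
```

In this toy run the weak link between "Paper B" and "Paper C" is pruned, "Paper C" and "Paper D" gain a completed edge through their shared concept, and the unrelated "Paper E" is never retrieved. A real system would likely replace the connected-components step with a modularity-based or similar community detection method.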
Related papers
- Graph Your Way to Inspiration: Integrating Co-Author Graphs with Retrieval-Augmented Generation for Large Language Model Based Scientific Idea Generation [1.2232326171442904]
This paper proposes a scientific idea generation system called GYWI. It combines author knowledge graphs with retrieval-augmented generation (RAG) to form an external knowledge base. The generated ideas are evaluated from the following five dimensions: novelty, feasibility, clarity, relevance, and significance.
arXiv Detail & Related papers (2025-12-05T03:38:23Z)
- Context-Aware Hierarchical Taxonomy Generation for Scientific Papers via LLM-Guided Multi-Aspect Clustering [59.54662810933882]
Existing taxonomy construction methods, leveraging unsupervised clustering or direct prompting of large language models, often lack coherence and granularity. We propose a novel context-aware hierarchical taxonomy generation framework that integrates LLM-guided multi-aspect encoding with dynamic clustering.
arXiv Detail & Related papers (2025-09-23T15:12:58Z)
- Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization [49.61589046694085]
We propose a topic-guided reinforcement learning approach to improve content selection in Multi-Document Summarization. We first show that explicitly prompting models with topic labels enhances the informativeness of the generated summaries.
arXiv Detail & Related papers (2025-09-11T21:01:54Z)
- Cross-Granularity Hypergraph Retrieval-Augmented Generation for Multi-hop Question Answering [49.43814054718318]
Multi-hop question answering (MHQA) requires integrating knowledge scattered across multiple passages to derive the correct answer. Traditional retrieval-augmented generation (RAG) methods primarily focus on coarse-grained textual semantic similarity. We propose a novel RAG approach called HGRAG for MHQA that achieves cross-granularity integration of structural and semantic information via hypergraphs.
arXiv Detail & Related papers (2025-08-15T06:36:13Z)
- Question-Answer Extraction from Scientific Articles Using Knowledge Graphs and Large Language Models [1.8637078358591848]
We propose two distinct approaches for generating Question and Answer pairs from scientific articles. The first approach involves selecting salient paragraphs, using a Large Language Model (LLM) to generate questions. The second approach leverages a Knowledge Graph (KG) for QA generation.
arXiv Detail & Related papers (2025-07-18T11:31:52Z)
- Scientific Paper Retrieval with LLM-Guided Semantic-Based Ranking [23.23119083861653]
SemRank is an effective and efficient paper retrieval framework. It combines query understanding with a concept-based semantic index. Experiments show that SemRank consistently improves the performance of various base retrievers.
arXiv Detail & Related papers (2025-05-27T22:49:18Z)
- Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation [81.18701211912779]
We introduce an Adaptive Multi-Aspect Retrieval-augmented over KGs (Amar) framework. This method retrieves knowledge including entities, relations, and subgraphs, and converts each piece of retrieved text into prompt embeddings. Our method has achieved state-of-the-art performance on two common datasets.
arXiv Detail & Related papers (2024-12-24T16:38:04Z)
- Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
Attributed Text Generation (ATG) is proposed to enhance credibility and verifiability in RAG systems. This paper proposes ReClaim, a fine-grained ATG method that alternates the generation of references and answers step by step. With extensive experiments, we verify the effectiveness of ReClaim in extensive settings, achieving a citation accuracy rate of 90%.
arXiv Detail & Related papers (2024-07-01T20:47:47Z)
- Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision. Existing literature addresses this challenge by employing local-based representation approaches. This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction. Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.