Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs
- URL: http://arxiv.org/abs/2505.07184v1
- Date: Mon, 12 May 2025 02:21:36 GMT
- Title: Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs
- Authors: Yifan Wei, Xiaoyan Yu, Tengfei Pan, Angsheng Li, Li Du,
- Abstract summary: Large language models (LLMs) have achieved unprecedented performance by leveraging vast pretraining corpora.<n>Their performance remains suboptimal in knowledge-intensive domains such as medicine and scientific research.<n>We propose a novel Structural Entropy-guided Knowledge Navigator (SENATOR) framework that addresses the intrinsic knowledge deficiencies of LLMs.
- Score: 11.724887822269528
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have achieved unprecedented performance by leveraging vast pretraining corpora, yet their performance remains suboptimal in knowledge-intensive domains such as medicine and scientific research, where high factual precision is required. While synthetic data provides a promising avenue for augmenting domain knowledge, existing methods frequently generate redundant samples that do not align with the model's true knowledge gaps. To overcome this limitation, we propose a novel Structural Entropy-guided Knowledge Navigator (SENATOR) framework that addresses the intrinsic knowledge deficiencies of LLMs. Our approach employs the Structure Entropy (SE) metric to quantify uncertainty along knowledge graph paths and leverages Monte Carlo Tree Search (MCTS) to selectively explore regions where the model lacks domain-specific knowledge. Guided by these insights, the framework generates targeted synthetic data for supervised fine-tuning, enabling continuous self-improvement. Experimental results on LLaMA-3 and Qwen2 across multiple domain-specific benchmarks show that SENATOR effectively detects and repairs knowledge deficiencies, achieving notable performance improvements. The code and data for our methods and experiments are available at https://github.com/weiyifan1023/senator.
Related papers
- Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval [60.25608870901428]
Trustworthiness is a core research challenge for agentic AI systems built on Large Language Models (LLMs)<n>We propose the task of fact-checking without retrieval, focusing on the verification of arbitrary natural language claims, independent of their source robustness.
arXiv Detail & Related papers (2026-03-05T18:42:51Z) - Probing the Knowledge Boundary: An Interactive Agentic Framework for Deep Knowledge Extraction [29.717986496967978]
We propose an interactive agentic framework to systematically extract and quantify the knowledge of Large Language Models.<n>Our method includes four adaptive exploration policies to probe knowledge at different granularities.<n>We observe a clear knowledge scaling law, where larger models consistently extract more knowledge.
arXiv Detail & Related papers (2026-02-01T01:43:44Z) - Are We Evaluating the Edit Locality of LLM Model Editing Properly? [68.441768731381]
We find that existing specificity evaluation protocols are inadequate for this purpose.<n>Existing specificity metrics are weakly correlated with the strength of specificity regularizers.<n>We also find that current metrics lack sufficient sensitivity, rendering them ineffective at distinguishing the specificity performance of different methods.
arXiv Detail & Related papers (2026-01-24T07:07:21Z) - Empowering LLMs for Structure-Based Drug Design via Exploration-Augmented Latent Inference [5.052013621974765]
Large Language Models (LLMs) possess strong representation and reasoning capabilities, but their application to structure-based drug design (SBDD) is limited by insufficient understanding of protein structures and unpredictable molecular generation.<n>We propose Exploration-Augmented Latent Inference for LLMs (ELILLM), a framework that reinterprets the LLM generation process as an encoding, latent space exploration, and decoding workflow.<n>ELILLM explicitly explores portions of the design problem beyond the model's current knowledge while using a decoding module to handle familiar regions, generating chemically valid and synthetically reasonable molecules.
arXiv Detail & Related papers (2026-01-20T08:10:48Z) - Evontree: Ontology Rule-Guided Self-Evolution of Large Language Models [12.36467850170776]
Evontree is a novel framework that leverages a small set of high-quality rules to extract, validate, and enhance domain knowledge within large language models (LLMs)<n>Experiments on medical QA benchmarks with Llama3-8B-Instruct and Med42-v2 demonstrate consistent outperformance over both unmodified models and leading supervised baselines.
arXiv Detail & Related papers (2025-10-30T16:53:45Z) - Embedding Domain Knowledge for Large Language Models via Reinforcement Learning from Augmented Generation [18.99847259801634]
We propose Reinforcement Learning from Augmented Generation (RLAG) to embed domain knowledge into large language models.<n>Our approach iteratively cycles between sampling generations and optimize the model through calculated rewards.<n> Experimental results across medical, legal, astronomy, and current events datasets demonstrate that our proposed method significantly outperforms baseline approaches.
arXiv Detail & Related papers (2025-09-24T14:30:16Z) - Comparing Knowledge Injection Methods for LLMs in a Low-Resource Regime [13.230760040927496]
We investigate the task of injecting small, unstructured information into large language models.<n>We show that simply continuing pre-training on limited data yields modest improvements.<n>We shed light on the forgetting phenomenon in small-data regimes, illustrating the delicate balance between learning new content and retaining existing capabilities.
arXiv Detail & Related papers (2025-08-08T09:48:32Z) - PropMEND: Hypernetworks for Knowledge Propagation in LLMs [82.99849359892112]
We present a hypernetwork-based approach for knowledge propagation, named PropMEND.<n>We show almost 2x accuracy on challenging multi-hop questions whose answers are not explicitly stated in the injected fact.<n>We also introduce a new dataset, Controlled RippleEdit, to evaluate the generalization of our hypernetwork.
arXiv Detail & Related papers (2025-06-10T15:44:19Z) - Resolving Knowledge Conflicts in Domain-specific Data Selection: A Case Study on Medical Instruction-tuning [83.99974309930072]
Domain-specific instruction-tuning has become the defacto standard for improving the performance of large language models.<n>We propose a Knowledge-aware Data Selection framework to select the domain-specific instruction-tuning data that meets LLMs' actual needs.<n>By filtering the data with large knowledge conflicts and sampling the high-quality and diverse data, KDS can effectively stimulate the LLMs' abilities and achieve better domain-specific performance.
arXiv Detail & Related papers (2025-05-28T04:18:24Z) - Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation [77.10390725623125]
retrieval-augmented generation (RAG) is widely employed to expand their knowledge scope.<n>Since RAG has shown promise in knowledge-intensive tasks like open-domain question answering, its broader application to complex tasks and intelligent assistants has further advanced its utility.<n>We present a systematic investigation of the intrinsic mechanisms by which RAGs integrate internal (parametric) and external (retrieved) knowledge.
arXiv Detail & Related papers (2025-05-17T13:13:13Z) - Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs [47.06544781855325]
We propose a Fine-grained Neuron-level Knowledge Editing (FiNE) method that enhances editing locality without affecting success rates.<n>By precisely identifying and modifying specific neurons within feed-forward networks, FiNE significantly improves knowledge localization and editing.
arXiv Detail & Related papers (2025-03-03T01:30:28Z) - Clear Minds Think Alike: What Makes LLM Fine-tuning Robust? A Study of Token Perplexity [61.48338027901318]
We show that fine-tuning with LLM-generated data improves target task performance and reduces out-of-domain degradation.<n>This is the first mechanistic explanation for the superior OOD robustness conferred by LLM-generated training data.
arXiv Detail & Related papers (2025-01-24T08:18:56Z) - Adapter-based Approaches to Knowledge-enhanced Language Models -- A Survey [48.52320309766703]
Knowledge-enhanced language models (KELMs) have emerged as promising tools to bridge the gap between large-scale language models and domain-specific knowledge.
KELMs can achieve higher factual accuracy and hallucinations by leveraging knowledge graphs (KGs)
arXiv Detail & Related papers (2024-11-25T14:10:24Z) - Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery [10.573861741540853]
KG Structure as Prompt is a novel approach for integrating structural information from a knowledge graph, such as common neighbor nodes and metapaths, into prompt-based learning.
Experimental results on three types of biomedical and open-domain datasets under few-shot settings demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-07-26T14:07:00Z) - Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning [13.371405067535814]
This paper investigates the effectiveness ofSupervised Fine-Tuning (SFT) as a method for knowledge injection in Large Language Models (LLMs)
We compare different dataset generation strategies -- token-based and fact-based scaling -- to create training data that helps the model learn new information.
Our results show considerable performance improvements in Q&A tasks related to out-of-domain knowledge.
arXiv Detail & Related papers (2024-03-30T01:56:07Z) - Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases [9.478012553728538]
We propose an end-to-end system design towards utilizing Retrieval Augmented Generation (RAG) to improve the factual accuracy of Large Language Models (LLMs)
Our system integrates RAG pipeline with upstream datasets processing and downstream performance evaluation.
Our experiments demonstrate the system's effectiveness in generating more accurate answers to domain-specific and time-sensitive inquiries.
arXiv Detail & Related papers (2024-03-15T16:30:14Z) - Pruning neural network models for gene regulatory dynamics using data and domain knowledge [24.670514977455202]
We propose DASH, a framework that guides network pruning by using domain-specific structural information in model fitting.
We show that DASH, using knowledge about gene interaction partners within the putative regulatory network, outperforms general pruning methods by a large margin.
arXiv Detail & Related papers (2024-03-05T23:02:55Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z) - Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective [106.92016199403042]
We empirically investigate knowledge transfer from larger to smaller models through a parametric perspective.
We employ sensitivity-based techniques to extract and align knowledge-specific parameters between different large language models.
Our findings highlight the critical factors contributing to the process of parametric knowledge transfer.
arXiv Detail & Related papers (2023-10-17T17:58:34Z) - Learning the Finer Things: Bayesian Structure Learning at the
Instantiation Level [0.0]
Successful machine learning methods require a trade-off between memorization and generalization.
We present a novel probabilistic graphical model structure learning approach that can learn, generalize and explain in elusive domains.
arXiv Detail & Related papers (2023-03-08T02:31:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.