Related papers: Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs

URL: http://arxiv.org/abs/2505.07184v1
Date: Mon, 12 May 2025 02:21:36 GMT
Title: Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs
Authors: Yifan Wei, Xiaoyan Yu, Tengfei Pan, Angsheng Li, Li Du,
Abstract summary: Large language models (LLMs) have achieved unprecedented performance by leveraging vast pretraining corpora. Their performance remains suboptimal in knowledge-intensive domains such as medicine and scientific research. We propose a novel Structural Entropy-guided Knowledge Navigator (SENATOR) framework that addresses the intrinsic knowledge deficiencies of LLMs.
Score: 11.724887822269528
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have achieved unprecedented performance by leveraging vast pretraining corpora, yet their performance remains suboptimal in knowledge-intensive domains such as medicine and scientific research, where high factual precision is required. While synthetic data provides a promising avenue for augmenting domain knowledge, existing methods frequently generate redundant samples that do not align with the model's true knowledge gaps. To overcome this limitation, we propose a novel Structural Entropy-guided Knowledge Navigator (SENATOR) framework that addresses the intrinsic knowledge deficiencies of LLMs. Our approach employs the Structure Entropy (SE) metric to quantify uncertainty along knowledge graph paths and leverages Monte Carlo Tree Search (MCTS) to selectively explore regions where the model lacks domain-specific knowledge. Guided by these insights, the framework generates targeted synthetic data for supervised fine-tuning, enabling continuous self-improvement. Experimental results on LLaMA-3 and Qwen2 across multiple domain-specific benchmarks show that SENATOR effectively detects and repairs knowledge deficiencies, achieving notable performance improvements. The code and data for our methods and experiments are available at https://github.com/weiyifan1023/senator.
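Illustration: the abstract describes scoring uncertainty along knowledge graph paths and using tree search to steer exploration toward regions the model knows poorly. The sketch below is a toy illustration of that general idea only, not the SENATOR implementation. It substitutes plain Shannon entropy for the paper's Structure Entropy metric and shows only the UCB1 selection step of MCTS. The node names, answer distributions, and the helper names `shannon_entropy`, `ucb1`, and `select_child` are all hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def ucb1(mean_reward, visits, parent_visits, c=1.4):
    """UCB1 score: exploitation term plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(stats, parent_visits):
    """Pick the child node with the highest UCB1 score."""
    return max(
        stats,
        key=lambda name: ucb1(stats[name]["mean"], stats[name]["visits"], parent_visits),
    )


# Hypothetical data: the model's answer distribution for a question
# attached to each knowledge graph node (assumed, not from the paper).
answer_dists = {
    "aspirin": [0.97, 0.01, 0.01, 0.01],   # model is confident
    "warfarin": [0.40, 0.30, 0.20, 0.10],  # model is unsure
    "heparin": [0.25, 0.25, 0.25, 0.25],   # model is guessing
}

# Use entropy as the reward, so the search favors uncertain regions.
stats = {
    name: {"mean": shannon_entropy(dist), "visits": 1}
    for name, dist in answer_dists.items()
}
most_uncertain = select_child(stats, parent_visits=len(stats))
print(most_uncertain)
```

With one visit per node the exploration bonus is identical for all children, so selection reduces to picking the node with the highest entropy. In a full search, visit counts would grow unevenly and the bonus would trade off revisiting uncertain nodes against trying less explored ones.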

Related papers

PropMEND: Hypernetworks for Knowledge Propagation in LLMs [82.99849359892112]
We present a hypernetwork-based approach for knowledge propagation, named PropMEND. We show almost 2x accuracy on challenging multi-hop questions whose answers are not explicitly stated in the injected fact. We also introduce a new dataset, Controlled RippleEdit, to evaluate the generalization of our hypernetwork.
arXiv Detail & Related papers (2025-06-10T15:44:19Z)
Resolving Knowledge Conflicts in Domain-specific Data Selection: A Case Study on Medical Instruction-tuning [83.99974309930072]
Domain-specific instruction-tuning has become the de facto standard for improving the performance of large language models. We propose a Knowledge-aware Data Selection (KDS) framework to select the domain-specific instruction-tuning data that meets LLMs' actual needs. By filtering out data with large knowledge conflicts and sampling high-quality and diverse data, KDS can effectively stimulate the LLMs' abilities and achieve better domain-specific performance.
arXiv Detail & Related papers (2025-05-28T04:18:24Z)
Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation [77.10390725623125]
Retrieval-augmented generation (RAG) is widely employed to expand the knowledge scope of large language models (LLMs). Since RAG has shown promise in knowledge-intensive tasks like open-domain question answering, its broader application to complex tasks and intelligent assistants has further advanced its utility. We present a systematic investigation of the intrinsic mechanisms by which RAG systems integrate internal (parametric) and external (retrieved) knowledge.
arXiv Detail & Related papers (2025-05-17T13:13:13Z)
Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs [47.06544781855325]
We propose a Fine-grained Neuron-level Knowledge Editing (FiNE) method that enhances editing locality without affecting success rates. By precisely identifying and modifying specific neurons within feed-forward networks, FiNE significantly improves knowledge localization and editing.
arXiv Detail & Related papers (2025-03-03T01:30:28Z)
Clear Minds Think Alike: What Makes LLM Fine-tuning Robust? A Study of Token Perplexity [61.48338027901318]
We show that fine-tuning with LLM-generated data improves target task performance and reduces out-of-domain degradation. This is the first mechanistic explanation for the superior OOD robustness conferred by LLM-generated training data.
arXiv Detail & Related papers (2025-01-24T08:18:56Z)
Adapter-based Approaches to Knowledge-enhanced Language Models -- A Survey [48.52320309766703]
Knowledge-enhanced language models (KELMs) have emerged as promising tools to bridge the gap between large-scale language models and domain-specific knowledge. KELMs can achieve higher factual accuracy and mitigate hallucinations by leveraging knowledge graphs (KGs).
arXiv Detail & Related papers (2024-11-25T14:10:24Z)
Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery [10.573861741540853]
KG Structure as Prompt is a novel approach for integrating structural information from a knowledge graph, such as common neighbor nodes and metapaths, into prompt-based learning. Experimental results on three types of biomedical and open-domain datasets under few-shot settings demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-07-26T14:07:00Z)
Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning [13.371405067535814]
This paper investigates the effectiveness of Supervised Fine-Tuning (SFT) as a method for knowledge injection in Large Language Models (LLMs). We compare different dataset generation strategies -- token-based and fact-based scaling -- to create training data that helps the model learn new information. Our results show considerable performance improvements in Q&A tasks related to out-of-domain knowledge.
arXiv Detail & Related papers (2024-03-30T01:56:07Z)
Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases [9.478012553728538]
We propose an end-to-end system design that uses Retrieval Augmented Generation (RAG) to improve the factual accuracy of Large Language Models (LLMs). Our system integrates a RAG pipeline with upstream dataset processing and downstream performance evaluation. Our experiments demonstrate the system's effectiveness in generating more accurate answers to domain-specific and time-sensitive inquiries.
arXiv Detail & Related papers (2024-03-15T16:30:14Z)
Pruning neural network models for gene regulatory dynamics using data and domain knowledge [24.670514977455202]
We propose DASH, a framework that guides network pruning by using domain-specific structural information in model fitting. We show that DASH, using knowledge about gene interaction partners within the putative regulatory network, outperforms general pruning methods by a large margin.
arXiv Detail & Related papers (2024-03-05T23:02:55Z)
A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches. We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective [106.92016199403042]
We empirically investigate knowledge transfer from larger to smaller models through a parametric perspective. We employ sensitivity-based techniques to extract and align knowledge-specific parameters between different large language models. Our findings highlight the critical factors contributing to the process of parametric knowledge transfer.
arXiv Detail & Related papers (2023-10-17T17:58:34Z)
Learning the Finer Things: Bayesian Structure Learning at the Instantiation Level [0.0]
Successful machine learning methods require a trade-off between memorization and generalization. We present a novel probabilistic graphical model structure learning approach that can learn, generalize and explain in elusive domains.
arXiv Detail & Related papers (2023-03-08T02:31:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.