Deep Learning-based Method for Expressing Knowledge Boundary of Black-Box LLM
- URL: http://arxiv.org/abs/2602.10801v1
- Date: Wed, 11 Feb 2026 12:42:59 GMT
- Title: Deep Learning-based Method for Expressing Knowledge Boundary of Black-Box LLM
- Authors: Haotian Sheng, Heyong Wang, Ming Hong, Hongman He, Junqiu Liu
- Abstract summary: Large Language Models (LLMs) have achieved remarkable success; however, content generation distortion (hallucination) limits their practical applications. This paper proposes LSCL (LLM-Supervised Confidence Learning), a deep learning-based method for expressing the knowledge boundaries of black-box LLMs.
- Score: 5.711910452650628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have achieved remarkable success; however, content generation distortion (hallucination) limits their practical applications. The core cause of hallucination lies in LLMs' lack of awareness of their stored internal knowledge, which prevents them from expressing, as humans do, their knowledge state on questions beyond their internal knowledge boundaries. Existing research on knowledge boundary expression focuses primarily on white-box LLMs, leaving methods suitable for black-box LLMs, which offer only API access without revealing internal parameters, largely unexplored. Against this backdrop, this paper proposes LSCL (LLM-Supervised Confidence Learning), a deep learning-based method for expressing the knowledge boundaries of black-box LLMs. Built on the knowledge distillation framework, the method designs a deep learning model that takes the input question, output answer, and token probabilities from a black-box LLM as inputs and constructs a mapping between these inputs and the model's internal knowledge state, enabling the quantification and expression of the black-box LLM's knowledge boundaries. Experiments on diverse public datasets and with multiple prominent black-box LLMs demonstrate that LSCL effectively assists black-box LLMs in accurately expressing their knowledge boundaries, significantly outperforming existing baseline models on metrics such as accuracy and recall. Furthermore, for scenarios where a black-box LLM does not expose token probabilities, an adaptive alternative method is proposed; its performance is close to that of LSCL and surpasses the baseline models.
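The mechanism the abstract describes, supervising a small confidence model with an LLM's correctness judgments over token-probability inputs, can be sketched as follows. This is an illustrative toy, not the paper's architecture: the feature set, the logistic-regression learner, and the function names are all assumptions.

```python
import numpy as np

def extract_features(token_probs):
    """Collapse per-token probabilities into fixed-size confidence features:
    mean, minimum, and length-normalized (geometric-mean) probability."""
    p = np.asarray(token_probs, dtype=float)
    return np.array([p.mean(), p.min(), np.exp(np.log(p).mean())])

def train_confidence_model(prob_seqs, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression confidence model by gradient descent.
    labels[i] = 1 if the supervising LLM judged answer i correct, else 0."""
    X = np.stack([extract_features(p) for p in prob_seqs])
    y = np.asarray(labels, dtype=float)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        pred = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid of the logit
        grad = pred - y                            # dLoss/dlogit per example
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

def confidence(token_probs, w, b):
    """Estimated probability that a question lies inside the LLM's knowledge."""
    z = extract_features(token_probs) @ w + b
    return 1.0 / (1.0 + np.exp(-z))
```

In the paper's setting the labels would come from LLM-supervised judgments of answer correctness (the distillation signal), and the question and answer text would be encoded as additional inputs by the deep model.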
Related papers
- Black-Box Membership Inference Attack for LVLMs via Prior Knowledge-Calibrated Memory Probing [25.68362027128315]
Large vision-language models (LVLMs) derive their capabilities from extensive training on vast corpora of visual and textual data. We propose the first black-box MIA framework for LVLMs, based on a prior knowledge-calibrated memory probing mechanism. Our method effectively identifies training data of LVLMs in a purely black-box setting and even achieves performance comparable to gray-box and white-box methods.
arXiv Detail & Related papers (2025-11-03T13:16:30Z) - On the Evolution of Federated Post-Training Large Language Models: A Model Accessibility View [82.19096285469115]
Federated Learning (FL) enables training models across decentralized data silos while preserving client data privacy. Recent research has explored efficient methods for post-training large language models (LLMs) within FL to address computational and communication challenges. An inference-only paradigm (black-box FedLLM) has emerged to address these limitations.
arXiv Detail & Related papers (2025-08-22T09:52:31Z) - Efficient Knowledge Probing of Large Language Models by Adapting Pre-trained Embeddings [27.08405655200845]
Large language models (LLMs) acquire knowledge across diverse domains such as science, history, and geography. Existing probing methods require making forward passes through the underlying model to probe the LLM's knowledge about a specific fact. We propose embedding models that effectively encode factual knowledge as text or graphs, serving as proxies for LLMs.
arXiv Detail & Related papers (2025-08-08T05:32:31Z) - Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems [16.995977750934887]
We test whether large language models (LLMs) can identify a black-box function from passively observed versus actively collected data. We show that LLMs fail to extract information from observations, reaching a performance plateau that falls short of the ideal of Bayesian inference. By providing the intervention data from one LLM to another, we show that the improvement from active data is partly a result of engaging in the process of generating effective interventions.
arXiv Detail & Related papers (2025-05-23T14:37:36Z) - MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model [54.14155564592936]
We propose a Mixture of Rule Experts guided by a Large Language Model (MoRE-LLM). MoRE-LLM steers the discovery of local rule-based surrogates during training and their utilization for the classification task. The LLM is responsible for enhancing the domain-knowledge alignment of the rules by correcting and contextualizing them.
arXiv Detail & Related papers (2025-03-26T11:09:21Z) - LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression. LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model. Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
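The summary above describes a per-feature weighted Lasso penalty. A minimal sketch of that mechanism, assuming hypothetical penalty weights already produced by an LLM; the solver below is plain proximal gradient (ISTA), not the paper's tunable model.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding: the prox of the weighted L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso(X, y, penalty_weights, lam=0.1, epochs=2000):
    """Minimize (1/2n)||y - X beta||^2 + lam * sum_j w_j |beta_j| by
    proximal gradient descent. Features the LLM deems relevant get a
    small weight w_j and are therefore more likely to be retained."""
    n, d = X.shape
    w = np.asarray(penalty_weights, dtype=float)
    beta = np.zeros(d)
    step = n / np.linalg.norm(X, 2) ** 2  # 1/L for the smooth quadratic part
    for _ in range(epochs):
        grad = X.T @ (X @ beta - y) / n   # gradient of the squared-error term
        beta = soft_threshold(beta - step * grad, step * lam * w)
    return beta
```

On data where only the first feature is truly predictive, assigning it a low penalty weight keeps its coefficient near the true value while the uniformly penalized features shrink to zero.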
arXiv Detail & Related papers (2025-02-15T02:55:22Z) - SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs? [35.237427998489785]
We propose a novel Selected-Expert Unlearning Framework (SEUF) for Mixture-of-Experts (MoE) LLMs. Through expert attribution, unlearning is concentrated on the most actively engaged experts for the specified knowledge. SEUF is compatible with various standard unlearning algorithms.
arXiv Detail & Related papers (2024-11-27T22:46:08Z) - Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model.
We employ a proxy model with far fewer parameters and take its answers as heuristic answers.
Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
arXiv Detail & Related papers (2024-02-19T11:11:08Z) - Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation [109.8527403904657]
We show that large language models (LLMs) possess unwavering confidence in their knowledge and cannot handle the conflict between internal and external knowledge well.
Retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries.
We propose a simple method to dynamically utilize supporting documents with our judgement strategy.
arXiv Detail & Related papers (2023-07-20T16:46:10Z) - Augmented Large Language Models with Parametric Knowledge Guiding [72.71468058502228]
Large Language Models (LLMs) have significantly advanced natural language processing (NLP) with their impressive language understanding and generation capabilities.
Their performance may be suboptimal for domain-specific tasks that require specialized knowledge due to limited exposure to the related data.
We propose the novel Parametric Knowledge Guiding (PKG) framework, which equips LLMs with a knowledge-guiding module to access relevant knowledge.
arXiv Detail & Related papers (2023-05-08T15:05:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.