Do Large Language Models Know What They Don't Know?
- URL: http://arxiv.org/abs/2305.18153v2
- Date: Tue, 30 May 2023 15:14:06 GMT
- Title: Do Large Language Models Know What They Don't Know?
- Authors: Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang
- Abstract summary: Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks.
Despite their vast knowledge, LLMs are still limited by the amount of information they can accommodate and comprehend.
This study aims to evaluate LLMs' self-knowledge by assessing their ability to identify unanswerable or unknowable questions.
- Score: 74.65014158544011
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have a wealth of knowledge that allows them to
excel in various Natural Language Processing (NLP) tasks. Current research
focuses on enhancing their performance within their existing knowledge. Despite
their vast knowledge, LLMs are still limited by the amount of information they
can accommodate and comprehend. Therefore, the ability to understand their own
limitations on the unknowns, referred to as self-knowledge, is of paramount
importance. This study aims to evaluate LLMs' self-knowledge by assessing their
ability to identify unanswerable or unknowable questions. We introduce an
automated methodology to detect uncertainty in the responses of these models,
providing a novel measure of their self-knowledge. We further introduce a
unique dataset, SelfAware, consisting of unanswerable questions from five
diverse categories and their answerable counterparts. Our extensive analysis,
involving 20 LLMs including GPT-3, InstructGPT, and LLaMA, reveals an
intrinsic capacity for self-knowledge within these models. Moreover, we
demonstrate that in-context learning and instruction tuning can further enhance
this self-knowledge. Despite this promising insight, our findings also
highlight a considerable gap between the capabilities of these models and human
proficiency in recognizing the limits of their knowledge.
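The abstract does not spell out the automated uncertainty-detection methodology. A minimal sketch of one plausible implementation, assuming a hypothetical list of reference "I don't know"-style phrases and an off-the-shelf sentence encoder (sentence-transformers), is given below; the phrase list, model name, and threshold are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch: flag an LLM response as "uncertain" (i.e., the model
# signals it does not know) by comparing it against reference uncertainty
# phrases with sentence-embedding similarity. All phrases, the encoder, and
# the threshold are illustrative assumptions, not the paper's settings.
from sentence_transformers import SentenceTransformer, util

REFERENCE_UNCERTAIN = [
    "I don't know the answer to that.",
    "There is no way to know for certain.",
    "It is impossible to answer this question.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works
ref_emb = encoder.encode(REFERENCE_UNCERTAIN, convert_to_tensor=True)

def is_uncertain(response: str, threshold: float = 0.75) -> bool:
    """Return True if the response resembles a refusal / expression of ignorance."""
    # Split the response into rough sentences and compare each against the references.
    sentences = [s.strip() for s in response.split(".") if s.strip()]
    if not sentences:
        return False
    sent_emb = encoder.encode(sentences, convert_to_tensor=True)
    sim = util.cos_sim(sent_emb, ref_emb)  # shape: (num_sentences, num_references)
    return bool(sim.max() >= threshold)

# Example: an unanswerable question in a SelfAware-style setting.
print(is_uncertain("As an AI, I cannot know what you dreamed last night."))
```

In a SelfAware-style evaluation, such a classifier would be run over a model's responses to both unanswerable and answerable questions, and self-knowledge scored by how often uncertainty is expressed on the former but not the latter.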
Related papers
- Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which requires combining multiple pieces of knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z)
- Into the Unknown: Self-Learning Large Language Models [0.0]
We introduce a concept called Point in the Unknown (PiU) to identify atomic knowledge unknown to a model.
We develop evaluation metrics to gauge an LLM's self-learning capability.
arXiv Detail & Related papers (2024-02-14T12:56:58Z)
- RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge [69.79676144482792]
This study aims to evaluate the ability of LLMs to distinguish reliable information from external knowledge.
Our benchmark consists of two tasks, Question Answering and Text Generation, and for each task, we provide models with a context containing counterfactual information.
arXiv Detail & Related papers (2023-11-14T13:24:19Z)
- Exploring the Cognitive Knowledge Structure of Large Language Models: An Educational Diagnostic Assessment Approach [50.125704610228254]
Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence.
Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains.
We conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom's taxonomy.
arXiv Detail & Related papers (2023-10-12T09:55:45Z)
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models [59.771098292611846]
Large language models (LLMs) have shown superior performance without task-specific fine-tuning.
Retrieval-based methods can offer non-parametric world knowledge and improve the performance on tasks such as question answering.
Self-Knowledge guided Retrieval augmentation (SKR) is a simple yet effective method that lets LLMs refer to questions they have previously encountered.
arXiv Detail & Related papers (2023-10-08T04:22:33Z)
- Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation [91.30946119104111]
We show that large language models (LLMs) possess unwavering confidence in their ability to answer questions.
Retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries.
We also find that LLMs have a propensity to rely on the provided retrieval results when formulating answers.
arXiv Detail & Related papers (2023-07-20T16:46:10Z)