Fine-Tuning Language Models to Know What They Know
- URL: http://arxiv.org/abs/2602.02605v1
- Date: Mon, 02 Feb 2026 04:08:13 GMT
- Title: Fine-Tuning Language Models to Know What They Know
- Authors: Sangjun Park, Elliot Meyerson, Xin Qiu, Risto Miikkulainen
- Abstract summary: This study proposes a framework to measure metacognitive ability $d_{\rm{type2}}'$ using a dual-prompt method. It then introduces Evolution Strategy for Metacognitive Alignment (ESMA) to bind a model's internal knowledge to its explicit behaviors. ESMA demonstrates robust generalization across diverse untrained settings, indicating an enhancement in the model's ability to reference its own knowledge.
- Score: 17.81468268125168
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Metacognition is a critical component of intelligence, specifically regarding the awareness of one's own knowledge. While humans rely on shared internal memory for both answering questions and reporting their knowledge state, this dependency in LLMs remains underexplored. This study proposes a framework to measure metacognitive ability $d_{\rm{type2}}'$ using a dual-prompt method, followed by the introduction of Evolution Strategy for Metacognitive Alignment (ESMA) to bind a model's internal knowledge to its explicit behaviors. ESMA demonstrates robust generalization across diverse untrained settings, indicating an enhancement in the model's ability to reference its own knowledge. Furthermore, parameter analysis attributes these improvements to a sparse set of significant modifications.
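The abstract does not spell out how $d_{\rm{type2}}'$ is computed, so the following is only a minimal sketch using the standard signal-detection definition, $d' = z(\text{hit rate}) - z(\text{false-alarm rate})$. It assumes the dual-prompt method yields, for each question, one flag for whether the model answered correctly and one for whether it claimed to know the answer. The function name `type2_d_prime`, the record layout, and the log-linear correction `eps` are illustrative assumptions, not the paper's implementation.

```python
from statistics import NormalDist


def type2_d_prime(records, eps=0.5):
    """Type-2 d' from (answered_correctly, claimed_to_know) pairs.

    Assumed layout (hypothetical): each record is a pair of booleans, one
    from an answering prompt and one from a knowledge-report prompt.
    """
    records = list(records)
    # "Hit": the model claims to know and is in fact correct.
    hits = sum(1 for correct, knows in records if correct and knows)
    misses = sum(1 for correct, knows in records if correct and not knows)
    # "False alarm": the model claims to know but is incorrect.
    false_alarms = sum(1 for correct, knows in records if not correct and knows)
    correct_rejections = sum(
        1 for correct, knows in records if not correct and not knows
    )

    # Log-linear correction keeps both rates strictly inside (0, 1),
    # so the inverse normal CDF is always defined.
    hit_rate = (hits + eps) / (hits + misses + 2 * eps)
    fa_rate = (false_alarms + eps) / (false_alarms + correct_rejections + 2 * eps)

    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    # Reports that track correctness perfectly give a positive d'.
    aligned = [(True, True)] * 10 + [(False, False)] * 10
    # Reports unrelated to correctness give a d' of about zero.
    unrelated = (
        [(True, True)] * 5
        + [(True, False)] * 5
        + [(False, True)] * 5
        + [(False, False)] * 5
    )
    print(type2_d_prime(aligned), type2_d_prime(unrelated))
```

Under this definition, a larger value means the model's "I know" reports discriminate better between questions it answers correctly and questions it gets wrong, while a value near zero means the reports carry no information about correctness.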
Related papers
- Know More, Know Clearer: A Meta-Cognitive Framework for Knowledge Augmentation in Large Language Models [80.21037538996553]
We propose a novel meta-cognitive framework for reliable knowledge augmentation via differentiated intervention and alignment. Our approach leverages internal cognitive signals to partition the knowledge space into mastered, confused, and missing regions, guiding targeted knowledge expansion. Our framework consistently outperforms strong baselines, validating its rationality in not only enhancing knowledge capabilities but also fostering cognitive behaviors that better distinguish knowns from unknowns.
arXiv Detail & Related papers (2026-02-13T15:07:35Z)
- Adapting Like Humans: A Metacognitive Agent with Test-time Reasoning [38.92106966820126]
Recent Vision-Language Models (VLMs) exhibit strong perceptual reasoning abilities, yet they often struggle to adapt efficiently when encountering novel tasks at test time. In contrast, humans leverage the metacognitive model with memory, enabling continuous strategy refinement through metacognitive control when faced with new challenges. We propose metacognitive test-time reasoning (MCTR), a framework that equips models with the ability to learn, adapt, and improve during test time through metacognitive self-updating.
arXiv Detail & Related papers (2025-11-28T15:15:47Z)
- Evidence for Limited Metacognition in LLMs [2.538209532048867]
We introduce a novel methodology for quantitatively evaluating metacognitive abilities in LLMs. Taking inspiration from research on metacognition in nonhuman animals, our approach eschews model self-reports and instead tests to what degree models can strategically deploy knowledge of internal states.
arXiv Detail & Related papers (2025-09-25T20:30:15Z)
- Towards Meta-Cognitive Knowledge Editing for Multimodal LLMs [71.8547241246169]
We introduce CogEdit, a novel benchmark designed to evaluate MLLMs' meta-cognitive knowledge editing abilities. We propose MIND, a framework that constructs a meta-knowledge memory for self-awareness, employs game-theoretic interactions to monitor knowledge activation, and incorporates label refinement for noise-robust updates.
arXiv Detail & Related papers (2025-09-06T13:26:04Z)
- Automatically Advancing LLM Expertise in Technology Judgment [1.1269582666887323]
Large language models (LLMs) are rapidly becoming core tools for science, engineering, and innovation. Despite their impressive ability to answer increasingly difficult questions, it remains unclear whether LLMs truly use their knowledge when confronted with new and challenging tasks. We evaluate a benchmark of 1.3 million post-2015 computer science patent pairs, characterized by dense technical jargon and strategically complex writing. We find that LLMs often fail our benchmark and struggle to distinguish among semantically similar patents.
arXiv Detail & Related papers (2025-05-18T15:04:02Z)
- Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation [77.10390725623125]
Retrieval-augmented generation (RAG) is widely employed to expand the knowledge scope of LLMs. Since RAG has shown promise in knowledge-intensive tasks like open-domain question answering, its broader application to complex tasks and intelligent assistants has further advanced its utility. We present a systematic investigation of the intrinsic mechanisms by which RAGs integrate internal (parametric) and external (retrieved) knowledge.
arXiv Detail & Related papers (2025-05-17T13:13:13Z)
- Do Large Language Models Know How Much They Know? [26.09437131644674]
Large Language Models (LLMs) have emerged as highly capable systems. A desired attribute of an intelligent system is its ability to recognize the scope of its own knowledge. This benchmark evaluates whether the models recall excessive, insufficient, or the precise amount of information.
arXiv Detail & Related papers (2025-02-26T21:33:06Z)
- How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training [92.88889953768455]
Large Language Models (LLMs) face a critical gap in understanding how they internalize new knowledge. We identify computational subgraphs that facilitate knowledge storage and processing.
arXiv Detail & Related papers (2025-02-16T16:55:43Z)
- Does Knowledge Localization Hold True? Surprising Differences Between Entity and Relation Perspectives in Language Models [20.157061521694096]
This study investigates the differences between entity and relational knowledge through knowledge editing.
To further elucidate the differences between entity and relational knowledge, we employ causal analysis to investigate how relational knowledge is stored in pre-trained models.
This insight highlights the multifaceted nature of knowledge storage in language models, underscoring the complexity of manipulating specific types of knowledge within these models.
arXiv Detail & Related papers (2024-09-01T05:09:11Z)
- Knowledge Mechanisms in Large Language Models: A Survey and Perspective [88.51320482620679]
This paper reviews knowledge mechanism analysis from a novel taxonomy including knowledge utilization and evolution. We discuss what knowledge LLMs have learned, the reasons for the fragility of parametric knowledge, and the potential dark knowledge (hypothesis) that will be challenging to address.
arXiv Detail & Related papers (2024-07-22T06:15:59Z)
- A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)