Challenges with unsupervised LLM knowledge discovery
- URL: http://arxiv.org/abs/2312.10029v2
- Date: Mon, 18 Dec 2023 16:43:35 GMT
- Title: Challenges with unsupervised LLM knowledge discovery
- Authors: Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger,
Vladimir Mikulik, Rohin Shah
- Abstract summary: We show that existing unsupervised methods on large language model (LLM) activations do not discover knowledge.
The idea behind unsupervised knowledge elicitation is that knowledge satisfies a consistency structure, which can be used to discover knowledge.
- Score: 15.816138136030705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show that existing unsupervised methods on large language model (LLM)
activations do not discover knowledge -- instead they seem to discover whatever
feature of the activations is most prominent. The idea behind unsupervised
knowledge elicitation is that knowledge satisfies a consistency structure,
which can be used to discover knowledge. We first prove theoretically that
arbitrary features (not just knowledge) satisfy the consistency structure of a
particular leading unsupervised knowledge-elicitation method,
contrast-consistent search (Burns et al. - arXiv:2212.03827). We then present a
series of experiments showing settings in which unsupervised methods result in
classifiers that do not predict knowledge, but instead predict a different
prominent feature. We conclude that existing unsupervised methods for
discovering latent knowledge are insufficient, and we contribute sanity checks
to apply when evaluating future knowledge elicitation methods. Conceptually, we
hypothesise that the identification issues explored here, e.g. distinguishing a
model's knowledge from that of a simulated character, will persist for future
unsupervised methods.
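The consistency structure the abstract refers to can be made concrete. In contrast-consistent search (CCS), a probe assigns probabilities to the affirmative and negated phrasings of each statement, and is trained so that the two probabilities sum to one while avoiding the degenerate 0.5/0.5 solution. The sketch below, a simplified illustration rather than the authors' implementation (the full method also fits a linear probe on normalized activations), shows the loss and why an arbitrary binary feature, not just truth, can drive it to zero:

```python
import numpy as np

def ccs_loss(p_pos, p_neg):
    """Contrast-consistent search objective (Burns et al., arXiv:2212.03827).

    p_pos: probe probabilities on the affirmatively phrased statements
    p_neg: probe probabilities on the negated phrasings of the same statements
    """
    # Consistency term: p(x+) should equal 1 - p(x-)
    consistency = (p_pos - (1.0 - p_neg)) ** 2
    # Confidence term: penalises the degenerate solution p+ = p- = 0.5
    confidence = np.minimum(p_pos, p_neg) ** 2
    return np.mean(consistency + confidence)

# Any feature that flips between the paired prompts satisfies the structure,
# whether or not it tracks truth: take an arbitrary binary feature f and set
# p+ = f, p- = 1 - f.
f = np.array([1.0, 0.0, 1.0, 1.0, 0.0])  # arbitrary feature, not "knowledge"
print(ccs_loss(f, 1.0 - f))              # -> 0.0
```

This is the paper's theoretical point in miniature: the objective constrains only the relationship between the paired outputs, so it cannot by itself single out knowledge among the features that satisfy it.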
Related papers
- FaithUn: Toward Faithful Forgetting in Language Models by Investigating the Interconnectedness of Knowledge [24.858928681280634]
We define a new concept called superficial unlearning, which refers to the phenomenon where an unlearning method fails to erase interconnected knowledge.
Based on the definition, we introduce a new benchmark, FaithUn, to analyze and evaluate the faithfulness of unlearning in real-world knowledge QA settings.
We propose a novel unlearning method, KLUE, which updates only knowledge-related neurons to achieve faithful unlearning.
arXiv Detail & Related papers (2025-02-26T15:11:03Z) - Knowledge Discovery using Unsupervised Cognition [2.6563873893593826]
Unsupervised Cognition is a novel unsupervised learning algorithm that focuses on modelling the learned data.
This paper presents three techniques to perform knowledge discovery over an already trained Unsupervised Cognition model.
arXiv Detail & Related papers (2024-09-30T08:07:29Z) - Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs [55.317267269115845]
Chain-of-Knowledge (CoK) is a comprehensive framework for knowledge reasoning.
CoK includes methodologies for both dataset construction and model learning.
We conduct extensive experiments with KnowReason.
arXiv Detail & Related papers (2024-06-30T10:49:32Z) - Deciphering Raw Data in Neuro-Symbolic Learning with Provable Guarantees [17.58485742162185]
Neuro-symbolic hybrid systems are promising for integrating machine learning and symbolic reasoning.
It remains unclear why a hybrid system succeeds for a specific task and when it may fail given a different knowledge base.
We introduce a novel way of characterising supervision signals from a knowledge base, and establish a criterion for determining the knowledge's efficacy in facilitating successful learning.
arXiv Detail & Related papers (2023-08-21T06:04:53Z) - UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge.
With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z) - Knowledge-augmented Deep Learning and Its Applications: A Survey [60.221292040710885]
knowledge-augmented deep learning (KADL) aims to identify domain knowledge and integrate it into deep models for data-efficient, generalizable, and interpretable deep learning.
This survey subsumes existing works and offers a bird's-eye view of research in the general area of knowledge-augmented deep learning.
arXiv Detail & Related papers (2022-11-30T03:44:15Z) - Eliciting Knowledge from Large Pre-Trained Models for Unsupervised Knowledge-Grounded Conversation [45.95864432188745]
Recent advances in large-scale pre-training provide large models with the potential to learn knowledge from the raw text.
We propose various methods that best elicit knowledge from large models.
Our human study indicates that, though hallucinations exist, large models possess the unique advantage of being able to output common sense.
arXiv Detail & Related papers (2022-11-03T04:48:38Z) - Causal Imitation Learning with Unobserved Confounders [82.22545916247269]
We study imitation learning when sensory inputs of the learner and the expert differ.
We show that imitation could still be feasible by exploiting quantitative knowledge of the expert trajectories.
arXiv Detail & Related papers (2022-08-12T13:29:53Z) - A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA [67.75989848202343]
This paper presents a unified end-to-end retriever-reader framework towards knowledge-based VQA.
We shed light on the multi-modal implicit knowledge from vision-language pre-training models to mine its potential in knowledge reasoning.
Our scheme not only provides guidance for knowledge retrieval, but also drops instances that are potentially error-prone for question answering.
arXiv Detail & Related papers (2022-06-30T02:35:04Z) - Exploratory Machine Learning with Unknown Unknowns [60.78953456742171]
We study a new problem setting in which there are unknown classes in the training data misperceived as other labels.
We propose the exploratory machine learning, which examines and investigates training data by actively augmenting the feature space to discover potentially hidden classes.
arXiv Detail & Related papers (2020-02-05T02:06:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.