Characterizing Truthfulness in Large Language Model Generations with
Local Intrinsic Dimension
- URL: http://arxiv.org/abs/2402.18048v1
- Date: Wed, 28 Feb 2024 04:56:21 GMT
- Title: Characterizing Truthfulness in Large Language Model Generations with
Local Intrinsic Dimension
- Authors: Fan Yin, Jayanth Srinivasa, Kai-Wei Chang
- Abstract summary: We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs).
We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
- Score: 63.330262740414646
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study how to characterize and predict the truthfulness of texts generated
from large language models (LLMs), which serves as a crucial step in building
trust between humans and LLMs. Although several approaches based on entropy or
verbalized uncertainty have been proposed to calibrate model predictions, these
methods are often intractable, sensitive to hyperparameters, and less reliable
when applied in generative tasks with LLMs. In this paper, we suggest
investigating internal activations and quantifying LLM's truthfulness using the
local intrinsic dimension (LID) of model activations. Through experiments on
four question answering (QA) datasets, we demonstrate the effectiveness
of our proposed method. Additionally,
we study intrinsic dimensions in LLMs and their relations with model layers,
autoregressive language modeling, and the training of LLMs, revealing that
intrinsic dimensions can be a powerful approach to understanding LLMs.
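As an illustration of what a local intrinsic dimension estimate looks like in practice, the sketch below implements a generic maximum-likelihood estimator in the style of Levina and Bickel. It is not necessarily the estimator used in the paper, and the name `lid_mle`, the parameter `k`, and the synthetic data are assumptions made for this example. In the paper's setting, `query` would be an activation vector for one generation and `reference` a set of activation vectors from other generations.

```python
import math
import random


def lid_mle(query, reference, k=20):
    """Maximum-likelihood estimate of the local intrinsic dimension at `query`.

    Hypothetical helper for illustration: uses the k nearest neighbours of
    `query` among `reference` and returns the inverse of the mean log ratio
    between the k-th neighbour distance and each closer neighbour distance.
    """
    dists = sorted(math.dist(query, p) for p in reference)
    # Drop zero distances (e.g. the query itself appearing in the reference set).
    dists = [d for d in dists if d > 0.0][:k]
    if len(dists) < 2:
        raise ValueError("need at least two neighbours at non-zero distance")
    r_k = dists[-1]
    mean_log_ratio = sum(math.log(r_k / r) for r in dists[:-1]) / (len(dists) - 1)
    if mean_log_ratio == 0.0:
        raise ValueError("all neighbour distances are equal; estimate undefined")
    return 1.0 / mean_log_ratio


# Synthetic check: points on a 2-D plane embedded in a 10-D space.
# The estimate at the origin should come out close to 2, not 10.
random.seed(0)
plane = [
    [random.uniform(-1, 1), random.uniform(-1, 1)] + [0.0] * 8
    for _ in range(5000)
]
print(lid_mle([0.0] * 10, plane, k=100))
```

The choice of `k` trades variance against locality: a small `k` gives a noisy estimate, while a large `k` reaches beyond the neighbourhood in which the data is approximately flat.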
Related papers
- CogSteer: Cognition-Inspired Selective Layer Intervention for Efficient Semantic Steering in Large Language Models [22.42235251921268]
We propose using eye movement measures to interpret large language models (LLMs) behavior across layers.
Inspired by these findings, we introduce a steering layer selection and apply it to layer intervention methods via fine-tuning and inference.
Our proposed CogSteer methods achieve better results in terms of toxicity scores while efficiently saving 97% of the computational resources and 60% of the training time.
arXiv Detail & Related papers (2024-10-23T09:40:15Z)
- Zero-shot Model-based Reinforcement Learning using Large Language Models [12.930241182192988]
We investigate how pre-trained Large Language Models can be leveraged to predict in context the dynamics of continuous Markov decision processes.
We present proof-of-concept applications in two reinforcement learning settings: model-based policy evaluation and data-augmented off-policy reinforcement learning.
arXiv Detail & Related papers (2024-10-15T15:46:53Z)
- Explaining Large Language Models Decisions with Shapley Values [1.223779595809275]
Large language models (LLMs) have opened up exciting possibilities for simulating human behavior and cognitive processes.
However, the validity of utilizing LLMs as stand-ins for human subjects remains uncertain.
This paper presents a novel approach based on Shapley values to interpret LLM behavior and quantify the relative contribution of each prompt component to the model's output.
arXiv Detail & Related papers (2024-03-29T22:49:43Z)
- Towards Modeling Learner Performance with Large Language Models [7.002923425715133]
This paper investigates whether the pattern recognition and sequence modeling capabilities of LLMs can be extended to the domain of knowledge tracing.
We compare two approaches to using LLMs for this task, zero-shot prompting and model fine-tuning, with existing, non-LLM approaches to knowledge tracing.
While LLM-based approaches do not achieve state-of-the-art performance, fine-tuned LLMs surpass the performance of naive baseline models and perform on par with standard Bayesian Knowledge Tracing approaches.
arXiv Detail & Related papers (2024-02-29T14:06:34Z)
- LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also introducing a framework based on the roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
- Knowledge Fusion of Large Language Models [73.28202188100646]
This paper introduces the notion of knowledge fusion for large language models (LLMs).
We externalize their collective knowledge and unique strengths, thereby elevating the capabilities of the target model beyond those of any individual source LLM.
Our findings confirm that the fusion of LLMs can improve the performance of the target model across a range of capabilities such as reasoning, commonsense, and code generation.
arXiv Detail & Related papers (2024-01-19T05:02:46Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
- Pareto Optimal Learning for Estimating Large Language Model Errors [12.21899680905672]
Large Language Models (LLMs) have shown impressive abilities in many applications.
We present a method that generates a risk score to estimate the probability of error in an LLM response by integrating multiple sources of information.
arXiv Detail & Related papers (2023-06-28T21:11:15Z)
- Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning [104.58874584354787]
In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning.
This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
arXiv Detail & Related papers (2023-01-27T18:59:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.