Frame Representation Hypothesis: Multi-Token LLM Interpretability and Concept-Guided Text Generation
- URL: http://arxiv.org/abs/2412.07334v2
- Date: Thu, 12 Dec 2024 05:01:04 GMT
- Title: Frame Representation Hypothesis: Multi-Token LLM Interpretability and Concept-Guided Text Generation
- Authors: Pedro H. V. Valois, Lincon S. Souza, Erica K. Shimomoto, Kazuhiro Fukui
- Abstract summary: Interpretability is a key challenge in fostering trust for Large Language Models. We present the Frame Representation Hypothesis to interpret and control LLMs by modeling multi-token words. We showcase these tools through Top-k Concept-Guided Decoding, which can intuitively steer text generation using concepts of choice.
- Score: 6.356639602091336
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Interpretability is a key challenge in fostering trust for Large Language Models (LLMs), stemming from the complexity of extracting reasoning from a model's parameters. We present the Frame Representation Hypothesis, a theoretically robust framework grounded in the Linear Representation Hypothesis (LRH), to interpret and control LLMs by modeling multi-token words. Prior research explored the LRH to connect LLM representations with linguistic concepts, but was limited to single-token analysis. As most words are composed of several tokens, we extend the LRH to multi-token words, thereby enabling its use on any textual data with thousands of concepts. To this end, we propose that words can be interpreted as frames: ordered sequences of vectors that better capture token-word relationships. Concepts can then be represented as the average of the word frames sharing a common concept. We showcase these tools through Top-k Concept-Guided Decoding, which can intuitively steer text generation using concepts of choice. We verify these ideas on the Llama 3.1, Gemma 2, and Phi 3 families, demonstrating gender and language biases, exposing harmful content, and also showing the potential to remediate them, leading to safer and more transparent LLMs. Code is available at https://github.com/phvv-me/frame-representation-hypothesis.git
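To make the abstract's constructions concrete, here is a minimal NumPy sketch of the two central ideas, word frames and Top-k Concept-Guided Decoding, on a toy random embedding matrix. The names (`word_frame`, `concept_frame`, `topk_concept_guided_step`), the fixed frame length, and the cosine re-ranking score are illustrative assumptions, not the paper's exact implementation; see the linked repository for the real one.

```python
import numpy as np

# Toy stand-in for an LLM's token (un)embedding matrix; in the paper's
# setting these vectors come from the model itself.
rng = np.random.default_rng(0)
VOCAB, DIM = 50, 16
E = rng.normal(size=(VOCAB, DIM))

def word_frame(token_ids, num_tokens=3):
    """A word as a frame: the ordered sequence (matrix) of its token
    vectors. Padding to a fixed length so frames can be averaged is a
    simplification for this sketch."""
    rows = [E[t] for t in token_ids[:num_tokens]]
    while len(rows) < num_tokens:
        rows.append(np.zeros(DIM))
    return np.stack(rows)                      # shape: (num_tokens, DIM)

def concept_frame(words):
    """A concept as the average of the frames of words sharing it."""
    return np.mean([word_frame(w) for w in words], axis=0)

def topk_concept_guided_step(logits, concept, k=5):
    """One decoding step: among the k most likely next tokens, pick the
    one most similar to the concept. Cosine similarity against the
    concept frame's mean row is an illustrative stand-in for the paper's
    frame-based score."""
    topk = np.argsort(logits)[-k:]             # k most likely tokens
    c = concept.mean(axis=0)
    c = c / np.linalg.norm(c)
    sims = (E[topk] @ c) / np.linalg.norm(E[topk], axis=1)
    return int(topk[np.argmax(sims)])          # steered choice

# Hypothetical multi-token words (lists of token ids) sharing one concept.
concept = concept_frame([[3, 17, 9], [3, 21], [12, 17, 4]])
fake_logits = rng.normal(size=VOCAB)           # stand-in model scores
print("steered next token id:", topk_concept_guided_step(fake_logits, concept))
```

In a real setting, E would be the LLM's embedding matrix and the logits would come from the model; restricting the steered choice to the top-k candidates keeps generation among tokens the model already considers likely, which is what makes the guidance gentle rather than forced.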
Related papers
- Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching [60.04718679054704]
We introduce Sketch-of-Thought (SoT), a novel prompting framework.
It combines cognitive-inspired reasoning paradigms with linguistic constraints to minimize token usage.
SoT achieves token reductions of 76% with negligible accuracy impact.
arXiv Detail & Related papers (2025-03-07T06:57:17Z)
- SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization [70.11167263638562]
Social relation reasoning aims to identify relation categories such as friends, spouses, and colleagues from images.
We first present a simple yet well-crafted framework named SocialGPT, which combines the perception capability of Vision Foundation Models (VFMs) and the reasoning capability of Large Language Models (LLMs) in a modular design.
arXiv Detail & Related papers (2024-10-28T18:10:26Z)
- Language Models as Semiotic Machines: Reconceptualizing AI Language Systems through Structuralist and Post-Structuralist Theories of Language [0.0]
This paper proposes a novel framework for understanding large language models (LLMs).
I argue that LLMs should be understood as models of language itself, aligning with Jacques Derrida's concept of 'writing' (l'écriture).
I apply Derrida's critique of Saussure to position 'writing' as the object modeled by LLMs, offering a view of the machine's 'mind' as a statistical approximation of sign behavior.
arXiv Detail & Related papers (2024-10-16T21:45:54Z)
- A Concept-Based Explainability Framework for Large Multimodal Models [52.37626977572413]
We propose a dictionary learning based approach, applied to the representation of tokens. We show that these concepts are well semantically grounded in both vision and text. We show that the extracted multimodal concepts are useful to interpret representations of test samples.
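As a rough illustration of the general technique this entry names (dictionary learning over token representations), here is a minimal scikit-learn sketch with random stand-in activations; the matrix X, the component count, and the sparsity settings are assumptions, and the paper's actual multimodal pipeline is not reproduced.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Hypothetical matrix of token representations (n_tokens x hidden_dim);
# random here, hidden states from a multimodal model in the paper's setting.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))

# Learn a small dictionary whose atoms act as candidate "concepts": each
# token representation is approximated as a sparse combination of atoms.
dl = DictionaryLearning(n_components=10, transform_algorithm="lasso_lars",
                        transform_alpha=0.1, random_state=0)
codes = dl.fit_transform(X)        # sparse concept activations per token
atoms = dl.components_             # concept directions (10 x 64)

# Interpreting a test sample: its strongest concept activations.
test_codes = dl.transform(X[:1])
print("top concepts for sample 0:", np.argsort(-np.abs(test_codes[0]))[:3])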
arXiv Detail & Related papers (2024-06-12T10:48:53Z)
- Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs [63.29737699997859]
Large Language Models (LLMs) have demonstrated impressive performance on multimodal tasks, without any multimodal finetuning.
In this work, we expose frozen LLMs to image, video, audio and text inputs and analyse their internal representation.
arXiv Detail & Related papers (2024-05-26T21:31:59Z)
- Deep de Finetti: Recovering Topic Distributions from Large Language Models [10.151434138893034]
Large language models (LLMs) can produce long, coherent passages of text.
To do so, LLMs must represent the latent structure that characterizes a document.
We investigate a complementary aspect, namely the document's topic structure.
arXiv Detail & Related papers (2023-12-21T16:44:39Z)
- AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations [52.43593893122206]
AlignedCoT is an in-context learning technique for prompting Large Language Models.
It achieves consistent and correct step-wise prompts in zero-shot scenarios.
We conduct experiments on mathematical reasoning and commonsense reasoning.
arXiv Detail & Related papers (2023-11-22T17:24:21Z)
- Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization [52.935150075484074]
We introduce a well-designed visual tokenizer to translate the non-linguistic image into a sequence of discrete tokens like a foreign language.
The resulting visual tokens encompass high-level semantics worthy of a word and support a dynamic sequence length that varies with the image.
This unification empowers the proposed model, LaVIT, to serve as an impressive generalist interface that understands and generates multi-modal content simultaneously.
arXiv Detail & Related papers (2023-09-09T03:01:38Z)
- IERL: Interpretable Ensemble Representation Learning -- Combining CrowdSourced Knowledge and Distributed Semantic Representations [11.008412414253662]
Large Language Models (LLMs) encode meanings of words in the form of distributed semantics.
Recent studies have shown that LLMs tend to generate unintended, inconsistent, or wrong texts as outputs.
We propose a novel ensemble learning method, Interpretable Ensemble Representation Learning (IERL), that systematically combines LLM and crowdsourced knowledge representations.
arXiv Detail & Related papers (2023-06-24T05:02:34Z)
- ThinkSum: Probabilistic reasoning over sets using large language models [18.123895485602244]
We propose a two-stage probabilistic inference paradigm, ThinkSum, which reasons over sets of objects or facts in a structured manner.
We demonstrate the possibilities and advantages of ThinkSum on the BIG-bench suite of LLM evaluation tasks.
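As a loose sketch of the two-stage pattern this entry describes (think: score each member of a set; sum: aggregate the resulting probabilities), here is a toy Python version; the logprob stub, the statement template, and the averaging rule are assumptions standing in for real language-model queries and the paper's aggregation.

```python
import math

# Toy stand-in for an LLM log-probability query; ThinkSum's setting would
# issue a real language-model call for each element of the set.
def logprob(statement: str) -> float:
    return -len(statement) / 10.0

def think_sum(objects, template):
    # Think: score every object independently with the (stub) LM.
    probs = [math.exp(logprob(template.format(x))) for x in objects]
    # Sum: aggregate the per-object probabilities into one answer score.
    return sum(probs) / len(probs)

print(think_sum(["sparrow", "penguin", "ostrich"], "A {} can fly."))
```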
arXiv Detail & Related papers (2022-10-04T00:34:01Z)