States Hidden in Hidden States: LLMs Emerge Discrete State Representations Implicitly
- URL: http://arxiv.org/abs/2407.11421v1
- Date: Tue, 16 Jul 2024 06:27:22 GMT
- Title: States Hidden in Hidden States: LLMs Emerge Discrete State Representations Implicitly
- Authors: Junhao Chen, Shengding Hu, Zhiyuan Liu, Maosong Sun
- Abstract summary: In this paper, we uncover LLMs' intrinsic ability to perform extended sequences of calculations without relying on chain-of-thought, step-by-step solutions.
Remarkably, the most advanced models can directly output the sum of additions involving up to 15 two-digit addends.
- Score: 72.24742240125369
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) exhibit various emergent abilities. Among these abilities, some might reveal the internal working mechanisms of models. In this paper, we uncover a novel emergent capability in models: the intrinsic ability to perform extended sequences of calculations without relying on chain-of-thought, step-by-step solutions. Remarkably, the most advanced models can directly output the sum of additions involving up to 15 two-digit addends. We hypothesize that the model forms Implicit Discrete State Representations (IDSRs) within its hidden states and performs symbolic calculations internally. To test this hypothesis, we design a sequence of experiments that probe the hidden states. Specifically, we first confirm that IDSRs exist. Then, we provide interesting observations about the formation of IDSRs from layer, digit, and sequence perspectives. Finally, we confirm that models indeed use IDSRs to produce the final answers. However, we also discover that these state representations are far from lossless in current open-source models, leading to inaccuracies in their final performance. Our work presents a novel exploration of LLMs' symbolic calculation abilities and the underlying mechanisms.
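The probing setup is only alluded to above, but a standard way to test whether an intermediate quantity is encoded in hidden states is a linear probe. Below is a minimal sketch, assuming GPT-2 as a stand-in subject model and the running sum modulo 100 as the probed state; the model choice, layer index, and data sizes are illustrative assumptions, not the authors' setup.

```python
# Hedged sketch: linear probe for implicit discrete state representations.
# Assumptions (not from the paper): GPT-2 as the subject model, layer 8 as the
# probe site, and the running sum mod 100 as the decoded "state".
import random
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

LAYER = 8
feats, labels = [], []
for _ in range(500):
    addends = [random.randint(10, 99) for _ in range(5)]
    prompt = "+".join(map(str, addends)) + "="
    with torch.no_grad():
        hs = model(**tok(prompt, return_tensors="pt")).hidden_states[LAYER]
    feats.append(hs[0, -1].numpy())     # activation at the last prompt token
    labels.append(sum(addends) % 100)   # candidate "state": sum mod 100
probe = LogisticRegression(max_iter=1000).fit(feats[:400], labels[:400])
print("held-out probe accuracy:", probe.score(feats[400:], labels[400:]))
```

Held-out accuracy well above the 1% chance level would indicate that the intermediate sum is linearly decodable from that layer, the kind of evidence the abstract's layer, digit, and sequence analyses build on.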
Related papers
- Exploring Diverse Representations for Open Set Recognition [51.39557024591446]
Open set recognition (OSR) requires a model to classify samples from known classes while rejecting unknown samples at test time.
Currently, generative models often perform better than discriminative models in OSR.
We propose a new model, namely Multi-Expert Diverse Attention Fusion (MEDAF), that learns diverse representations in a discriminative way.
arXiv Detail & Related papers (2024-01-12T11:40:22Z)
- Emergence of Abstract State Representations in Embodied Sequence Modeling [24.827284626429964]
Sequence modeling aims to mimic the success of language modeling by treating actions as tokens to be predicted.
We show that environmental layouts can be reasonably reconstructed from the internal activations of a trained model.
Our results support an optimistic outlook for applications of sequence modeling objectives to more complex embodied decision-making domains.
arXiv Detail & Related papers (2023-11-03T18:00:59Z)
- Recurrent Neural Language Models as Probabilistic Finite-state Automata [66.23172872811594]
We study what classes of probability distributions RNN LMs can represent.
We show that simple RNNs are equivalent to a subclass of probabilistic finite-state automata.
These results present a first step towards characterizing the classes of distributions RNN LMs can represent.
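As a toy illustration of one direction of this equivalence (the automaton below is invented for exposition, not taken from the paper), a probabilistic FSA's state can be kept as a one-hot hidden vector and its next-symbol distribution read off with a linear map, which is exactly the shape of a simple RNN step:

```python
# Toy probabilistic FSA over the alphabet {a, b} with two states, run as a
# "simple RNN": one-hot hidden state, linear readout of symbol probabilities.
# The automaton is a hypothetical example chosen to illustrate the construction.
import numpy as np

emit = np.array([[0.9, 0.1],    # state 0: P(a)=0.9, P(b)=0.1
                 [0.2, 0.8]])   # state 1: P(a)=0.2, P(b)=0.8
next_state = {0: {0: 0, 1: 1},  # from state 0: 'a' stays, 'b' moves to state 1
              1: {0: 0, 1: 1}}  # from state 1: 'a' returns to 0, 'b' stays

rng = np.random.default_rng(0)
h = np.eye(2)[0]                # one-hot RNN hidden state = FSA state 0
seq = []
for _ in range(12):
    probs = h @ emit            # linear readout: next-symbol distribution
    sym = rng.choice(2, p=probs)
    seq.append("ab"[sym])
    h = np.eye(2)[next_state[int(h.argmax())][sym]]  # transition = hidden update
print("".join(seq))
```

The nontrivial content of the paper is the converse characterization, i.e., which probabilistic FSAs simple RNN LMs can and cannot represent; the sketch only shows that an RNN-shaped update can simulate a given deterministic automaton.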
arXiv Detail & Related papers (2023-10-08T13:36:05Z)
- Structured Thoughts Automaton: First Formalized Execution Model for Auto-Regressive Language Models [0.0]
We introduce a new algorithm for sampling the predictions of LMs, which we use to build a reliable and inspectable execution model.
We introduce a low-level language for writing "cognitive programs" for this execution model.
arXiv Detail & Related papers (2023-06-16T22:04:50Z)
- Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task [75.35278593566068]
Language models show a surprising range of capabilities, but the source of their apparent competence is unclear.
Do these networks just memorize a collection of surface statistics, or do they rely on internal representations of the process that generates the sequences they see?
We investigate this question by applying a variant of the GPT model to the task of predicting legal moves in a simple board game, Othello.
arXiv Detail & Related papers (2022-10-24T16:29:55Z)
- Linear-Time Verification of Data-Aware Dynamic Systems with Arithmetic [8.914271888521652]
We introduce a new semantic property of "finite summary", which guarantees the existence of a faithful finite-state abstraction.
Several decidability conditions studied in formal methods and database theory can be seen as concrete, checkable instances of this property.
Our results allow us to analyze systems that were out of reach in earlier approaches.
arXiv Detail & Related papers (2022-03-15T15:14:25Z)
- High Fidelity Visualization of What Your Self-Supervised Representation Knows About [22.982471878833362]
In this work, we showcase the use of a conditional diffusion-based generative model (RCDM) to visualize representations learned with self-supervised models.
We demonstrate how this model's generation quality is on par with state-of-the-art generative models while being faithful to the representation used as conditioning.
arXiv Detail & Related papers (2021-12-16T19:23:33Z)
- Self-Supervised Models are Continual Learners [79.70541692930108]
We show that self-supervised loss functions can be seamlessly converted into distillation mechanisms for continual learning.
We devise a framework for continual self-supervised visual representation learning that significantly improves the quality of the learned representations.
arXiv Detail & Related papers (2021-12-08T10:39:13Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
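For context, the CfC step replaces the ODE solve with a closed-form blend of two learned branches, gated by a learned, time-dependent sigmoid. A minimal NumPy rendition under that reading follows; dimensions, weights, and inputs are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of a Closed-form Continuous-depth (CfC) cell:
#   x(t) = sigmoid(-f(x,I)*t) * g(x,I) + (1 - sigmoid(-f(x,I)*t)) * h(x,I)
# Sizes and weights here are illustrative assumptions, not the authors' code.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
D_IN, D_HID = 3, 4
# Three small learned heads f, g, h over the concatenated [state, input].
Wf, Wg, Wh = (rng.normal(scale=0.5, size=(D_HID, D_HID + D_IN)) for _ in range(3))

def cfc_cell(x, u, t):
    """One CfC step: a closed-form, time-gated blend instead of an ODE solve."""
    z = np.concatenate([x, u])
    f, g, h = np.tanh(Wf @ z), np.tanh(Wg @ z), np.tanh(Wh @ z)
    gate = sigmoid(-f * t)              # gate depends on the elapsed time t
    return gate * g + (1.0 - gate) * h

x = np.zeros(D_HID)
for step, t in enumerate([0.1, 0.5, 1.0]):   # irregular time gaps
    x = cfc_cell(x, rng.normal(size=D_IN), t)
    print(step, x.round(3))
```

Because each step is a direct function of the elapsed time t, irregularly sampled inputs are handled without invoking a numerical solver, which is where the claimed order-of-magnitude speedup comes from.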
arXiv Detail & Related papers (2021-06-25T22:08:51Z)