The Generalization Ridge: Information Flow in Natural Language Generation
- URL: http://arxiv.org/abs/2507.05387v1
- Date: Mon, 07 Jul 2025 18:18:51 GMT
- Title: The Generalization Ridge: Information Flow in Natural Language Generation
- Authors: Ruidi Chang, Chunyuan Deng, Hanjie Chen
- Abstract summary: We show how predictive information peaks in upper-middle layers, forming a generalization ridge, before declining in final layers. These findings offer new insights into the internal mechanisms of transformers and underscore the critical role of intermediate layers in supporting generalization.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformer-based language models have achieved state-of-the-art performance in natural language generation (NLG) tasks, yet their internal mechanisms for synthesizing task-relevant information remain insufficiently understood. While prior studies suggest that intermediate layers often yield more generalizable representations than final layers, how this generalization ability emerges and propagates across layers during training remains unclear. To address this gap, we propose InfoRidge, an information-theoretic framework, to characterize how predictive information, the mutual information between hidden representations and target outputs, varies across depth. Estimating this quantity enables us to trace the flow of task-relevant information throughout the model during training. Our experiments across various models and datasets reveal a consistent non-monotonic trend: predictive information peaks in upper-middle layers, forming a generalization ridge, before declining in final layers, reflecting a transition between generalization and memorization. To further investigate this phenomenon, we introduce residual scaling coefficients, trainable scalar parameters applied to each residual block, which serve as functional probes for assessing the relative importance of individual transformer layers. These coefficients reveal that, under distribution shift, models downweight final layers and increasingly rely on ridge layers, highlighting their role in generalization. Together, these findings offer new insights into the internal mechanisms of transformers and underscore the critical role of intermediate layers in supporting generalization.
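The per-layer predictive information I(h_l; y) described in the abstract can be illustrated with a toy sketch. This is not the paper's InfoRidge estimator: the "layers" below are synthetic binary codes of the label with hand-picked noise levels, chosen so that mutual information peaks in the middle of the stack and mimics the generalization ridge. The discrete MI estimator is standard but the data is entirely invented for illustration.

```python
# Illustrative sketch (not the paper's InfoRidge implementation): estimate
# predictive information I(h_l; y) per layer by computing discrete mutual
# information between a discretized layer representation and the labels.
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Discrete mutual information I(X; Y) in bits from paired samples."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), with counts c, px, py over n
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

random.seed(0)
n_samples = 2000
labels = [random.randint(0, 1) for _ in range(n_samples)]

# Toy "layers": middle layers encode the label almost perfectly, while the
# first and last layers are noisier -- a hand-crafted generalization ridge.
noise_by_layer = [0.45, 0.25, 0.05, 0.02, 0.15, 0.35]  # label-flip probability
mi_per_layer = []
for p_flip in noise_by_layer:
    reps = [y if random.random() > p_flip else 1 - y for y in labels]
    mi_per_layer.append(mutual_information(reps, labels))

ridge_layer = max(range(len(mi_per_layer)), key=mi_per_layer.__getitem__)
print("MI per layer (bits):", [round(m, 3) for m in mi_per_layer])
print("ridge at layer", ridge_layer)
```

In the paper's setting, the discretized binary codes would be replaced by actual hidden states, and the MI estimate would require a neural estimator rather than exact counting; the non-monotonic rise-then-fall of MI across depth is the quantity the ridge refers to.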
Related papers
- Neural Network Reprogrammability: A Unified Theme on Model Reprogramming, Prompt Tuning, and Prompt Instruction [55.914891182214475]
We introduce neural network reprogrammability as a unifying framework for model adaptation.
We present a taxonomy that categorizes such information manipulation approaches across four key dimensions.
We also analyze remaining technical challenges and ethical considerations.
arXiv Detail & Related papers (2025-06-05T05:42:27Z) - Bigger Isn't Always Memorizing: Early Stopping Overparameterized Diffusion Models [51.03144354630136]
Generalization in natural data domains is progressively achieved during training before the onset of memorization.
Generalization vs. memorization is then best understood as a competition between time scales.
We show that this phenomenology is recovered in diffusion models learning a simple probabilistic context-free grammar with random rules.
arXiv Detail & Related papers (2025-05-22T17:40:08Z) - Generalization Capability for Imitation Learning [1.30536490219656]
Imitation learning holds the promise of equipping robots with versatile skills by learning from expert demonstrations.
However, policies trained on finite datasets often struggle to generalize beyond the training distribution.
We present a unified perspective on the generalization capability of imitation learning, grounded in both information theory and data distribution properties.
arXiv Detail & Related papers (2025-04-25T17:59:59Z) - Layer by Layer: Uncovering Hidden Representations in Language Models [28.304269706993942]
We show that intermediate layers can encode even richer representations, often improving performance on a range of downstream tasks.
Our framework highlights how each layer balances information compression and signal preservation, revealing why mid-depth embeddings can exceed the last layer's performance.
arXiv Detail & Related papers (2025-02-04T05:03:42Z) - Does Representation Matter? Exploring Intermediate Layers in Large Language Models [22.704926222438456]
We investigate the quality of intermediate representations in large language models (LLMs).
We find that intermediate layers often yield more informative representations for downstream tasks than the final layers.
Our results illuminate the internal mechanics of LLMs and guide strategies for architectural optimization and training.
arXiv Detail & Related papers (2024-12-12T18:48:51Z) - Interpreting Affine Recurrence Learning in GPT-style Transformers [54.01174470722201]
In-context learning allows GPT-style transformers to generalize during inference without modifying their weights.
This paper focuses specifically on their ability to learn and predict affine recurrences as an ICL task.
We analyze the model's internal operations using both empirical and theoretical approaches.
arXiv Detail & Related papers (2024-10-22T21:30:01Z) - A separability-based approach to quantifying generalization: which layer is best? [0.0]
Generalization to unseen data remains poorly understood for deep learning classification and foundation models.
We provide a new method for evaluating the capacity of networks to represent a sampled domain.
We find that (i) high classification accuracy does not imply high generalizability; and (ii) deeper layers in a model do not always generalize the best.
arXiv Detail & Related papers (2024-05-02T17:54:35Z) - Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z) - VISIT: Visualizing and Interpreting the Semantic Information Flow of Transformers [45.42482446288144]
Recent advances in interpretability suggest we can project weights and hidden states of transformer-based language models to their vocabulary.
We investigate LM attention heads and memory values, the vectors the models dynamically create and recall while processing a given input.
We create a tool to visualize a forward pass of Generative Pre-trained Transformers (GPTs) as an interactive flow graph.
arXiv Detail & Related papers (2023-05-22T19:04:56Z) - Exploring Heterogeneous Characteristics of Layers in ASR Models for More Efficient Training [1.3999481573773072]
We study the stability of these layers across runs and model sizes.
We propose that group normalization may be used without disrupting their formation.
We apply these findings to Federated Learning in order to improve the training procedure.
arXiv Detail & Related papers (2021-10-08T17:25:19Z) - Transformer Feed-Forward Layers Are Key-Value Memories [49.52087581977751]
We show that feed-forward layers in transformer-based language models operate as key-value memories.
We show that the learned patterns are human-interpretable, and that lower layers tend to capture shallow patterns, while upper layers learn more semantic ones.
arXiv Detail & Related papers (2020-12-29T19:12:05Z) - Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference [55.35176938713946]
We develop deep autoencoding topic model (DATM) that uses a hierarchy of gamma distributions to construct its multi-stochastic-layer generative network.
We propose a Weibull upward-downward variational encoder that deterministically propagates information upward via a deep neural network, followed by a downward generative model.
The efficacy and scalability of our models are demonstrated on both unsupervised and supervised learning tasks on big corpora.
arXiv Detail & Related papers (2020-06-15T22:22:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.