Probing Information Distribution in Transformer Architectures through Entropy Analysis
- URL: http://arxiv.org/abs/2507.15347v1
- Date: Mon, 21 Jul 2025 08:01:22 GMT
- Title: Probing Information Distribution in Transformer Architectures through Entropy Analysis
- Authors: Amedeo Buonanno, Alessandro Rivetti, Francesco A. N. Palmieri, Giovanni Di Gennaro, Gianmarco Romano
- Abstract summary: This work explores entropy analysis as a tool for probing information distribution within Transformer-based architectures. We apply the methodology to a GPT-based large language model, illustrating its potential to reveal insights into model behavior and internal representations.
- Score: 39.58317527488534
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This work explores entropy analysis as a tool for probing information distribution within Transformer-based architectures. By quantifying token-level uncertainty and examining entropy patterns across different stages of processing, we aim to investigate how information is managed and transformed within these models. As a case study, we apply the methodology to a GPT-based large language model, illustrating its potential to reveal insights into model behavior and internal representations. This approach may contribute to the development of interpretability and evaluation frameworks for transformer-based models.
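To make the abstract's core measurement concrete, the sketch below computes per-layer, per-token Shannon entropy using a logit-lens style projection on GPT-2. This is a minimal illustration of entropy analysis in general, not the paper's exact procedure; the model choice, the logit-lens projection, and the averaging over positions are all assumptions introduced here for illustration.

# Hedged sketch: per-layer token entropy via a logit-lens style probe.
# Illustrative only; the paper's exact methodology may differ.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

text = "Entropy analysis probes how transformers manage information."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Project each layer's hidden states through the final layer norm and the
# unembedding matrix to obtain a vocabulary distribution per position
# (the "logit lens"), then compute Shannon entropy at each position.
for layer_idx, hidden in enumerate(outputs.hidden_states):
    logits = model.lm_head(model.transformer.ln_f(hidden))  # (1, seq, vocab)
    probs = torch.softmax(logits.float(), dim=-1)
    # Entropy in nats per token position, averaged over the sequence.
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(-1).mean()
    print(f"layer {layer_idx:2d}: mean token entropy = {entropy.item():.3f} nats")

A rising or falling entropy profile across layers gives one coarse signature of how uncertainty is resolved as information flows through the network, which is the kind of pattern the paper's analysis examines.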
Related papers
- Foundation Models and Transformers for Anomaly Detection: A Survey [2.3264194695971656]
The survey categorizes VAD methods into reconstruction-based, feature-based and zero/few-shot approaches. Transformers and foundation models enable more robust, interpretable, and scalable anomaly detection solutions.
arXiv Detail & Related papers (2025-07-21T12:01:04Z)
- Entropy-Lens: The Information Signature of Transformer Computations [14.613982627206884]
We introduce Entropy-Lens, a model-agnostic framework to interpret frozen, off-the-shelf large-scale transformers. Our results suggest that entropy-based metrics can serve as a principled tool for unveiling the inner workings of modern transformer architectures.
arXiv Detail & Related papers (2025-02-23T13:33:27Z)
- A Survey of Model Architectures in Information Retrieval [64.75808744228067]
We focus on two key aspects: backbone models for feature extraction and end-to-end system architectures for relevance estimation. We trace the development from traditional term-based methods to modern neural approaches, particularly highlighting the impact of transformer-based models and subsequent large language models (LLMs). We conclude by discussing emerging challenges and future directions, including architectural optimizations for performance and scalability, handling of multimodal, multilingual data, and adaptation to novel application domains beyond traditional search paradigms.
arXiv Detail & Related papers (2025-02-20T18:42:58Z)
- Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning [9.795934690403374]
It is still unclear which multi-step reasoning mechanisms are used by language models to solve such tasks. We employ circuit analysis and self-influence functions to evaluate the changing importance of each token throughout the reasoning process. We demonstrate that the underlying circuits reveal a human-interpretable reasoning process used by the model.
arXiv Detail & Related papers (2025-02-13T07:19:05Z)
- Transformers Use Causal World Models in Maze-Solving Tasks [49.67445252528868]
We identify World Models in transformers trained on maze-solving tasks. We find that it is easier to activate features than to suppress them. Positional encoding schemes appear to influence how World Models are structured within the model's residual stream.
arXiv Detail & Related papers (2024-12-16T15:21:04Z)
- Analyzing Deep Transformer Models for Time Series Forecasting via Manifold Learning [4.910937238451485]
Transformer models have consistently achieved remarkable results in various domains such as natural language processing and computer vision.
Despite ongoing research efforts to better understand these models, the field still lacks a comprehensive understanding.
Time series data, unlike image and text information, can be more challenging to interpret and analyze.
arXiv Detail & Related papers (2024-10-17T17:32:35Z)
- Enforcing Interpretability in Time Series Transformers: A Concept Bottleneck Framework [2.8470354623829577]
We develop a framework based on Concept Bottleneck Models to enforce interpretability of time series Transformers.
We modify the training objective to encourage a model to develop representations similar to predefined interpretable concepts.
We find that the model performance remains mostly unaffected, while the model shows much improved interpretability.
arXiv Detail & Related papers (2024-10-08T14:22:40Z)
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- Visual Analytics for Generative Transformer Models [28.251218916955125]
We present a novel visual analytical framework to support the analysis of transformer-based generative networks.
Our framework is one of the first dedicated to supporting the analysis of transformer-based encoder-decoder models.
arXiv Detail & Related papers (2023-11-21T08:15:01Z)
- Understanding Addition in Transformers [2.07180164747172]
This paper provides a comprehensive analysis of a one-layer Transformer model trained to perform n-digit integer addition.
Our findings suggest that the model dissects the task into parallel streams dedicated to individual digits, employing varied algorithms tailored to different positions within the digits.
arXiv Detail & Related papers (2023-10-19T19:34:42Z)
- Explainability in Process Outcome Prediction: Guidelines to Obtain Interpretable and Faithful Models [77.34726150561087]
We define explainability through the interpretability of the explanations and the faithfulness of the explainability model in the field of process outcome prediction.
This paper contributes a set of guidelines named X-MOP which allows selecting the appropriate model based on the event log specifications.
arXiv Detail & Related papers (2022-03-30T05:59:50Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)