InTraVisTo: Inside Transformer Visualisation Tool
- URL: http://arxiv.org/abs/2507.13858v1
- Date: Fri, 18 Jul 2025 12:23:47 GMT
- Title: InTraVisTo: Inside Transformer Visualisation Tool
- Authors: Nicolò Brunello, Davide Rigamonti, Andrea Sassella, Vincenzo Scotti, Mark James Carman
- Abstract summary: In this paper, we introduce a new tool, InTraVisTo, designed to enable researchers to investigate and trace the computational process that generates each token in a Transformer-based LLM. InTraVisTo provides a visualization of both the internal state of the Transformer model and the information flow between the various components across the different layers of the model.
- Score: 0.19573380763700712
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The reasoning capabilities of Large Language Models (LLMs) have increased greatly over the last few years, as have their size and complexity. Nonetheless, the use of LLMs in production remains challenging due to their unpredictable nature and discrepancies that can exist between their desired behavior and their actual model output. In this paper, we introduce a new tool, InTraVisTo (Inside Transformer Visualisation Tool), designed to enable researchers to investigate and trace the computational process that generates each token in a Transformer-based LLM. InTraVisTo provides a visualization of both the internal state of the Transformer model (by decoding token embeddings at each layer of the model) and the information flow between the various components across the different layers of the model (using a Sankey diagram). With InTraVisTo, we aim to help researchers and practitioners better understand the computations being performed within the Transformer model and thus to shed some light on internal patterns and reasoning processes employed by LLMs.
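To make the layer-wise decoding idea concrete, the sketch below applies a logit-lens-style projection of each layer's hidden state through the output embedding of a Hugging Face causal LM. It is a minimal illustration under assumed names (the "gpt2" checkpoint and an example prompt), not InTraVisTo's actual decoding pipeline, and it covers only the per-layer decoding, not the Sankey-diagram view of information flow.

```python
# Minimal logit-lens-style sketch of "decoding token embeddings at each layer",
# assuming a Hugging Face causal LM. Model name and prompt are illustrative;
# this is NOT InTraVisTo's actual implementation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM exposing hidden states and an lm_head
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# hidden_states: the embedding output plus one tensor per Transformer layer,
# each of shape (batch, seq_len, hidden_dim).
for layer_idx, hidden in enumerate(outputs.hidden_states):
    # Project the hidden state at the last position through the unembedding
    # matrix to see which token the model "has in mind" at this depth.
    # (Applying the model's final layer norm first, as the logit-lens
    # literature does, usually gives cleaner intermediate decodings.)
    logits = model.lm_head(hidden[:, -1, :])
    token_id = logits.argmax(dim=-1).item()
    print(f"layer {layer_idx:2d} -> {tokenizer.decode(token_id)!r}")
```

Printing the top decoded token per layer already shows how the model's "best guess" evolves with depth; a richer view, as in the tool, would track several candidate tokens and how they flow between components.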
Related papers
- The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer [68.71557348281007]
This paper introduces SAIL, a single transformer unified multimodal large language model (MLLM).
Unlike existing modular MLLMs, which rely on a pre-trained vision transformer (ViT), SAIL eliminates the need for a separate vision encoder.
We systematically compare SAIL's properties (including scalability, cross-modal information flow patterns, and visual representation capabilities) with those of modular MLLMs.
arXiv Detail & Related papers (2025-04-14T17:50:20Z)
- A Survey on Large Language Models from Concept to Implementation [4.219910716090213]
Recent advancements in Large Language Models (LLMs) have broadened the scope of natural language processing (NLP) applications.
This paper investigates the multifaceted applications of these models, with an emphasis on the GPT series.
This exploration focuses on the transformative impact of artificial intelligence (AI) driven tools in revolutionizing traditional tasks like coding and problem-solving.
arXiv Detail & Related papers (2024-03-27T19:35:41Z)
- Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey [18.930417261395906]
Transformer-based Large Language Models (LLMs) have been applied in diverse areas such as knowledge bases, human interfaces, and dynamic agents.
This article offers a survey of the recent advancement in Transformer-based LLM architectures aimed at enhancing the long-context capabilities of LLMs.
arXiv Detail & Related papers (2023-11-21T04:59:17Z)
- Understanding Addition in Transformers [2.07180164747172]
This paper provides a comprehensive analysis of a one-layer Transformer model trained to perform n-digit integer addition.
Our findings suggest that the model dissects the task into parallel streams dedicated to individual digits, employing varied algorithms tailored to different positions within the digits.
arXiv Detail & Related papers (2023-10-19T19:34:42Z)
- Holistically Explainable Vision Transformers [136.27303006772294]
We propose B-cos transformers, which inherently provide holistic explanations for their decisions.
Specifically, we formulate each model component - such as the multi-layer perceptrons, attention layers, and the tokenisation module - to be dynamic linear.
We apply our proposed design to Vision Transformers (ViTs) and show that the resulting models, dubbed Bcos-ViTs, are highly interpretable and perform competitively to baseline ViTs.
arXiv Detail & Related papers (2023-01-20T16:45:34Z)
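The "dynamic linear" formulation in the B-cos entry above can be illustrated with a small sketch, assuming the B-cos form |cos(x, w)|^(B-1) · (wᵀx) with unit-norm weight rows. The class name and hyper-parameters are hypothetical; this is not the authors' implementation.

```python
# Rough sketch of a B-cos-style "dynamic linear" layer: a drop-in replacement
# for nn.Linear that scales the linear response by |cos(x, w)|^(B-1).
# Illustrative only; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BcosLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, b: float = 2.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.b = b

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = F.normalize(self.weight, dim=1)          # unit-norm rows: w^T x = ||x|| * cos
        linear_out = F.linear(x, w)
        cos = linear_out / x.norm(dim=-1, keepdim=True).clamp_min(1e-6)
        # For a fixed input the layer acts as a linear map ("dynamic linear"),
        # but the |cos|^(B-1) factor suppresses directions poorly aligned
        # with the weights, which is what makes the explanations sharper.
        return cos.abs().pow(self.b - 1.0) * linear_out

layer = BcosLinear(16, 8)
print(layer(torch.randn(4, 16)).shape)  # torch.Size([4, 8])
```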
- VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers [47.581265194864585]
Internal mechanisms of vision and multimodal transformers remain largely opaque.
With the success of these transformers, it is increasingly critical to understand their inner workings.
We propose VL-InterpreT, which provides novel interactive visualizations for interpreting the attentions and hidden representations in multimodal transformers.
arXiv Detail & Related papers (2022-03-30T05:25:35Z)
- ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias [76.16156833138038]
We propose a novel Vision Transformer Advanced by Exploring intrinsic IB from convolutions, i.e., ViTAE.
ViTAE has several spatial pyramid reduction modules to downsample and embed the input image into tokens with rich multi-scale context.
In each transformer layer, ViTAE has a convolution block in parallel to the multi-head self-attention module, whose features are fused and fed into the feed-forward network.
arXiv Detail & Related papers (2021-06-07T05:31:06Z)
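The ViTAE entry above describes a convolution branch running in parallel to multi-head self-attention, with the two fused and fed to the feed-forward network. The sketch below illustrates that layout only; the dimensions, the depthwise convolution, and the fusion-by-addition are assumptions, not the paper's exact design.

```python
# Rough sketch of a Transformer block with a parallel convolution branch,
# in the spirit of the ViTAE entry above. Illustrative, not the paper's config.
import torch
import torch.nn as nn

class ParallelConvAttnBlock(nn.Module):
    def __init__(self, dim: int = 192, num_heads: int = 3, grid: int = 14):
        super().__init__()
        self.grid = grid                                   # tokens form a grid x grid map
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (batch, tokens, dim)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        # Convolution branch operates on the 2D token grid, in parallel to attention.
        b, n, d = h.shape
        conv_out = self.conv(h.transpose(1, 2).reshape(b, d, self.grid, self.grid))
        conv_out = conv_out.flatten(2).transpose(1, 2)
        fused = x + attn_out + conv_out                    # fuse the two branches (residual)
        return fused + self.ffn(self.norm2(fused))         # fused features feed the FFN

block = ParallelConvAttnBlock()
print(block(torch.randn(2, 14 * 14, 192)).shape)  # torch.Size([2, 196, 192])
```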
- Visformer: The Vision-friendly Transformer [105.52122194322592]
We propose a new architecture named Visformer, which is abbreviated from the "Vision-friendly Transformer".
With the same computational complexity, Visformer outperforms both the Transformer-based and convolution-based models in terms of ImageNet classification accuracy.
arXiv Detail & Related papers (2021-04-26T13:13:03Z)
- VisBERT: Hidden-State Visualizations for Transformers [66.86452388524886]
We present VisBERT, a tool for visualizing the contextual token representations within BERT for the task of (multi-hop) Question Answering.
VisBERT enables users to get insights about the model's internal state and to explore its inference steps or potential shortcomings.
arXiv Detail & Related papers (2020-11-09T15:37:43Z)