Related papers: Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

URL: http://arxiv.org/abs/2203.14680v1
Date: Mon, 28 Mar 2022 12:26:00 GMT
Title: Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Authors: Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
Abstract summary: Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood. We make a substantial step towards unveiling this underlying prediction process, by reverse-engineering the operation of the feed-forward network (FFN) layers.
Score: 49.029910567673824
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood. In this work, we make a substantial step towards unveiling this underlying prediction process, by reverse-engineering the operation of the feed-forward network (FFN) layers, one of the building blocks of transformer models. We view the token representation as a changing distribution over the vocabulary, and the output from each FFN layer as an additive update to that distribution. Then, we analyze the FFN updates in the vocabulary space, showing that each update can be decomposed into sub-updates corresponding to single FFN parameter vectors, each promoting concepts that are often human-interpretable. We then leverage these findings for controlling LM predictions, where we reduce the toxicity of GPT2 by almost 50%, and for improving computational efficiency with a simple early exit rule, saving 20% of computation on average.
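The decomposition described in the abstract can be illustrated with a toy sketch. This is not the authors' code: the matrices `E`, `K`, `V`, the sizes, and the ReLU activation are all hypothetical stand-ins, and biases and layer normalization are omitted for simplicity. It only shows that an FFN output is a sum of per-vector sub-updates, and that projecting each sub-update with an output embedding matrix gives its contribution to the vocabulary logits.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, vocab = 8, 16, 20        # toy sizes, chosen arbitrarily

# Hypothetical random parameters standing in for a trained model.
E = rng.normal(size=(vocab, d_model))   # output embedding matrix
K = rng.normal(size=(d_ff, d_model))    # first FFN matrix (one row per hidden unit)
V = rng.normal(size=(d_ff, d_model))    # second FFN matrix (one parameter vector per row)
x = rng.normal(size=d_model)            # token representation entering the FFN layer

# Activation coefficients. ReLU is an assumption made here for simplicity.
m = np.maximum(K @ x, 0.0)

# One sub-update per FFN parameter vector: m_i * v_i, shape (d_ff, d_model).
sub_updates = m[:, None] * V
ffn_out = sub_updates.sum(axis=0)       # the FFN output is the sum of the sub-updates

# Additive update to the token representation (residual connection).
x_new = x + ffn_out

# Project into vocabulary space: each sub-update shifts the logits independently.
logits_before = E @ x
logits_after = E @ x_new
logit_shift_per_vector = sub_updates @ E.T   # shape (d_ff, vocab)

# Inspect which vocabulary ids the most active parameter vector promotes.
most_active = int(np.argmax(m))
promoted_ids = np.argsort(logit_shift_per_vector[most_active])[::-1][:3]
print("most active vector:", most_active, "promotes token ids:", promoted_ids)
```

Because the projection is linear, the total change in logits equals the sum of the per-vector shifts, which is what makes it possible to inspect the tokens promoted by each parameter vector in isolation.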

Related papers

Partially Rewriting a Transformer in Natural Language [0.7234862895932991]
We attempt to partially rewrite a large language model using simple natural language explanations. We replace the first layer of this sparse MLP with an LLM-based simulator, which predicts the activation of each neuron. We measure the degree to which these modifications distort the model's final output.
arXiv Detail & Related papers (2025-01-31T01:12:50Z)
Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts [33.677249894085186]
We explore the potential of Concept Bottleneck Models for transforming complex, non-interpretable foundation models into interpretable decision-making pipelines. Specifically, we focus on the test-time deployment of such an interpretable CBM pipeline "in the wild." Our adaptation method produces concept-based interpretations better aligned with the test data and boosts post-deployment accuracy by up to 28%.
arXiv Detail & Related papers (2024-12-18T17:47:46Z)
Making the Most of your Model: Methods for Finetuning and Applying Pretrained Transformers [0.21756081703276003]
This thesis provides methods and analysis of models which make progress on this goal. We introduce two new finetuning methods which add new capabilities to the models they are used on. We provide theoretical and empirical insights on the divergence of model-likelihood and output quality.
arXiv Detail & Related papers (2024-08-29T03:50:24Z)
Efficient Point Transformer with Dynamic Token Aggregating for Point Cloud Processing [19.73918716354272]
We propose an efficient point TransFormer with Dynamic Token Aggregating (DTA-Former) for point cloud representation and processing. It achieves SOTA performance while running up to 30$\times$ faster than prior point Transformers on ModelNet40, ShapeNet, and airborne MultiSpectral LiDAR (MS-LiDAR) datasets.
arXiv Detail & Related papers (2024-05-23T20:50:50Z)
Uncovering mesa-optimization algorithms in Transformers [61.06055590704677]
Some autoregressive models can learn as an input sequence is processed, without undergoing any parameter changes, and without being explicitly trained to do so. We show that standard next-token prediction error minimization gives rise to a subsidiary learning algorithm that adjusts the model as new inputs are revealed. Our findings explain in-context learning as a product of autoregressive loss minimization and inform the design of new optimization-based Transformer layers.
arXiv Detail & Related papers (2023-09-11T22:42:50Z)
Optimizing Non-Autoregressive Transformers with Contrastive Learning [74.46714706658517]
Non-autoregressive Transformers (NATs) reduce the inference latency of Autoregressive Transformers (ATs) by predicting words all at once rather than in sequential order. In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.
arXiv Detail & Related papers (2023-05-23T04:20:13Z)
Jump to Conclusions: Short-Cutting Transformers With Linear Transformations [60.37563766047492]
Transformer-based language models create hidden representations of their inputs at every layer, but only use final-layer representations for prediction. This obscures the internal decision-making process of the model and the utility of its intermediate representations. We suggest a simple method for such casting, using linear transformations.
arXiv Detail & Related papers (2023-03-16T16:10:16Z)
Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer [1.8594711725515678]
In Neural Machine Translation (NMT), each token prediction is conditioned on the source sentence and the target prefix. Previous work on interpretability in NMT has focused solely on source sentence tokens attributions. We propose an interpretability method that tracks complete input token attributions.
arXiv Detail & Related papers (2022-05-23T20:59:14Z)
Transkimmer: Transformer Learns to Layer-wise Skim [17.188613474427054]
One of the major computational inefficiencies of Transformer-based models is that they spend an identical amount of computation throughout all layers. We propose the Transkimmer architecture, which learns to identify hidden state tokens that are not required by each layer. The skimmed tokens are then forwarded directly to the final output, thus reducing the computation of the successive layers.
arXiv Detail & Related papers (2022-05-15T16:23:30Z)
Consistent Accelerated Inference via Confident Adaptive Transformers [29.034390810078172]
We develop a novel approach for confidently accelerating inference in large and expensive multilayer Transformers. We simultaneously increase computational efficiency while guaranteeing, with high confidence, a specifiable degree of consistency with the original model. We demonstrate the effectiveness of this approach on four classification and regression tasks.
arXiv Detail & Related papers (2021-04-18T10:22:28Z)
Bayesian Transformer Language Models for Speech Recognition [59.235405107295655]
State-of-the-art neural language models (LMs) represented by Transformers are highly complex. This paper proposes a full Bayesian learning framework for Transformer LM estimation.
arXiv Detail & Related papers (2021-02-09T10:55:27Z)
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing [112.2208052057002]
We propose Funnel-Transformer, which gradually compresses the sequence of hidden states to a shorter one. With comparable or fewer FLOPs, Funnel-Transformer outperforms the standard Transformer on a wide variety of sequence-level prediction tasks.
arXiv Detail & Related papers (2020-06-05T05:16:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.