Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow
- URL: http://arxiv.org/abs/2502.15765v1
- Date: Fri, 14 Feb 2025 19:50:58 GMT
- Title: Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow
- Authors: Behrooz Azarkhalili, Maxwell Libbrecht
- Abstract summary: Generalized Attention Flow (GAF) is a novel feature attribution method for Transformer-based models. GAF integrates attention weights, their gradients, the maximum flow problem, and the barrier method to enhance the performance of feature attributions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces Generalized Attention Flow (GAF), a novel feature attribution method for Transformer-based models to address the limitations of current approaches. By extending Attention Flow and replacing attention weights with the generalized Information Tensor, GAF integrates attention weights, their gradients, the maximum flow problem, and the barrier method to enhance the performance of feature attributions. The proposed method exhibits key theoretical properties and mitigates the shortcomings of prior techniques that rely solely on simple aggregation of attention weights. Our comprehensive benchmarking on sequence classification tasks demonstrates that a specific variant of GAF consistently outperforms state-of-the-art feature attribution methods in most evaluation settings, providing a more reliable interpretation of Transformer model outputs.
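To make the method's mechanics concrete, below is a minimal Python sketch of max-flow-based attribution, assuming the generalized Information Tensor is simply the clipped elementwise product of attention weights and their gradients averaged over heads. The helper names are hypothetical, and the paper's barrier-method optimization is omitted; this is an illustration, not the authors' implementation.

```python
import numpy as np
import networkx as nx

def information_tensor(attn, grad):
    """Clipped elementwise product of attention weights and their gradients,
    averaged over heads (a simplifying assumption; the paper's generalized
    Information Tensor is more flexible)."""
    return np.clip((attn * grad).mean(axis=1), 0.0, None)  # (layers, n, n)

def attention_flow_scores(I, target=0):
    """Attribution of input token k = maximum flow it can push through the
    layered attention graph to the target output position."""
    L, n, _ = I.shape
    G = nx.DiGraph()
    for l in range(L):
        for i in range(n):       # query position at layer l + 1
            for j in range(n):   # key position at layer l
                # Capacity of the edge carrying information from token j
                # at layer l to token i at layer l + 1.
                G.add_edge((l, j), (l + 1, i), capacity=float(I[l, i, j]))
    scores = np.array([
        nx.maximum_flow_value(G, (0, k), (L, target)) for k in range(n)
    ])
    return scores / (scores.sum() + 1e-12)
```

Here `attn` and `grad` would be arrays of shape `(layers, heads, tokens, tokens)` collected from one forward and backward pass, and the returned vector assigns each input token a normalized attribution toward the chosen output position.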
Related papers
- HOFAR: High-Order Augmentation of Flow Autoregressive Transformers [17.002793355495136]
This paper introduces a novel framework that systematically enhances flow autoregressive transformers through high-order supervision.
We provide theoretical analysis and empirical evaluation showing that our High-Order FlowAR (HOFAR) demonstrates measurable improvements in generation quality compared to baseline models.
arXiv Detail & Related papers (2025-03-11T04:29:22Z)
- Text-to-Image Rectified Flow as Plug-and-Play Priors [52.586838532560755]
Rectified flow is a novel class of generative models that enforces a linear progression from the source to the target distribution.
We show that rectified flow approaches surpass prior methods in generation quality and efficiency, requiring fewer inference steps.
Our method also displays competitive performance in image inversion and editing.
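The linear progression mentioned above is concrete enough to write down. Below is a minimal sketch of the standard rectified-flow training objective, assuming a hypothetical velocity network `model(x_t, t)`; the paper's plug-and-play prior machinery builds on top of such a pretrained model.

```python
import torch

def rectified_flow_loss(model, x0, x1):
    # x0: source samples (e.g., Gaussian noise), x1: data samples, same shape.
    # The path x_t = (1 - t) * x0 + t * x1 is a straight line, so its
    # velocity target is the constant x1 - x0.
    t = torch.rand(x0.shape[0], *([1] * (x0.dim() - 1)), device=x0.device)
    x_t = (1 - t) * x0 + t * x1
    v_target = x1 - x0
    return ((model(x_t, t.flatten()) - v_target) ** 2).mean()
```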
arXiv Detail & Related papers (2024-06-05T14:02:31Z)
- Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improves the sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, with a substantial speedup in computation compared to diffusion models.
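Guided Flows adapts classifier-free guidance to velocity fields. A minimal sketch, assuming a hypothetical conditional network `model(x, t, cond)` that accepts a null condition:

```python
import torch

def guided_velocity(model, x_t, t, cond, w=2.0):
    # Classifier-free guidance for a flow's velocity field: extrapolate
    # from the unconditional toward the conditional prediction.
    v_cond = model(x_t, t, cond)
    v_uncond = model(x_t, t, None)   # null condition is an assumption here
    return v_uncond + w * (v_cond - v_uncond)
```

The guided velocity is then integrated with any ODE solver to produce samples or plans.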
arXiv Detail & Related papers (2023-11-22T15:07:59Z)
- Boosting Summarization with Normalizing Flows and Aggressive Training [6.6242828769801285]
FlowSUM is a normalizing flows-based variational encoder-decoder framework for Transformer-based summarization.
Our approach tackles two primary challenges in variational summarization: insufficient semantic information in latent representations and posterior collapse during training.
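A common way to enrich an insufficient Gaussian variational posterior is to push it through a stack of normalizing-flow steps. As an illustration only (FlowSUM's actual flow family and aggressive training schedule are not specified here), a single planar-flow step with its tractable log-determinant:

```python
import torch
import torch.nn as nn

class PlanarFlow(nn.Module):
    """One planar-flow step z' = z + u * tanh(w^T z + b). Stacking such
    steps enriches a Gaussian posterior in a variational encoder-decoder.
    (The constraint u^T w >= -1 needed for invertibility is omitted.)"""
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):                         # z: (batch, dim)
        lin = z @ self.w + self.b                 # (batch,)
        z_new = z + self.u * torch.tanh(lin)[:, None]
        # log|det J| = log|1 + u^T psi(z)|, psi(z) = tanh'(lin) * w
        psi = (1 - torch.tanh(lin) ** 2)[:, None] * self.w
        log_det = torch.log((1 + psi @ self.u).abs() + 1e-8)
        return z_new, log_det
```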
arXiv Detail & Related papers (2023-11-01T15:33:38Z)
- GAFlow: Incorporating Gaussian Attention into Optical Flow [62.646389181507764]
We incorporate Gaussian Attention (GA) into optical flow models to accentuate local properties during representation learning.
We introduce a novel Gaussian-Constrained Layer (GCL) which can be easily plugged into existing Transformer blocks.
For reliable motion analysis, we provide a new Gaussian-Guided Attention Module (GGAM).
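The underlying idea of constraining attention toward local structure with a Gaussian can be sketched as an additive log-Gaussian bias on the attention logits; this simplification is illustrative and is not the paper's GCL or GGAM:

```python
import torch

def gaussian_biased_attention(q, k, v, sigma=2.0):
    # Dot-product attention plus a bias equal to the log of a Gaussian
    # kernel over positional distance, favoring nearby positions.
    n, d = q.shape[-2], q.shape[-1]
    pos = torch.arange(n, dtype=q.dtype, device=q.device)
    dist2 = (pos[:, None] - pos[None, :]) ** 2
    bias = -dist2 / (2 * sigma ** 2)
    logits = q @ k.transpose(-2, -1) / d ** 0.5 + bias
    return torch.softmax(logits, dim=-1) @ v
```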
arXiv Detail & Related papers (2023-09-28T07:46:01Z)
- Flowformer: Linearizing Transformers with Conservation Flows [77.25101425464773]
Based on flow network theory, we linearize Transformers free from specific inductive biases.
By respectively conserving the incoming flow of sinks for source competition and the outgoing flow of sources for sink allocation, Flow-Attention inherently generates informative attentions.
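The conservation principle can be sketched as a doubly normalized linear attention: rescale keys so every source emits unit flow, then normalize so every sink receives unit flow. Flowformer's exact competition and allocation equations differ; this only illustrates the idea:

```python
import torch

def flow_attention_sketch(q, k, v, eps=1e-6):
    # Map queries/keys to non-negative "flow capacities".
    phi_q, phi_k = torch.sigmoid(q), torch.sigmoid(k)
    # Conserve each source's outgoing flow: rescale key j so the total
    # flow it sends to all sinks equals 1.
    outgoing = phi_k @ phi_q.sum(dim=-2)[..., None] + eps    # (..., n, 1)
    k_cons = phi_k / outgoing
    # Conserve each sink's incoming flow: normalize what sink i receives.
    incoming = phi_q @ k_cons.sum(dim=-2)[..., None] + eps   # (..., n, 1)
    # Aggregate values in linear time via a (d x d_v) summary matrix.
    return (phi_q @ (k_cons.transpose(-2, -1) @ v)) / incoming
```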
arXiv Detail & Related papers (2022-02-13T08:44:10Z)
- GMFlow: Learning Optical Flow via Global Matching [124.57850500778277]
We propose a GMFlow framework for learning optical flow estimation.
It consists of three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for global feature matching, and a self-attention layer for flow propagation.
Our new framework outperforms the 32-iteration RAFT on the challenging Sintel benchmark.
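Global matching reduces to an all-pairs correlation, a softmax over the second image, and an expected-displacement readout. A minimal single-scale sketch of that middle component follows (the full framework adds Transformer feature enhancement and self-attention flow propagation):

```python
import torch

def global_matching_flow(feat0, feat1):
    # feat0, feat1: (h, w, d) feature maps of two frames.
    h, w, d = feat0.shape
    coords = torch.stack(torch.meshgrid(
        torch.arange(h, dtype=feat0.dtype),
        torch.arange(w, dtype=feat0.dtype), indexing="ij"), dim=-1)   # (h, w, 2)
    # Correlate every location in frame 0 with every location in frame 1,
    # then softmax to get a matching distribution per location.
    corr = feat0.reshape(-1, d) @ feat1.reshape(-1, d).T / d ** 0.5   # (hw, hw)
    prob = torch.softmax(corr, dim=-1)
    # Flow = expected matched coordinate minus the source coordinate.
    matched = prob @ coords.reshape(-1, 2)
    return (matched - coords.reshape(-1, 2)).reshape(h, w, 2)
```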
arXiv Detail & Related papers (2021-11-26T18:59:56Z)
- Attentive Contractive Flow with Lipschitz-constrained Self-Attention [25.84621883831624]
We introduce a novel approach called Attentive Contractive Flow (ACF).
ACF utilizes a special category of flow-based generative models: contractive flows.
We demonstrate that ACF can be introduced into a variety of state-of-the-art flow models in a plug-and-play manner.
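Contractive (invertible residual) flows require each residual branch to be a contraction. One illustrative way to approach that with self-attention is to spectrally normalize the projections and downscale the branch, as sketched below; note that standard dot-product attention is not globally Lipschitz, so ACF's constrained attention necessarily differs from this simplification:

```python
import torch
import torch.nn as nn

class LipschitzSelfAttentionBlock(nn.Module):
    """Residual self-attention with spectrally normalized projections and a
    downscaled branch (a sketch only; not ACF's exact construction)."""
    def __init__(self, dim, scale=0.9):
        super().__init__()
        sn = nn.utils.spectral_norm
        self.qkv = sn(nn.Linear(dim, 3 * dim, bias=False))
        self.out = sn(nn.Linear(dim, dim, bias=False))
        self.scale = scale   # aims to keep the branch's Lipschitz bound < 1

    def forward(self, x):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, -1)
        # x + g(x) is invertible by fixed-point iteration when g is contractive.
        return x + self.scale * self.out(attn @ v)
```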
arXiv Detail & Related papers (2021-09-24T18:02:49Z)
- Generative Flows with Invertible Attentions [135.23766216657745]
We introduce two types of invertible attention mechanisms for generative flow models.
We exploit split-based attention mechanisms to learn the attention weights and input representations on every two splits of flow feature maps.
Our method provides invertible attention modules with tractable Jacobian determinants, enabling seamless integration at any position in flow-based models.
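A split-based invertible attention step can be realized as an affine coupling whose scale and shift are produced by attention over one half of the features, which makes the Jacobian triangular and its log-determinant a simple sum. A sketch under these assumptions, not the paper's exact designs:

```python
import torch
import torch.nn as nn

class AttentionCoupling(nn.Module):
    """Affine coupling driven by attention: attend over one split to
    produce scale/shift for the other. `dim` is the size of each half."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.to_scale_shift = nn.Linear(dim, 2 * dim)

    def forward(self, x):                        # x: (batch, seq, 2 * dim)
        x1, x2 = x.chunk(2, dim=-1)
        h, _ = self.attn(x1, x1, x1)             # attention on the kept half
        s, t = self.to_scale_shift(h).chunk(2, dim=-1)
        s = torch.tanh(s)                        # stabilized log-scale
        y2 = x2 * torch.exp(s) + t
        log_det = s.sum(dim=(-2, -1))            # tractable Jacobian log-det
        return torch.cat([x1, y2], dim=-1), log_det

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=-1)
        h, _ = self.attn(y1, y1, y1)             # same attention, kept half
        s, t = self.to_scale_shift(h).chunk(2, dim=-1)
        s = torch.tanh(s)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=-1)
```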
arXiv Detail & Related papers (2021-06-07T20:43:04Z)