Related papers: Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?

URL: http://arxiv.org/abs/2205.10226v1
Date: Mon, 25 Apr 2022 08:23:13 GMT
Title: Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?
Authors: Stephanie Brandl, Oliver Eberle, Jonas Pilot, Anders Søgaard
Abstract summary: Self-attention functions in state-of-the-art NLP models often correlate with human attention. We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learned self-attention functions in state-of-the-art NLP models often correlate with human attention. We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention. We compare attention functions across two task-specific reading datasets for sentiment analysis and relation extraction. We find the predictiveness of large-scale pre-trained self-attention for human attention depends on 'what is in the tail', e.g., the syntactic nature of rare contexts. Further, we observe that task-specific fine-tuning does not increase the correlation with human task-specific reading. Through an input reduction experiment we give complementary insights on the sparsity and fidelity trade-off, showing that lower-entropy attention vectors are more faithful.

Related papers

Vision Transformer attention alignment with human visual perception in aesthetic object evaluation [0.0]
Visual attention mechanisms play a crucial role in human perception and aesthetic evaluation. Recent advances in Vision Transformers (ViTs) have demonstrated remarkable capabilities in computer vision tasks. This study investigates the correlation between human visual attention and ViT attention mechanisms when evaluating handcrafted objects.
arXiv Detail & Related papers (2025-07-23T15:47:34Z)
Using Attention Sinks to Identify and Evaluate Dormant Heads in Pretrained LLMs [77.43913758420948]
We propose a new definition for attention heads dominated by attention sinks, known as dormant attention heads. More than 4% of a model's attention heads can be zeroed while maintaining average accuracy. Dormant heads emerge early in pretraining and can transition between dormant and active states during pretraining.
arXiv Detail & Related papers (2025-04-04T19:28:23Z)
Testing the Limits of Fine-Tuning for Improving Visual Cognition in Vision Language Models [51.58859621164201]
We introduce visual stimuli and human judgments on visual cognition tasks to evaluate performance across cognitive domains. We fine-tune models on ground truth data for intuitive physics and causal reasoning. We find that task-specific fine-tuning does not contribute to robust human-like generalization to data with other visual characteristics.
arXiv Detail & Related papers (2025-02-21T18:58:30Z)
Look Hear: Gaze Prediction for Speech-directed Human Attention [49.81718760025951]
Our study focuses on the incremental prediction of attention as a person is seeing an image and hearing a referring expression. We developed the Attention in Referral Transformer model or ART, which predicts the human fixations spurred by each word in a referring expression. In our quantitative and qualitative analyses, ART not only outperforms existing methods in scanpath prediction, but also appears to capture several human attention patterns.
arXiv Detail & Related papers (2024-07-28T22:35:08Z)
Learning from Observer Gaze: Zero-Shot Attention Prediction Oriented by Human-Object Interaction Recognition [13.956664101032006]
We first collect a novel gaze fixation dataset named IG, comprising 530,000 fixation points across 740 diverse interaction categories. We then introduce the zero-shot interaction-oriented attention prediction task ZeroIA, which challenges models to predict visual cues for interactions not encountered during training. Thirdly, we present the Interactive Attention model IA, designed to emulate human observers' cognitive processes to tackle the ZeroIA problem.
arXiv Detail & Related papers (2024-05-16T09:34:57Z)
Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration. In this work, we tackle the task of reconstructing closely interactive humans from a monocular video. We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability [21.44002657362493]
We adopt a simple CNN+Transformer architecture that enables analysis of spatio-temporal attention while matching state-of-the-art (SoTA) performance on video memorability prediction. We compare model attention against human fixations through a small-scale eye-tracking study where humans perform a memory task.
arXiv Detail & Related papers (2023-11-26T05:14:06Z)
Attention cannot be an Explanation [99.37090317971312]
We ask how effective attention-based explanations are in increasing human trust in and reliance on the underlying models. We perform extensive human study experiments that aim to qualitatively and quantitatively assess the degree to which attention-based explanations are suitable. Our experimental results show that attention cannot be used as an explanation.
arXiv Detail & Related papers (2022-01-26T21:34:05Z)
Understanding top-down attention using task-oriented ablation design [0.22940141855172028]
Top-down attention allows neural networks, both artificial and biological, to focus on the information most relevant for a given task. We aim to answer this with a computational experiment based on a general framework called task-oriented ablation design. We compare the performance of two neural networks, one with top-down attention and one without.
arXiv Detail & Related papers (2021-06-08T21:01:47Z)
Is Sparse Attention more Interpretable? [52.85910570651047]
We investigate how sparsity affects our ability to use attention as an explainability tool. We find that only a weak relationship exists between inputs and co-indexed intermediate representations under sparse attention. We observe in this setting that inducing sparsity may make it less plausible that attention can be used as a tool for understanding model behavior.
arXiv Detail & Related papers (2021-06-02T11:42:56Z)
Gaze Perception in Humans and CNN-Based Model [66.89451296340809]
We compare how a CNN (convolutional neural network) based model of gaze and humans infer the locus of attention in images of real-world scenes. We show that compared to the model, humans' estimates of the locus of attention are more influenced by the context of the scene.
arXiv Detail & Related papers (2021-04-17T04:52:46Z)
SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity. Attention map visualization of a pre-trained model is one direct method for understanding the self-attention mechanism. We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide SparseBERT design.
arXiv Detail & Related papers (2021-02-25T14:13:44Z)
Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models [5.866941279460248]
We propose a visual analytics approach to understanding fine-tuning in attention-based language models. Our visualization, Attention Flows, is designed to support users in querying, tracing, and comparing attention within layers, across layers, and amongst attention heads in Transformer-based language models.
arXiv Detail & Related papers (2020-09-03T19:56:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.