Attention in Reasoning: Dataset, Analysis, and Modeling
- URL: http://arxiv.org/abs/2204.09774v1
- Date: Wed, 20 Apr 2022 20:32:31 GMT
- Title: Attention in Reasoning: Dataset, Analysis, and Modeling
- Authors: Shi Chen, Ming Jiang, Jinhui Yang and Qi Zhao
- Abstract summary: We propose an Attention with Reasoning capability (AiR) framework that uses attention to understand and improve the process leading to task outcomes.
We first define an evaluation metric based on a sequence of atomic reasoning operations, enabling a quantitative measurement of attention.
We then collect human eye-tracking and answer correctness data, and analyze various machine and human attention mechanisms on their reasoning capability.
- Score: 31.3104693230952
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While attention has been an increasingly popular component in deep neural
networks to both interpret and boost the performance of models, little work has
examined how attention progresses to accomplish a task and whether it is
reasonable. In this work, we propose an Attention with Reasoning capability
(AiR) framework that uses attention to understand and improve the process
leading to task outcomes. We first define an evaluation metric based on a
sequence of atomic reasoning operations, enabling a quantitative measurement of
attention that considers the reasoning process. We then collect human
eye-tracking and answer correctness data, and analyze various machine and human
attention mechanisms on their reasoning capability and how they impact task
performance. To improve the attention and reasoning ability of visual question
answering models, we propose to supervise the learning of attention
progressively along the reasoning process and to differentiate the correct and
incorrect attention patterns. We demonstrate the effectiveness of the proposed
framework in analyzing and modeling attention with better reasoning capability
and task performance. The code and data are available at
https://github.com/szzexpoi/AiR
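
The idea of scoring attention against a sequence of atomic reasoning operations can be illustrated with a minimal sketch. This is not the paper's metric, which the abstract does not spell out. It assumes that each reasoning step comes with bounding boxes for its regions of interest, that attention maps are z-scored before comparison, and that a step is scored by its best-attended box. The function names, the operation labels (`select`, `query`), and the box format are all hypothetical.

```python
from statistics import mean, pstdev


def standardize(att):
    """Z-score a 2D attention map (list of rows) so maps are comparable."""
    flat = [v for row in att for v in row]
    mu, sigma = mean(flat), pstdev(flat)
    if sigma == 0:
        # A constant map carries no preference for any region.
        return [[0.0 for _ in row] for row in att]
    return [[(v - mu) / sigma for v in row] for row in att]


def box_score(att, box):
    """Mean attention inside box = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = box
    return mean(att[y][x] for y in range(y0, y1) for x in range(x0, x1))


def step_scores(att, steps):
    """Score each reasoning step by its best-attended region of interest.

    `steps` is a list of (operation_name, [boxes]) pairs in reasoning order
    (an assumed input format, chosen for illustration).
    """
    z = standardize(att)
    return [(op, max(box_score(z, b) for b in boxes)) for op, boxes in steps]


# Toy 4x4 attention map concentrated on the central 2x2 region.
attention = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
# Step 1 needs the centre; step 2 needs regions the map ignores.
steps = [
    ("select", [(1, 1, 3, 3)]),
    ("query", [(0, 0, 2, 1), (3, 3, 4, 4)]),
]
scores = step_scores(attention, steps)
for op, score in scores:
    print(f"{op}: {score:+.3f}")
```

Under these assumptions, a positive score means the map attends to a step's region more than to the image on average, and a negative score means it attends less. Reading the scores in step order gives a rough picture of where along the reasoning process attention goes wrong.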
Related papers
- PhD Thesis: Exploring the role of (self-)attention in cognitive and computer vision architecture [0.0]
We analyze Transformer-based self-attention as a model and extend it with memory.
We propose GAMR, a cognitive architecture combining attention and memory, inspired by active vision theory.
arXiv Detail & Related papers (2023-06-26T12:40:12Z)
- Guiding Visual Question Answering with Attention Priors [76.21671164766073]
We propose to guide the attention mechanism using explicit linguistic-visual grounding.
This grounding is derived by connecting structured linguistic concepts in the query to their referents among the visual objects.
The resultant algorithm is capable of probing attention-based reasoning models, injecting relevant associative knowledge, and regulating the core reasoning process.
arXiv Detail & Related papers (2022-05-25T09:53:47Z)
- Attention cannot be an Explanation [99.37090317971312]
We ask how effective attention-based explanations are at increasing human trust in, and reliance on, the underlying models.
We perform extensive human study experiments that aim to qualitatively and quantitatively assess the degree to which attention-based explanations are suitable.
Our experiment results show that attention cannot be used as an explanation.
arXiv Detail & Related papers (2022-01-26T21:34:05Z)
- Alignment Attention by Matching Key and Query Distributions [48.93793773929006]
This paper introduces alignment attention that explicitly encourages self-attention to match the distributions of the key and query within each head.
It is simple to convert any models with self-attention, including pre-trained ones, to the proposed alignment attention.
On a variety of language understanding tasks, we show the effectiveness of our method in accuracy, uncertainty estimation, generalization across domains, and robustness to adversarial attacks.
arXiv Detail & Related papers (2021-10-25T00:54:57Z)
- Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification [101.49122450005869]
We present a counterfactual attention learning method to learn more effective attention based on causal inference.
Specifically, we analyze the effect of the learned visual attention on network prediction.
We evaluate our method on a wide range of fine-grained recognition tasks.
arXiv Detail & Related papers (2021-08-19T14:53:40Z)
- Understanding top-down attention using task-oriented ablation design [0.22940141855172028]
Top-down attention allows neural networks, both artificial and biological, to focus on the information most relevant for a given task.
We aim to answer this with a computational experiment based on a general framework called task-oriented ablation design.
We compare the performance of two neural networks, one with top-down attention and one without.
arXiv Detail & Related papers (2021-06-08T21:01:47Z)
- SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity.
Attention map visualization of a pre-trained model is one direct method for understanding the self-attention mechanism.
We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide the design of SparseBERT.
arXiv Detail & Related papers (2021-02-25T14:13:44Z)
- AiR: Attention with Reasoning Capability [31.3104693230952]
We propose an Attention with Reasoning capability (AiR) framework that uses attention to understand and improve the process leading to task outcomes.
We first define an evaluation metric based on a sequence of atomic reasoning operations, enabling quantitative measurement of attention that considers the reasoning process.
We then collect human eye-tracking and answer correctness data, and analyze various machine and human attentions on their reasoning capability and how they impact task performance.
arXiv Detail & Related papers (2020-07-28T18:09:45Z)
- Cost-effective Interactive Attention Learning with Neural Attention Processes [79.8115563067513]
We propose a novel interactive learning framework, which we refer to as Interactive Attention Learning (IAL).
IAL is prone to overfitting due to scarcity of human annotations, and requires costly retraining.
We tackle these challenges by proposing a sample-efficient attention mechanism and a cost-effective reranking algorithm for instances and features.
arXiv Detail & Related papers (2020-06-09T17:36:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.