Counterfactual Attention Learning for Fine-Grained Visual Categorization
and Re-identification
- URL: http://arxiv.org/abs/2108.08728v1
- Date: Thu, 19 Aug 2021 14:53:40 GMT
- Title: Counterfactual Attention Learning for Fine-Grained Visual Categorization
and Re-identification
- Authors: Yongming Rao, Guangyi Chen, Jiwen Lu, Jie Zhou
- Abstract summary: We present a counterfactual attention learning method to learn more effective attention based on causal inference.
Specifically, we analyze the effect of the learned visual attention on network prediction.
We evaluate our method on a wide range of fine-grained recognition tasks.
- Score: 101.49122450005869
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The attention mechanism has demonstrated great potential in fine-grained visual
recognition tasks. In this paper, we present a counterfactual attention
learning method to learn more effective attention based on causal inference.
Unlike most existing methods that learn visual attention based on conventional
likelihood, we propose to learn the attention with counterfactual causality,
which provides a tool to measure the attention quality and a powerful
supervisory signal to guide the learning process. Specifically, we analyze the
effect of the learned visual attention on network prediction through
counterfactual intervention and maximize the effect to encourage the network to
learn more useful attention for fine-grained image recognition. Empirically, we
evaluate our method on a wide range of fine-grained recognition tasks where
attention plays a crucial role, including fine-grained image categorization,
person re-identification, and vehicle re-identification. The consistent
improvement on all benchmarks demonstrates the effectiveness of our method.
Code is available at https://github.com/raoyongming/CAL
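As a rough illustration of the idea described in the abstract, the sketch below computes the prediction under the learned attention and under a counterfactual (random) attention, and adds a loss term that maximizes the effect of the learned attention, i.e. the gap between the two predictions. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation (see the repository above for that); the function name, the way random attention is sampled, and the weight `lambda_effect` are illustrative choices.

```python
import torch
import torch.nn.functional as F

def counterfactual_attention_loss(features, attention, classifier, labels,
                                  lambda_effect=1.0):
    """Encourage attention whose causal effect on the prediction is large.

    features:   (B, C, H, W) backbone feature maps
    attention:  (B, 1, H, W) learned attention maps
    classifier: callable mapping pooled (B, C) features to class logits
    labels:     (B,) ground-truth class indices
    """
    _, _, H, W = features.shape

    # Prediction under the learned ("factual") attention.
    pooled_fact = (features * attention).flatten(2).mean(-1)          # (B, C)
    logits_fact = classifier(pooled_fact)

    # Counterfactual intervention: replace the learned attention with random
    # attention that carries no image-specific information.
    random_att = torch.rand_like(attention)
    random_att = random_att / random_att.sum(dim=(2, 3), keepdim=True) * (H * W)
    pooled_cf = (features * random_att).flatten(2).mean(-1)
    logits_cf = classifier(pooled_cf)

    # Effect of the attention = change in prediction caused by the intervention.
    # Detaching the counterfactual branch (so it acts as a baseline) is one
    # reasonable choice; the official code may handle this differently.
    logits_effect = logits_fact - logits_cf.detach()

    # Classification loss plus a term that maximizes the effect: the effect
    # logits alone should already be able to predict the correct class.
    loss_cls = F.cross_entropy(logits_fact, labels)
    loss_effect = F.cross_entropy(logits_effect, labels)
    return loss_cls + lambda_effect * loss_effect
```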
Related papers
- Perturbation-based Self-supervised Attention for Attention Bias in Text
Classification [31.144857032681905]
We propose a perturbation-based self-supervised attention approach to guide attention learning.
We add as much noise as possible to all the words in a sentence without changing their semantics or the model's predictions.
Experimental results on three text classification tasks show that our approach can significantly improve the performance of current attention-based models.
arXiv Detail & Related papers (2023-05-25T03:18:18Z)
- Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification [19.957957963417414]
We propose a dual cross-attention learning (DCAL) algorithm to coordinate with self-attention learning.
First, we propose global-local cross-attention (GLCA) to enhance the interactions between global images and local high-response regions.
Second, we propose pair-wise cross-attention (PWCA) to establish the interactions between image pairs.
arXiv Detail & Related papers (2022-05-04T16:14:26Z)
- Attention in Reasoning: Dataset, Analysis, and Modeling [31.3104693230952]
We propose an Attention with Reasoning capability (AiR) framework that uses attention to understand and improve the process leading to task outcomes.
We first define an evaluation metric based on a sequence of atomic reasoning operations, enabling a quantitative measurement of attention.
We then collect human eye-tracking and answer correctness data, and analyze various machine and human attention mechanisms on their reasoning capability.
arXiv Detail & Related papers (2022-04-20T20:32:31Z)
- Attention Mechanisms in Computer Vision: A Survey [75.6074182122423]
We provide a comprehensive review of various attention mechanisms in computer vision.
We categorize them according to approach, such as channel attention, spatial attention, temporal attention and branch attention.
We suggest future directions for attention mechanism research.
arXiv Detail & Related papers (2021-11-15T09:18:40Z)
- Alignment Attention by Matching Key and Query Distributions [48.93793773929006]
This paper introduces alignment attention that explicitly encourages self-attention to match the distributions of the key and query within each head.
It is simple to convert any models with self-attention, including pre-trained ones, to the proposed alignment attention.
On a variety of language understanding tasks, we show the effectiveness of our method in accuracy, uncertainty estimation, generalization across domains, and robustness to adversarial attacks.
arXiv Detail & Related papers (2021-10-25T00:54:57Z)
- Unlocking Pixels for Reinforcement Learning via Implicit Attention [61.666538764049854]
We make use of new efficient attention algorithms, recently shown to be highly effective for Transformers.
This allows our attention-based controllers to scale to larger visual inputs, and facilitate the use of smaller patches.
In addition, we propose a new efficient algorithm approximating softmax attention with what we call hybrid random features.
arXiv Detail & Related papers (2021-02-08T17:00:26Z)
- Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations [183.03278932562438]
This paper presents an effective approach that adds spatial information to the encoding stage to alleviate the learning inconsistency between the contrastive objective and strong data augmentation operations.
We show that our approach achieves higher efficiency in visual representations and thus delivers a key message to inspire future research on self-supervised visual representation learning.
arXiv Detail & Related papers (2020-11-19T16:26:25Z)
- Deep Reinforced Attention Learning for Quality-Aware Visual Recognition [73.15276998621582]
We build upon the weakly-supervised generation mechanism of intermediate attention maps in any convolutional neural network.
We introduce a meta critic network to evaluate the quality of attention maps in the main network.
arXiv Detail & Related papers (2020-07-13T02:44:38Z)
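For the last entry above, a minimal sketch of a critic that scores intermediate attention maps is given below. The architecture, the input shape, and the name `AttentionCritic` are illustrative assumptions, not the paper's design, and the reinforcement-learning procedure the paper uses to train and exploit such a critic is omitted.

```python
import torch
import torch.nn as nn

class AttentionCritic(nn.Module):
    """Hypothetical critic that maps an attention map to a scalar quality score."""

    def __init__(self, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, hidden_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, attention_map):          # (B, 1, H, W) -> (B, 1)
        return self.net(attention_map)

# The score could act as a reward or weight for the attention generator; here
# the critic is only run on random input to show the expected shapes.
critic = AttentionCritic()
quality = critic(torch.rand(8, 1, 14, 14))     # -> (8, 1)
```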
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.