Improving Fine-Grained Visual Recognition in Low Data Regimes via
Self-Boosting Attention Mechanism
- URL: http://arxiv.org/abs/2208.00617v1
- Date: Mon, 1 Aug 2022 05:36:27 GMT
- Title: Improving Fine-Grained Visual Recognition in Low Data Regimes via
Self-Boosting Attention Mechanism
- Authors: Yangyang Shu, Baosheng Yu, Haiming Xu, Lingqiao Liu
- Abstract summary: Self-boosting attention mechanism (SAM) is a novel method for regularizing the network to focus on the key regions shared across samples and classes.
We also develop a variant that uses SAM to create multiple attention maps for pooling convolutional feature maps in the style of bilinear pooling.
- Score: 27.628260249895973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The challenge of fine-grained visual recognition often lies in discovering
the key discriminative regions. While such regions can be automatically
identified from a large-scale labeled dataset, a similar method might become
less effective when only a few annotations are available. In low data regimes,
a network often struggles to choose the correct regions for recognition and
tends to overfit spurious correlated patterns from the training data. To tackle
this issue, this paper proposes the self-boosting attention mechanism, a novel
method for regularizing the network to focus on the key regions shared across
samples and classes. Specifically, the proposed method first generates an
attention map for each training image, highlighting the discriminative part for
identifying the ground-truth object category. The generated attention maps are
then used as pseudo-annotations, and the network is trained to fit them as an
auxiliary task. We call this approach the self-boosting attention mechanism
(SAM). We also develop a variant by using SAM to create multiple attention maps
to pool convolutional maps in the style of bilinear pooling, dubbed SAM-Bilinear.
Through extensive experimental studies, we show that both methods can
significantly improve fine-grained visual recognition performance in low data
regimes and can be incorporated into existing network architectures. The source
code is publicly available at: https://github.com/GANPerf/SAM
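The pipeline the abstract describes, generating a class-activation-style attention map, treating it as a pseudo-annotation for an auxiliary fitting task, and using multiple attention maps to pool the convolutional features, can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation (which is in the linked repository); the function names and the CAM-style map generation are assumptions.

```python
import numpy as np

def attention_map(feat, w_cls):
    """CAM-style attention: weight the conv channels by the classifier
    weights of the ground-truth class, then rescale to [0, 1].
    feat: (C, H, W) conv feature map; w_cls: (C,) classifier weights."""
    amap = np.tensordot(w_cls, feat, axes=([0], [0]))  # -> (H, W)
    amap = np.maximum(amap, 0.0)                       # keep positive evidence
    rng = amap.max() - amap.min()
    return (amap - amap.min()) / rng if rng > 0 else np.zeros_like(amap)

def sam_aux_loss(pred_map, pseudo_map):
    """Auxiliary regression loss: fit a predicted attention map to the
    pseudo-annotation generated from the network's own activations."""
    return float(np.mean((pred_map - pseudo_map) ** 2))

def sam_bilinear_pool(feat, attn_maps):
    """SAM-Bilinear-style pooling: each attention map pools the conv
    features, and the pooled vectors are concatenated.
    attn_maps: (K, H, W); feat: (C, H, W) -> (K*C,) descriptor."""
    pooled = [(feat * a[None]).sum(axis=(1, 2)) / (a.sum() + 1e-8)
              for a in attn_maps]
    return np.concatenate(pooled)
```

In training, `sam_aux_loss` would be added to the classification loss, so gradients push the network's attention toward regions that are discriminative across samples rather than instance-specific patterns.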
Related papers
- Improving Weakly-Supervised Object Localization Using Adversarial Erasing and Pseudo Label [7.400926717561454]
This paper investigates a framework for weakly-supervised object localization.
It aims to train a neural network capable of predicting both the object class and its location using only images and their image-level class labels.
arXiv Detail & Related papers (2024-04-15T06:02:09Z)
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- Regularizing Neural Network Training via Identity-wise Discriminative Feature Suppression [20.89979858757123]
When the number of training samples is small, or the class labels are noisy, networks tend to memorize patterns specific to individual instances to minimize the training error.
This paper explores a remedy by suppressing the network's tendency to rely on instance-specific patterns for empirical error minimisation.
arXiv Detail & Related papers (2022-09-29T05:14:56Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
- Learning to Detect Instance-level Salient Objects Using Complementary Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem.
We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z)
- Clustering augmented Self-Supervised Learning: An application to Land Cover Mapping [10.720852987343896]
We introduce a new method for land cover mapping by using a clustering based pretext task for self-supervised learning.
We demonstrate the effectiveness of the method on two societally relevant applications.
arXiv Detail & Related papers (2021-08-16T19:35:43Z)
- Distribution Alignment: A Unified Framework for Long-tail Visual Recognition [52.36728157779307]
We propose a unified distribution alignment strategy for long-tail visual recognition.
We then introduce a generalized re-weight method in the two-stage learning to balance the class prior.
Our approach achieves the state-of-the-art results across all four recognition tasks with a simple and unified framework.
arXiv Detail & Related papers (2021-03-30T14:09:53Z)
- Attentive WaveBlock: Complementarity-enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-identification and Beyond [97.25179345878443]
This paper proposes a novel lightweight module, the Attentive WaveBlock (AWB).
AWB can be integrated into the dual networks of mutual learning to enhance the complementarity and further depress noise in the pseudo-labels.
Experiments demonstrate that the proposed method achieves state-of-the-art performance with significant improvements on multiple UDA person re-identification tasks.
arXiv Detail & Related papers (2020-06-11T15:40:40Z)
- Weakly-Supervised Salient Object Detection via Scribble Annotations [54.40518383782725]
We propose a weakly-supervised salient object detection model to learn saliency from scribble labels.
We present a new metric, termed saliency structure measure, to measure the structure alignment of the predicted saliency maps.
Our method not only outperforms existing weakly-supervised/unsupervised methods, but also is on par with several fully-supervised state-of-the-art models.
arXiv Detail & Related papers (2020-03-17T12:59:50Z)
- SpotNet: Self-Attention Multi-Task Network for Object Detection [11.444576186559487]
We produce foreground/background segmentation labels in a semi-supervised way, using background subtraction or optical flow.
We use those segmentation maps inside the network as a self-attention mechanism to weight the feature map used to produce the bounding boxes.
We show that by using this method, we obtain a significant mAP improvement on two traffic surveillance datasets.
arXiv Detail & Related papers (2020-02-13T14:43:24Z)
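The self-attention weighting described in the SpotNet blurb above, a segmentation prediction modulating the feature map used for detection, reduces to an elementwise product. A minimal NumPy sketch, assuming a sigmoid-gated mask; the function name is hypothetical and not from the paper:

```python
import numpy as np

def spotnet_attention(feature_map, seg_logits):
    """Self-attention weighting in the SpotNet style: a foreground/background
    segmentation prediction gates the backbone feature map elementwise.
    feature_map: (C, H, W); seg_logits: (H, W) raw segmentation scores."""
    attn = 1.0 / (1.0 + np.exp(-seg_logits))  # sigmoid -> soft mask in (0, 1)
    return feature_map * attn[None, :, :]     # broadcast the mask over channels
```

Regions the segmentation branch scores as background are attenuated toward zero, so the detection head sees features concentrated on likely foreground objects.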
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.