Fine-Grained Attention for Weakly Supervised Object Localization
- URL: http://arxiv.org/abs/2104.04952v1
- Date: Sun, 11 Apr 2021 08:14:05 GMT
- Title: Fine-Grained Attention for Weakly Supervised Object Localization
- Authors: Junghyo Sohn, Eunjin Jeon, Wonsik Jung, Eunsong Kang, Heung-Il Suk
- Abstract summary: We propose a novel residual fine-grained attention (RFGA) module that autonomously excites the less activated regions of an object.
We devise a series of mechanisms of triple-view attention representation, attention expansion, and feature calibration.
We validated the superiority of our proposed RFGA module by comparing it with recent methods from the literature on three datasets.
- Score: 1.490944787606832
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although recent advances in deep learning have accelerated progress on the
weakly supervised object localization (WSOL) task, identifying the entire body
of an object, rather than only its discriminative parts, remains challenging.
In this paper, we propose a novel residual fine-grained attention (RFGA)
module that autonomously excites the less activated regions of an object by
utilizing information distributed over channels and locations within feature
maps, in combination with a residual operation. Specifically, we devise a
series of mechanisms for triple-view attention representation, attention
expansion, and feature calibration. Unlike other attention-based WSOL methods
that learn a coarse attention map, with the same value shared across elements in
the feature maps, our proposed RFGA learns fine-grained values in the attention map
by assigning a different attention value to each element. We validated
the superiority of our proposed RFGA module by comparing it with recent
methods from the literature on three datasets. Further, we analyzed the effect
of each mechanism in RFGA and visualized the attention maps to gain insights.
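As a rough illustration only (not the authors' implementation), the contrast the abstract draws can be sketched numerically: coarse attention broadcasts one value per channel across all locations, whereas fine-grained attention assigns a distinct value to every element of the feature map, and the residual operation excites weakly activated regions rather than suppressing them. All function and variable names below are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coarse_channel_attention(x, w):
    # Coarse attention: one value per channel, broadcast over all
    # spatial locations. x: (C, H, W) feature map, w: (C,) logits.
    a = sigmoid(w)[:, None, None]   # shape (C, 1, 1)
    return x * a

def fine_grained_residual_attention(x, logits):
    # Fine-grained attention: a distinct value for every element
    # (channel AND location), applied with a residual connection so
    # less activated regions are amplified, not zeroed out.
    a = sigmoid(logits)             # shape (C, H, W), values in (0, 1)
    return x + x * a                # residual operation: x * (1 + a)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))        # toy feature map
logits = rng.standard_normal((4, 8, 8))   # toy per-element logits

y = fine_grained_residual_attention(x, logits)
print(y.shape)  # (4, 8, 8)
```

Because the residual multiplier `1 + a` is always positive, the sign of every activation is preserved while its magnitude is scaled between 1x and 2x, which is one simple way to "excite" under-attended regions.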
Related papers
- RefAM: Attention Magnets for Zero-Shot Referral Segmentation [103.98022860792504]
We introduce a new method that exploits features (attention scores) from diffusion transformers for downstream tasks.
A key insight is that stop words act as attention magnets.
We propose an attention redistribution strategy in which appended stop words partition background activations into smaller clusters.
arXiv Detail & Related papers (2025-09-26T17:59:57Z)
- Threshold Attention Network for Semantic Segmentation of Remote Sensing Images [3.5449012582104795]
The self-attention mechanism (SA) is an effective approach for designing segmentation networks.
We propose a novel threshold attention mechanism (TAM) for semantic segmentation.
Based on TAM, we present a threshold attention network (TANet) for semantic segmentation.
arXiv Detail & Related papers (2025-01-14T10:09:55Z)
- Point Cloud Understanding via Attention-Driven Contrastive Learning [64.65145700121442]
Transformer-based models have advanced point cloud understanding by leveraging self-attention mechanisms.
PointACL is an attention-driven contrastive learning framework designed to address these limitations.
Our method employs an attention-driven dynamic masking strategy that guides the model to focus on under-attended regions.
arXiv Detail & Related papers (2024-11-22T05:41:00Z) - iSeg: An Iterative Refinement-based Framework for Training-free Segmentation [85.58324416386375]
We present an in-depth experimental analysis of iteratively refining the cross-attention map with the self-attention map.
We propose an effective iterative refinement framework for training-free segmentation, named iSeg.
Our proposed iSeg achieves an absolute gain of 3.8% mIoU over the best existing training-free approach in the literature.
arXiv Detail & Related papers (2024-09-05T03:07:26Z) - PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming increasingly common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called the Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- Attributes Grouping and Mining Hashing for Fine-Grained Image Retrieval [24.8065557159198]
We propose an Attributes Grouping and Mining Hashing (AGMH) method for fine-grained image retrieval.
AGMH groups and embeds the category-specific visual attributes in multiple descriptors to generate a comprehensive feature representation.
AGMH consistently yields the best performance against state-of-the-art methods on fine-grained benchmark datasets.
arXiv Detail & Related papers (2023-11-10T14:01:56Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art performance.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- Improve the Interpretability of Attention: A Fast, Accurate, and Interpretable High-Resolution Attention Model [6.906621279967867]
We propose a novel Bilinear Representative Non-Parametric Attention (BR-NPA) strategy that captures the task-relevant human-interpretable information.
The proposed model can be easily adapted in a wide variety of modern deep models, where classification is involved.
It is also more accurate, faster, and with a smaller memory footprint than usual neural attention modules.
arXiv Detail & Related papers (2021-06-04T15:57:37Z)
- PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, that deals with these challenges through two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
- Attention-based Assisted Excitation for Salient Object Detection [3.238929552408813]
We introduce a mechanism for modifying activations in the feature maps of CNNs, inspired by object-based attention in the brain.
As in the brain, we use this idea to address two challenges in salient object detection: gathering the interior parts of an object while segregating it from the background with concise boundaries.
We implement object-based attention in the U-Net model using different architectures in the encoder, including AlexNet, VGG, and ResNet.
arXiv Detail & Related papers (2020-03-31T13:33:33Z)
- Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification [98.7585431239291]
Video-based person re-identification aims at matching the same person across video clips.
In this paper, we propose an attentive feature aggregation module, namely the Multi-Granularity Reference-aided Attentive Feature Aggregation module (MG-RAFA).
Our framework achieves state-of-the-art performance on three benchmark datasets.
arXiv Detail & Related papers (2020-03-27T03:49:21Z)
- Hybrid Multiple Attention Network for Semantic Segmentation in Aerial Images [24.35779077001839]
We propose a novel attention-based framework named Hybrid Multiple Attention Network (HMANet) to adaptively capture global correlations.
We introduce a simple yet effective region shuffle attention (RSA) module to reduce feature redundancy and improve the efficiency of the self-attention mechanism.
arXiv Detail & Related papers (2020-01-09T07:47:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided (including all content) and is not responsible for any consequences of its use.