SR-GNN: Spatial Relation-aware Graph Neural Network for Fine-Grained
Image Categorization
- URL: http://arxiv.org/abs/2209.02109v1
- Date: Mon, 5 Sep 2022 19:43:15 GMT
- Authors: Asish Bera and Zachary Wharton and Yonghuai Liu and Nik Bessis and
Ardhendu Behera
- Abstract summary: We propose a method that captures subtle changes by aggregating context-aware features from most relevant image-regions.
Our approach is inspired by the recent advancement in self-attention and graph neural networks (GNNs).
It outperforms the state-of-the-art approaches by a significant margin in recognition accuracy.
- Score: 24.286426387100423
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Over the past few years, significant progress has been made in deep
convolutional neural networks (CNNs)-based image recognition. This is mainly
due to the strong ability of such networks in mining discriminative object pose
and parts information from texture and shape. This is often inappropriate for
fine-grained visual classification (FGVC) since it exhibits high intra-class
and low inter-class variances due to occlusions, deformation, illuminations,
etc. Thus, an expressive feature representation describing global structural
information is key to characterizing an object/scene. To this end, we propose
a method that effectively captures subtle changes by aggregating context-aware
features from most relevant image-regions and their importance in
discriminating fine-grained categories avoiding the bounding-box and/or
distinguishable part annotations. Our approach is inspired by the recent
advancement in self-attention and graph neural networks (GNNs) approaches to
include a simple yet effective relation-aware feature transformation and its
refinement using a context-aware attention mechanism to boost the
discriminability of the transformed feature in an end-to-end learning process.
Our model is evaluated on eight benchmark datasets consisting of fine-grained
objects and human-object interactions. It outperforms the state-of-the-art
approaches by a significant margin in recognition accuracy.
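The abstract describes two stages: a relation-aware transformation of region features, followed by attention-based refinement that weights each region's importance. The sketch below illustrates that general idea with plain NumPy. It is not the authors' implementation: the function names, the use of scaled dot-product scores as soft graph edges, the single propagation step, and the weight matrices `w_msg` and `w_att` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def relation_aware_pool(regions, w_msg, w_att):
    """Hypothetical sketch: graph propagation over regions, then attention pooling.

    regions: (N, D) array, one feature vector per image region.
    w_msg:   (D, H) projection applied after neighbour aggregation (assumed).
    w_att:   (H,)   vector scoring the importance of each region (assumed).
    Returns the pooled (H,) descriptor and the (N,) region weights.
    """
    n_regions, dim = regions.shape
    # Scaled dot-product scores act as soft edge weights of a fully
    # connected region graph; each row sums to 1.
    adjacency = softmax(regions @ regions.T / np.sqrt(dim), axis=-1)
    # One propagation step: every region aggregates its neighbours,
    # then a linear projection and ReLU transform the result.
    hidden = np.maximum(adjacency @ regions @ w_msg, 0.0)
    # Attention over nodes: one non-negative weight per region, summing to 1.
    alpha = softmax(hidden @ w_att, axis=0)
    # Weighted sum of the transformed region features.
    return alpha @ hidden, alpha
```

With, say, six regions of dimension eight and a hidden size of four, the function returns a four-dimensional descriptor and six region weights that sum to one. In a trained model the two weight arrays would be learned end-to-end together with the backbone; here they are just inputs.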
Related papers
- Point Cloud Understanding via Attention-Driven Contrastive Learning [64.65145700121442]
Transformer-based models have advanced point cloud understanding by leveraging self-attention mechanisms.
PointACL is an attention-driven contrastive learning framework designed to address these limitations.
Our method employs an attention-driven dynamic masking strategy that guides the model to focus on under-attended regions.
arXiv Detail & Related papers (2024-11-22T05:41:00Z)
- DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects [48.65846477275723]
This study proposes novel dual-current neural networks (DCNN) to improve the accuracy of fine-grained image classification.
The main novel design features for constructing a weakly supervised learning backbone model DCNN include (a) extracting heterogeneous data, (b) keeping the feature map resolution unchanged, (c) expanding the receptive field, and (d) fusing global representations and local features.
arXiv Detail & Related papers (2024-05-07T07:51:28Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Masked Contrastive Graph Representation Learning for Age Estimation [44.96502862249276]
This paper utilizes the property of graph representation learning in dealing with image redundancy information.
We propose a novel Masked Contrastive Graph Representation Learning (MCGRL) method for age estimation.
Experimental results on real-world face image datasets demonstrate the superiority of our proposed method over other state-of-the-art age estimation approaches.
arXiv Detail & Related papers (2023-06-16T15:53:21Z)
- A-FMI: Learning Attributions from Deep Networks via Feature Map Importance [58.708607977437794]
Gradient-based attribution methods can aid in the understanding of convolutional neural networks (CNNs).
The redundancy of attribution features and the gradient saturation problem are challenges that attribution methods still face.
We propose a new concept, feature map importance (FMI), to refine the contribution of each feature map, and a novel attribution method via FMI, to address the gradient saturation problem.
arXiv Detail & Related papers (2021-04-12T14:54:44Z)
- Learning Granularity-Aware Convolutional Neural Network for Fine-Grained Visual Classification [0.0]
We propose a novel Granularity-Aware Convolutional Neural Network (GA-CNN) that progressively explores discriminative features.
GA-CNN does not need bounding boxes/part annotations and can be trained end-to-end.
Our approach achieves state-of-the-art performances on three benchmark datasets.
arXiv Detail & Related papers (2021-03-04T02:18:07Z)
- Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation [53.49821324597837]
Weakly supervised semantic segmentation is a challenging problem that has been deeply studied in recent years.
We present a Context Decoupling Augmentation (CDA) method to change the inherent context in which the objects appear.
Extensive experiments on the PASCAL VOC 2012 dataset with several alternative network architectures demonstrate that CDA can boost various popular WSSS methods to a new state of the art by a large margin.
arXiv Detail & Related papers (2021-03-02T15:05:09Z)
- Context-aware Attentional Pooling (CAP) for Fine-grained Visual Classification [2.963101656293054]
Deep convolutional neural networks (CNNs) have shown a strong ability in mining discriminative object pose and parts information for image recognition.
We propose a novel context-aware attentional pooling (CAP) that effectively captures subtle changes via sub-pixel gradients.
We evaluate our approach using six state-of-the-art (SotA) backbone networks and eight benchmark datasets.
arXiv Detail & Related papers (2021-01-17T10:15:02Z)
- Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing an attention mechanism to characterize the importance among spatially neighboring regions, the most relevant information can be adaptively incorporated to make decisions.
Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z)
- Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets).
Inspired by the structure of the human visual system, we propose the integration of a "Ventral Network" and a "Dorsal Network".
Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-05-15T23:57:36Z)