Channel Interaction Networks for Fine-Grained Image Categorization
- URL: http://arxiv.org/abs/2003.05235v1
- Date: Wed, 11 Mar 2020 11:51:51 GMT
- Title: Channel Interaction Networks for Fine-Grained Image Categorization
- Authors: Yu Gao, Xintong Han, Xun Wang, Weilin Huang, Matthew R. Scott
- Abstract summary: Fine-grained image categorization is challenging due to the subtle inter-class differences.
We propose a channel interaction network (CIN), which models the channel-wise interplay both within an image and across images.
Our model can be trained efficiently in an end-to-end fashion without the need for multi-stage training and testing.
- Score: 61.095320862647476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-grained image categorization is challenging due to the subtle
inter-class differences. We posit that exploiting the rich relationships between
channels can help capture such differences since different channels correspond
to different semantics. In this paper, we propose a channel interaction network
(CIN), which models the channel-wise interplay both within an image and across
images. For a single image, a self-channel interaction (SCI) module is proposed
to explore channel-wise correlation within the image. This allows the model to
learn the complementary features from the correlated channels, yielding
stronger fine-grained features. Furthermore, given an image pair, we introduce
a contrastive channel interaction (CCI) module to model the cross-sample
channel interaction with a metric learning framework, allowing the CIN to
distinguish the subtle visual differences between images. Our model can be
trained efficiently in an end-to-end fashion without the need for multi-stage
training and testing. Finally, comprehensive experiments are conducted on three
publicly available benchmarks, where the proposed method consistently
outperforms the state-of-the-art approaches, such as DFL-CNN (Wang, Morariu, and
Davis 2018) and NTS (Yang et al. 2018).
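The two modules described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the softmax over negated channel correlations, the residual sum, the absolute-difference weighting for image pairs, and all function names below are assumptions made for illustration only. The metric-learning loss mentioned in the abstract is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_weights(feat):
    """Map a (C, H, W) feature map to a (C, C) channel interaction matrix.

    Assumption: correlations are negated before the softmax so that each
    channel puts more weight on channels it is weakly correlated with
    (i.e. complementary ones).
    """
    c = feat.shape[0]
    x = feat.reshape(c, -1)          # flatten spatial dims: (C, H*W)
    corr = x @ x.T                   # channel-wise correlation: (C, C)
    return softmax(-corr, axis=-1)   # each row sums to 1


def self_channel_interaction(feat):
    """SCI-style sketch: mix channels within one image, then add a residual."""
    c, h, w = feat.shape
    x = feat.reshape(c, -1)
    y = channel_weights(feat) @ x    # features gathered from other channels
    return (x + y).reshape(c, h, w)  # assumed residual combination


def contrastive_channel_interaction(feat_a, feat_b):
    """CCI-style sketch: reweight each image by where the pair's weights differ."""
    c, h, w = feat_a.shape
    # Assumed contrast: absolute difference of the two interaction matrices.
    w_ab = np.abs(channel_weights(feat_a) - channel_weights(feat_b))
    xa = feat_a.reshape(c, -1)
    xb = feat_b.reshape(c, -1)
    za = (xa + w_ab @ xa).reshape(c, h, w)
    zb = (xb + w_ab @ xb).reshape(c, h, w)
    return za, zb
```

Under these assumptions, passing the same feature map twice to `contrastive_channel_interaction` returns it unchanged, because the contrast matrix is all zeros; only channels whose interaction patterns differ between the two images contribute.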
Related papers
- Efficient Multi-Scale Attention Module with Cross-Spatial Learning [4.046170185945849]
A novel efficient multi-scale attention (EMA) module is proposed.
We focus on retaining per-channel information and decreasing the computational overhead.
We conduct extensive ablation studies and experiments on image classification and object detection tasks.
arXiv Detail & Related papers (2023-05-23T00:35:47Z)
- Interpreting Class Conditional GANs with Channel Awareness [57.01413866290279]
We investigate how a class conditional generator unifies the synthesis of multiple classes.
To describe such a phenomenon, we propose channel awareness, which quantitatively characterizes how a single channel contributes to the final synthesis.
Our algorithm enables several novel applications with conditional GANs.
arXiv Detail & Related papers (2022-03-21T17:53:22Z)
- Multi-Scale Feature Fusion: Learning Better Semantic Segmentation for Road Pothole Detection [9.356003255288417]
This paper presents a novel pothole detection approach based on single-modal semantic segmentation.
It first extracts visual features from input images using a convolutional neural network.
A channel attention module then reweighs the channel features to enhance the consistency of different feature maps.
arXiv Detail & Related papers (2021-12-24T15:07:47Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Relational Embedding for Few-Shot Classification [32.12002195421671]
We propose to address the problem of few-shot classification by meta-learning "what to observe" and "where to attend" in a relational perspective.
Our method leverages patterns within and between images via self-correlational representation (SCR) and cross-correlational attention (CCA).
Our Relational Embedding Network (RENet) combines the two relational modules to learn relational embedding in an end-to-end manner.
arXiv Detail & Related papers (2021-08-22T08:44:55Z)
- Speaker embeddings by modeling channel-wise correlations [16.263418635038747]
We propose an alternative pooling method, where pairwise correlations between channels for given frequencies are used as statistics.
The method is inspired by style-transfer methods in computer vision, where the style of an image, modeled by the matrix of channel-wise correlations, is transferred to another image.
By drawing analogies between image style and speaker characteristics, and between image content and phonetic sequence, we explore the use of such channel-wise correlation features to train a ResNet architecture.
arXiv Detail & Related papers (2021-04-06T15:10:14Z)
- Progressive Co-Attention Network for Fine-grained Visual Classification [20.838908090777885]
Fine-grained visual classification aims to recognize images belonging to multiple sub-categories within the same category.
Most existing methods take only an individual image as input.
We propose an effective method called progressive co-attention network (PCA-Net) to tackle this problem.
arXiv Detail & Related papers (2021-01-21T10:19:02Z)
- Channel-wise Knowledge Distillation for Dense Prediction [73.99057249472735]
We propose to align features channel-wise between the student and teacher networks.
We consistently achieve superior performance on three benchmarks with various network structures.
arXiv Detail & Related papers (2020-11-26T12:00:38Z)
- Dual Attention GANs for Semantic Image Synthesis [101.36015877815537]
We propose a novel Dual Attention GAN (DAGAN) to synthesize photo-realistic and semantically-consistent images.
We also propose two novel modules, i.e., position-wise Spatial Attention Module (SAM) and scale-wise Channel Attention Module (CAM).
DAGAN achieves remarkably better results than state-of-the-art methods, while using fewer model parameters.
arXiv Detail & Related papers (2020-08-29T17:49:01Z)
- Single Image Super-Resolution via a Holistic Attention Network [87.42409213909269]
We propose a new holistic attention network (HAN) to model the holistic interdependencies among layers, channels, and positions.
The proposed HAN adaptively emphasizes hierarchical features by considering correlations among layers.
Experiments demonstrate that the proposed HAN performs favorably against the state-of-the-art single image super-resolution approaches.
arXiv Detail & Related papers (2020-08-20T04:13:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.