The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image
Classification
- URL: http://arxiv.org/abs/2002.04264v3
- Date: Tue, 10 Aug 2021 04:23:56 GMT
- Title: The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image
Classification
- Authors: Dongliang Chang, Yifeng Ding, Jiyang Xie, Ayan Kumar Bhunia, Xiaoxu
Li, Zhanyu Ma, Ming Wu, Jun Guo, Yi-Zhe Song
- Abstract summary: Key for solving fine-grained image categorization is finding discriminative local regions that correspond to subtle visual traits.
In this paper, we show it is possible to cultivate subtle details without the need for overly complicated network designs or training mechanisms.
The proposed loss function, termed as mutual-channel loss (MC-Loss), consists of two channel-specific components.
- Score: 67.79883226015824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Key for solving fine-grained image categorization is finding
discriminative local regions that correspond to subtle visual traits. Great
strides have been made, with complex networks designed specifically to learn
part-level discriminative feature representations. In this paper, we show it is possible to
cultivate subtle details without the need for overly complicated network
designs or training mechanisms -- a single loss is all it takes. The main trick
lies with how we delve into individual feature channels early on, as opposed to
the convention of starting from a consolidated feature map. The proposed loss
function, termed the mutual-channel loss (MC-Loss), consists of two
channel-specific components: a discriminality component and a diversity
component. The discriminality component forces all feature channels belonging
to the same class to be discriminative, through a novel channel-wise attention
mechanism. The diversity component additionally constrains channels so that
they become spatially mutually exclusive. The end result is therefore a
set of feature channels that each reflects different locally discriminative
regions for a specific class. The MC-Loss can be trained end-to-end, without
the need for any bounding-box/part annotations, and yields highly
discriminative regions during inference. Experimental results show our MC-Loss
when implemented on top of common base networks can achieve state-of-the-art
performance on all four fine-grained categorization datasets (CUB-Birds,
FGVC-Aircraft, Flowers-102, and Stanford-Cars). Ablation studies further
demonstrate the superiority of MC-Loss when compared with other recently
proposed general-purpose losses for visual classification, on two different
base networks. Code available at
https://github.com/dongliangchang/Mutual-Channel-Loss
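The two components described above can be sketched roughly as follows. This is a minimal numpy illustration of the ideas in the abstract, not the authors' implementation (which is linked above): the function names, the `keep_ratio` parameter, and the exact normalization are all assumptions made for illustration.

```python
import numpy as np

def _spatial_softmax(channel):
    # Softmax over the flattened spatial positions of a single channel.
    e = np.exp(channel - channel.max())
    return e / e.sum()

def diversity_score(class_channels):
    """class_channels: shape (xi, H*W), the xi feature channels assigned
    to one class. Each channel is softmax-normalized over its spatial
    positions, then the element-wise max across channels is summed.
    The score approaches xi when the channels peak at disjoint locations
    and 1 when they all peak at the same location, so a diversity loss
    term can be taken as the negative of this score (assumption)."""
    norm = np.stack([_spatial_softmax(c) for c in class_channels])
    return norm.max(axis=0).sum()

def channel_wise_attention_mask(num_channels, keep_ratio=0.5, rng=None):
    """Randomly keep a subset of the channels in a class group during
    training -- a rough stand-in for the paper's channel-wise attention:
    the surviving channels must remain discriminative on their own."""
    rng = np.random.default_rng(0) if rng is None else rng
    keep = max(1, int(num_channels * keep_ratio))
    mask = np.zeros(num_channels)
    mask[rng.choice(num_channels, size=keep, replace=False)] = 1.0
    return mask
```

For intuition: two channels that peak at disjoint spatial positions score higher than two channels that peak at the same position, which is exactly the behaviour the diversity component rewards.

```python
disjoint = np.array([[8.0, 0, 0, 0], [0, 8.0, 0, 0]])
overlap = np.array([[8.0, 0, 0, 0], [8.0, 0, 0, 0]])
diversity_score(disjoint) > diversity_score(overlap)  # True
```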
Related papers
- Interpreting Class Conditional GANs with Channel Awareness [57.01413866290279]
We investigate how a class conditional generator unifies the synthesis of multiple classes.
To describe such a phenomenon, we propose channel awareness, which quantitatively characterizes how a single channel contributes to the final synthesis.
Our algorithm enables several novel applications with conditional GANs.
arXiv Detail & Related papers (2022-03-21T17:53:22Z)
- Frequency-aware Discriminative Feature Learning Supervised by
Single-Center Loss for Face Forgery Detection [89.43987367139724]
Face forgery detection is attracting ever-increasing interest in computer vision.
Recent works have reached sound achievements, but there are still non-negligible problems.
A novel frequency-aware discriminative feature learning framework is proposed in this paper.
arXiv Detail & Related papers (2021-03-16T14:17:17Z)
- Fine-Grained Visual Classification via Simultaneously Learning of
Multi-regional Multi-grained Features [15.71408474557042]
Fine-grained visual classification is a challenging task that recognizes the sub-classes belonging to the same meta-class.
In this paper, we argue that mining multi-regional multi-grained features is precisely the key to this task.
Experimental results over four widely used fine-grained image classification datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2021-01-31T03:46:10Z)
- Progressive Co-Attention Network for Fine-grained Visual Classification [20.838908090777885]
Fine-grained visual classification aims to recognize images belonging to multiple sub-categories within the same category.
Most existing methods take only an individual image as input.
We propose an effective method called progressive co-attention network (PCA-Net) to tackle this problem.
arXiv Detail & Related papers (2021-01-21T10:19:02Z)
- Knowledge Transfer Based Fine-grained Visual Classification [19.233180617535492]
Fine-grained visual classification (FGVC) aims to distinguish the sub-classes of the same category.
Its essential solution is to mine the subtle and discriminative regions.
CNNs that employ the cross-entropy loss (CE-loss) as the loss function show poor performance on this task.
arXiv Detail & Related papers (2020-12-21T14:41:08Z)
- Channel-wise Knowledge Distillation for Dense Prediction [73.99057249472735]
We propose to align features channel-wise between the student and teacher networks.
We consistently achieve superior performance on three benchmarks with various network structures.
arXiv Detail & Related papers (2020-11-26T12:00:38Z)
- CC-Loss: Channel Correlation Loss For Image Classification [35.43152123975516]
The channel correlation loss (CC-Loss) is able to constrain the specific relations between classes and channels.
Two different backbone models trained with the proposed CC-Loss outperform state-of-the-art loss functions on three image classification datasets.
arXiv Detail & Related papers (2020-10-12T05:59:06Z)
- Universal-to-Specific Framework for Complex Action Recognition [114.78468658086572]
We propose an effective universal-to-specific (U2S) framework for complex action recognition.
The U2S framework is composed of three networks: a universal network, a category-specific network, and a mask network.
Experiments on a variety of benchmark datasets demonstrate the effectiveness of the U2S framework.
arXiv Detail & Related papers (2020-07-13T01:49:07Z)
- Adaptive feature recombination and recalibration for semantic
segmentation with Fully Convolutional Networks [57.64866581615309]
We propose recombination of features and a spatially adaptive recalibration block adapted for semantic segmentation with Fully Convolutional Networks.
Results indicate that recombination and recalibration improve the results of a competitive baseline and generalize across three different problems.
arXiv Detail & Related papers (2020-06-19T15:45:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.