A Novel Plug-in Module for Fine-Grained Visual Classification
- URL: http://arxiv.org/abs/2202.03822v1
- Date: Tue, 8 Feb 2022 12:35:58 GMT
- Title: A Novel Plug-in Module for Fine-Grained Visual Classification
- Authors: Po-Yung Chou, Cheng-Hung Lin, Wen-Chung Kao
- Abstract summary: We propose a novel plug-in module that can be integrated into many common backbones to provide strongly discriminative regions.
Experimental results show that the proposed plug-in module outperforms state-of-the-art approaches.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual classification can be divided into coarse-grained and fine-grained
classification. Coarse-grained classification deals with categories that have a
large degree of dissimilarity, such as distinguishing cats from dogs, while
fine-grained classification deals with categories that have a high degree of
similarity, such as cat species, bird species, or the makes and models of
vehicles. Unlike coarse-grained visual classification, fine-grained visual
classification often requires professional experts to label data, which makes
the data more expensive. To meet this challenge, many approaches automatically
find the most discriminative regions and use local features to obtain more
precise representations. These approaches require only image-level annotations,
thereby reducing the cost of annotation. However, most of these methods rely on
two- or multi-stage architectures and cannot be trained end-to-end. We
therefore propose a novel plug-in module that can be integrated into many
common backbones, including CNN-based and Transformer-based networks, to
provide strongly discriminative regions. The plug-in module outputs pixel-level
feature maps and fuses filtered features to enhance fine-grained visual
classification. Experimental results show that the proposed plug-in module
outperforms state-of-the-art approaches, improving accuracy to 92.77% on
CUB200-2011 and 92.83% on NABirds. We have released our source code on GitHub:
https://github.com/chou141253/FGVC-PIM.git.
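As a rough illustration of the core idea (a sketch only, not the authors' implementation; the function name, shapes, and fusion scheme below are assumptions made for illustration), the following NumPy snippet selects the k most confident pixel locations from a per-pixel prediction map and fuses their features with the global average feature:

```python
import numpy as np

def select_and_fuse(feature_map, pixel_logits, k=4):
    """Pick the k most confident pixel locations and fuse their
    features with the global average feature.

    feature_map:  (H, W, C) backbone feature map
    pixel_logits: (H, W, num_classes) per-pixel class predictions
    Returns a (2C,) fused feature vector.
    """
    h, w, c = feature_map.shape
    # Confidence of each pixel = its highest class probability (softmax)
    shifted = pixel_logits - pixel_logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    confidence = probs.max(axis=-1).reshape(-1)   # (H*W,)
    flat = feature_map.reshape(-1, c)             # (H*W, C)
    top_idx = np.argsort(confidence)[-k:]         # indices of k best pixels
    local = flat[top_idx].mean(axis=0)            # fused discriminative feature
    global_avg = flat.mean(axis=0)                # global context feature
    return np.concatenate([local, global_avg])
```

In a real model the selection would operate on learned feature maps at multiple backbone stages and the fusion would feed a classifier head; this sketch only shows the select-then-fuse pattern on fixed arrays.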
Related papers
- FAST: A Dual-tier Few-Shot Learning Paradigm for Whole Slide Image Classification [23.323845050957196]
Existing few-shot WSI classification methods utilize only a small number of fine-grained labels or weakly supervised slide labels for training.
They fail to sufficiently mine the available WSIs, severely limiting WSI classification performance.
We propose a novel and efficient dual-tier few-shot learning paradigm for WSI classification, named FAST.
arXiv Detail & Related papers (2024-09-29T14:31:52Z)
- PDiscoNet: Semantically consistent part discovery for fine-grained recognition [62.12602920807109]
We propose PDiscoNet to discover object parts by using only image-level class labels along with priors encouraging the parts to be.
Our results on CUB, CelebA, and PartImageNet show that the proposed method provides substantially better part discovery performance than previous methods.
arXiv Detail & Related papers (2023-09-06T17:19:29Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Efficient Subclass Segmentation in Medical Images [3.383033695275859]
One feasible way to reduce the cost is to annotate with coarse-grained superclass labels while using limited fine-grained annotations as a complement.
There is a lack of research on efficient learning of fine-grained subclasses in semantic segmentation tasks.
With limited subclass annotations and sufficient superclass annotations, our approach achieves accuracy comparable to a model trained with full subclass annotations.
arXiv Detail & Related papers (2023-07-01T07:39:08Z)
- Like a Good Nearest Neighbor: Practical Content Moderation and Text Classification [66.02091763340094]
Like a Good Nearest Neighbor (LaGoNN) is a modification to SetFit that introduces no learnable parameters but alters input text with information from its nearest neighbor.
LaGoNN is effective at flagging undesirable content and text classification, and improves the performance of SetFit.
arXiv Detail & Related papers (2023-02-17T15:43:29Z)
- Combining Metric Learning and Attention Heads For Accurate and Efficient Multilabel Image Classification [0.0]
We revisit two popular approaches to multilabel classification: transformer-based heads and graph processing branches that model label relations.
Although transformer-based heads are considered to achieve better results than graph-based branches, we argue that with a proper training strategy, graph-based methods can demonstrate only a small accuracy drop.
arXiv Detail & Related papers (2022-09-14T12:06:47Z)
- Meta Learning for Few-Shot One-class Classification [0.0]
We formulate the learning of meaningful features for one-class classification as a meta-learning problem.
To learn these representations, we require only multiclass data from similar tasks.
We validate our approach by adapting few-shot classification datasets to the few-shot one-class classification scenario.
arXiv Detail & Related papers (2020-09-11T11:35:28Z)
- Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z)
- Fine-Grained Visual Classification with Efficient End-to-end Localization [49.9887676289364]
We present an efficient localization module that can be fused with a classification network in an end-to-end setup.
We evaluate the new model on the three benchmark datasets CUB200-2011, Stanford Cars and FGVC-Aircraft.
arXiv Detail & Related papers (2020-05-11T14:07:06Z)
- Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification [71.96618723152487]
We introduce the Attention Pyramid Convolutional Neural Network (AP-CNN).
AP-CNN learns both high-level semantic and low-level detailed feature representation.
It can be trained end-to-end, without the need of additional bounding box/part annotations.
arXiv Detail & Related papers (2020-02-09T12:33:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.