Mask-Guided Feature Extraction and Augmentation for Ultra-Fine-Grained
Visual Categorization
- URL: http://arxiv.org/abs/2109.07755v1
- Date: Thu, 16 Sep 2021 06:57:05 GMT
- Title: Mask-Guided Feature Extraction and Augmentation for Ultra-Fine-Grained
Visual Categorization
- Authors: Zicheng Pan, Xiaohan Yu, Miaohua Zhang, Yongsheng Gao
- Abstract summary: Ultra-fine-grained visual categorization (Ultra-FGVC) problems have been understudied.
FGVC aims at classifying objects from the same species, while Ultra-FGVC targets the more challenging problem of classifying images at an ultra-fine granularity.
The challenges for Ultra-FGVC mainly come from two aspects: overfitting caused by the lack of training samples, and inter-class variance that is much smaller than in normal FGVC tasks.
A mask-guided feature extraction and feature augmentation method is proposed in this paper to extract discriminative and informative regions of images.
- Score: 15.627971638835948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While fine-grained visual categorization (FGVC) has advanced greatly
in recent years, ultra-fine-grained visual categorization (Ultra-FGVC) remains
understudied. FGVC aims at classifying objects from the same species (very
similar categories), while Ultra-FGVC targets the more challenging problem of
classifying images at an ultra-fine granularity where even human experts may
fail to identify the visual difference. The challenges for Ultra-FGVC mainly
come from two aspects: first, Ultra-FGVC is prone to overfitting due to the
lack of training samples; second, the inter-class variance among images is much
smaller than in normal FGVC tasks, which makes it difficult to learn
discriminative features for each class. To address these challenges, this paper
proposes a mask-guided feature extraction and feature augmentation method that
extracts discriminative and informative regions of images, which are then used
to augment the original feature map. The advantage of the proposed method is
that the feature detection and extraction model requires only a small number of
target region samples with bounding boxes for training; it can then
automatically locate the target area in a large number of images in the dataset
with high detection accuracy. Experimental results on two public datasets,
compared against ten state-of-the-art benchmark methods, consistently
demonstrate the effectiveness of the proposed method both visually and
quantitatively.
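The abstract describes locating a target region with bounding boxes and using it to augment the original feature map, but it does not give the exact rule. The sketch below is one possible reading, not the authors' implementation: a pixel-space box is rasterized into a binary mask on the feature grid, and features inside the mask are re-added to the original map. The function names `box_to_mask` and `augment_features`, the `weight` parameter, and the additive rule itself are assumptions made for illustration.

```python
from typing import List, Tuple

# A feature map indexed as [channel][row][col]; a mask indexed as [row][col].
FeatureMap = List[List[List[float]]]
Mask = List[List[float]]


def box_to_mask(box: Tuple[int, int, int, int],
                image_size: Tuple[int, int],
                grid_size: Tuple[int, int]) -> Mask:
    """Rasterize an (x0, y0, x1, y1) pixel box onto a feature grid.

    A grid cell is marked 1.0 when its centre, mapped back to pixel
    coordinates, falls inside the box (hypothetical rule).
    """
    x0, y0, x1, y1 = box
    img_h, img_w = image_size
    grid_h, grid_w = grid_size
    mask: Mask = []
    for r in range(grid_h):
        cy = (r + 0.5) * img_h / grid_h  # pixel-space centre (row)
        row = []
        for c in range(grid_w):
            cx = (c + 0.5) * img_w / grid_w  # pixel-space centre (col)
            inside = x0 <= cx < x1 and y0 <= cy < y1
            row.append(1.0 if inside else 0.0)
        mask.append(row)
    return mask


def augment_features(features: FeatureMap, mask: Mask,
                     weight: float = 1.0) -> FeatureMap:
    """Return features + weight * (features * mask), channel by channel.

    This additive emphasis is an assumed stand-in for the paper's
    augmentation step, which the abstract does not specify.
    """
    return [
        [[v + weight * v * m for v, m in zip(f_row, m_row)]
         for f_row, m_row in zip(channel, mask)]
        for channel in features
    ]


if __name__ == "__main__":
    # One channel of ones on a 4x4 grid, for an 8x8 image with a box
    # covering the top-left quadrant.
    demo_mask = box_to_mask((0, 0, 4, 4), (8, 8), (4, 4))
    demo_feats = [[[1.0] * 4 for _ in range(4)]]
    for row in augment_features(demo_feats, demo_mask)[0]:
        print(row)
```

In this reading, cells inside the detected region are amplified while the rest of the map is left unchanged, so the global context is preserved alongside the emphasized region.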
Related papers
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Salient Mask-Guided Vision Transformer for Fine-Grained Classification [48.1425692047256]
Fine-grained visual classification (FGVC) is a challenging computer vision problem.
One of its main difficulties is capturing the most discriminative inter-class variances.
We introduce a simple yet effective Salient Mask-Guided Vision Transformer (SM-ViT).
arXiv Detail & Related papers (2023-05-11T19:24:33Z)
- R2-Trans: Fine-Grained Visual Categorization with Redundancy Reduction [21.11038841356125]
Fine-grained visual categorization (FGVC) aims to discriminate similar subcategories; its main challenges are large intra-class diversity and subtle inter-class differences.
We present a novel approach for FGVC, which can simultaneously make use of partial yet sufficient discriminative information in environmental cues and also compress the redundant information in class-token with respect to the target.
arXiv Detail & Related papers (2022-04-21T13:35:38Z)
- Fine-Grained Image Analysis with Deep Learning: A Survey [146.22351342315233]
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition.
This paper attempts to re-define and broaden the field of FGIA by consolidating two fundamental fine-grained research areas -- fine-grained image recognition and fine-grained image retrieval.
arXiv Detail & Related papers (2021-11-11T09:43:56Z)
- A Compositional Feature Embedding and Similarity Metric for
Ultra-Fine-Grained Visual Categorization [16.843126268445726]
Fine-grained visual categorization (FGVC) aims at classifying objects with small inter-class variances.
This paper proposes a novel compositional feature embedding and similarity metric (CECS) for ultra-fine-grained visual categorization.
Experimental results on two ultra-FGVC datasets and one FGVC dataset with recent benchmark methods consistently demonstrate that the proposed CECS method achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-09-25T15:05:25Z)
- Enhancing Fine-Grained Classification for Low Resolution Images [97.82441158440527]
Low resolution images suffer from the inherent challenge of limited information content and the absence of fine details useful for sub-category classification.
This research proposes a novel attribute-assisted loss, which utilizes ancillary information to learn discriminative features for classification.
The proposed loss function enables a model to learn class-specific discriminative features, while incorporating attribute-level separability.
arXiv Detail & Related papers (2021-05-01T13:19:02Z)
- Image-based Automated Species Identification: Can Virtual Data
Augmentation Overcome Problems of Insufficient Sampling? [0.0]
We present a two-level data augmentation approach to automated visual species identification.
The first level of data augmentation applies classic approaches of data augmentation and generation of faked images.
The second level of data augmentation employs synthetic additional sampling in feature space by an oversampling algorithm in vector space.
arXiv Detail & Related papers (2020-10-18T15:44:45Z)
- Attention Model Enhanced Network for Classification of Breast Cancer
Image [54.83246945407568]
AMEN is formulated in a multi-branch fashion with a pixel-wise attention model and a classification submodule.
To focus more on subtle detail information, the sample image is enhanced by the pixel-wise attention map generated by the former branch.
Experiments conducted on three benchmark datasets demonstrate the superiority of the proposed method under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:44:21Z)
- Domain Adaptive Transfer Learning on Visual Attention Aware Data
Augmentation for Fine-grained Visual Categorization [3.5788754401889014]
We perform domain adaptive knowledge transfer via fine-tuning on our base network model.
We show competitive accuracy improvements by using attention-aware data augmentation techniques.
Our method achieves state-of-the-art results on multiple fine-grained classification datasets.
arXiv Detail & Related papers (2020-10-06T22:47:57Z)
- Fine-Grained Visual Classification via Progressive Multi-Granularity
Training of Jigsaw Patches [67.51747235117]
Fine-grained visual classification (FGVC) is much more challenging than traditional classification tasks.
Recent works mainly tackle this problem by focusing on how to locate the most discriminative parts.
We propose a novel framework for fine-grained visual classification to tackle these problems.
arXiv Detail & Related papers (2020-03-08T19:27:30Z)