A Compositional Feature Embedding and Similarity Metric for
Ultra-Fine-Grained Visual Categorization
- URL: http://arxiv.org/abs/2109.12380v1
- Date: Sat, 25 Sep 2021 15:05:25 GMT
- Title: A Compositional Feature Embedding and Similarity Metric for
Ultra-Fine-Grained Visual Categorization
- Authors: Yajie Sun, Miaohua Zhang, Xiaohan Yu, Yi Liao, Yongsheng Gao
- Abstract summary: Fine-grained visual categorization (FGVC) aims at classifying objects with small inter-class variances.
This paper proposes a novel compositional feature embedding and similarity metric (CECS) for ultra-fine-grained visual categorization.
Experimental results on two ultra-FGVC datasets and one FGVC dataset against recent benchmark methods consistently demonstrate that the proposed CECS method achieves state-of-the-art performance.
- Score: 16.843126268445726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-grained visual categorization (FGVC), which aims at classifying objects
with small inter-class variances, has advanced significantly in recent years. However,
ultra-fine-grained visual categorization (ultra-FGVC), which targets identifying
subclasses with extremely similar patterns, has received much less attention. In
ultra-FGVC datasets, samples per category become scarcer as the granularity moves down,
which leads to overfitting. Moreover, the differences among categories are too subtle to
distinguish even for professional experts. Motivated by these issues, this paper proposes
a novel compositional feature embedding and similarity metric (CECS). Specifically, in
the compositional feature embedding module, we randomly select patches in the original
input image and either replace them with patches from images of different categories or
mask them out. The replaced and masked images are then used to augment the original
inputs, providing more diverse samples and thus largely alleviating the overfitting
problem caused by limited training samples. In addition, learning with diverse samples
forces the model to learn not only the most discriminative features but also informative
features in the remaining regions, enhancing the generalization and robustness of the
model. In the compositional similarity metric module, a new similarity metric is
developed to improve classification performance by narrowing the intra-category distance
and enlarging the inter-category distance. Experimental results on two ultra-FGVC
datasets and one FGVC dataset against recent benchmark methods consistently demonstrate
that the proposed CECS method achieves state-of-the-art performance.
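As a rough illustration of the compositional feature embedding module described above, the following PyTorch sketch randomly selects patches in each image of a batch and either masks them out or replaces them with same-location patches cut from an image of a different category. The function name and all parameters (patch_size, num_patches, mask_prob) are illustrative assumptions, not the authors' implementation.

```python
import torch

def compose_patches(images, labels, patch_size=32, num_patches=4, mask_prob=0.5):
    """Sketch of the compositional augmentation: random patches are either
    masked out or replaced by patches from an image of another category.
    All hyperparameters here are assumed, not taken from the paper."""
    b, c, h, w = images.shape
    augmented = images.clone()
    for i in range(b):
        # indices of images belonging to a different category (donor candidates)
        donors = (labels != labels[i]).nonzero(as_tuple=True)[0]
        for _ in range(num_patches):
            y = torch.randint(0, h - patch_size + 1, (1,)).item()
            x = torch.randint(0, w - patch_size + 1, (1,)).item()
            if torch.rand(1).item() < mask_prob or len(donors) == 0:
                # mask the selected patch out
                augmented[i, :, y:y + patch_size, x:x + patch_size] = 0.0
            else:
                # replace it with the same-location patch from a random donor
                j = donors[torch.randint(0, len(donors), (1,))].item()
                augmented[i, :, y:y + patch_size, x:x + patch_size] = \
                    images[j, :, y:y + patch_size, x:x + patch_size]
    return augmented
```

The compositional similarity metric is described only at a high level in the abstract, so the sketch below substitutes a generic contrastive-style objective that narrows intra-category distances and enlarges inter-category distances by a margin; the paper's actual formulation may differ.

```python
import torch
import torch.nn.functional as F

def compositional_similarity_loss(embeddings, labels, margin=0.5):
    """Generic stand-in for the compositional similarity metric: pull
    same-category embeddings together, push different categories at least
    `margin` apart. Assumes the batch contains both positive and negative
    pairs; the margin value is an assumption."""
    z = F.normalize(embeddings, dim=1)           # unit-norm embeddings
    dist = torch.cdist(z, z)                     # pairwise Euclidean distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    intra = dist[same & ~eye].mean()             # narrow intra-category distance
    inter = F.relu(margin - dist[~same]).mean()  # enlarge inter-category distance
    return intra + inter
```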
Related papers
- EIANet: A Novel Domain Adaptation Approach to Maximize Class Distinction with Neural Collapse Principles [15.19374752514876]
Source-free domain adaptation (SFDA) aims to transfer knowledge from a labelled source domain to an unlabelled target domain.
A major challenge in SFDA is deriving accurate categorical information for the target domain.
We introduce a novel ETF-Informed Attention Network (EIANet) to separate class prototypes.
arXiv Detail & Related papers (2024-07-23T05:31:05Z)
- Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy for bolstering image classification performance is to augment the training set with synthetic images generated by text-to-image (T2I) models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- R2-Trans: Fine-Grained Visual Categorization with Redundancy Reduction [21.11038841356125]
Fine-grained visual categorization (FGVC) aims to discriminate similar subcategories, whose main challenges are large intra-class diversity and subtle inter-class differences.
We present a novel approach for FGVC that simultaneously makes use of partial yet sufficient discriminative information in environmental cues and compresses the redundant information in the class token with respect to the target.
arXiv Detail & Related papers (2022-04-21T13:35:38Z)
- Mask-Guided Feature Extraction and Augmentation for Ultra-Fine-Grained Visual Categorization [15.627971638835948]
Ultra-fine-grained visual categorization (Ultra-FGVC) has been understudied.
FGVC aims at classifying objects of the same species, while Ultra-FGVC targets the more challenging problem of classifying images at an ultra-fine granularity.
The challenges of Ultra-FGVC mainly come from two aspects: one is that Ultra-FGVC often suffers from overfitting due to the lack of training samples.
A mask-guided feature extraction and feature augmentation method is proposed in this paper to extract discriminative and informative regions of images.
arXiv Detail & Related papers (2021-09-16T06:57:05Z)
- Enhancing Fine-Grained Classification for Low Resolution Images [97.82441158440527]
Low resolution images suffer from the inherent challenge of limited information content and the absence of fine details useful for sub-category classification.
This research proposes a novel attribute-assisted loss, which utilizes ancillary information to learn discriminative features for classification.
The proposed loss function enables a model to learn class-specific discriminative features, while incorporating attribute-level separability.
arXiv Detail & Related papers (2021-05-01T13:19:02Z)
- Attention Model Enhanced Network for Classification of Breast Cancer Image [54.83246945407568]
AMEN is formulated in a multi-branch fashion with a pixel-wise attention model and a classification submodule.
To focus more on subtle detail information, the sample image is enhanced by the pixel-wise attention map generated from the former branch.
Experiments conducted on three benchmark datasets demonstrate the superiority of the proposed method under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:44:21Z)
- TOAN: Target-Oriented Alignment Network for Fine-Grained Image Categorization with Few Labeled Samples [25.68199820110267]
We propose a Target-Oriented Alignment Network (TOAN) to investigate the fine-grained relation between the target query image and support classes.
The feature of each support image is transformed to match the query ones in the embedding feature space, which reduces the disparity explicitly within each category.
arXiv Detail & Related papers (2020-05-28T07:48:44Z)
- Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches [67.51747235117]
Fine-grained visual classification (FGVC) is much more challenging than traditional classification tasks.
Recent works mainly tackle this problem by focusing on how to locate the most discriminative parts.
We propose a novel framework for fine-grained visual classification to tackle these problems.
arXiv Detail & Related papers (2020-03-08T19:27:30Z)
- Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation [109.89213619785676]
Few-shot classification aims to recognize novel categories with only few labeled images in each class.
Existing metric-based few-shot classification algorithms predict categories by comparing the feature embeddings of query images with those from a few labeled images.
While promising performance has been demonstrated, these methods often fail to generalize to unseen domains.
arXiv Detail & Related papers (2020-01-23T18:55:43Z)