R2-Trans: Fine-Grained Visual Categorization with Redundancy Reduction
- URL: http://arxiv.org/abs/2204.10095v1
- Date: Thu, 21 Apr 2022 13:35:38 GMT
- Title: R2-Trans: Fine-Grained Visual Categorization with Redundancy Reduction
- Authors: Yu Wang, Shuo Ye, Shujian Yu, Xinge You
- Abstract summary: Fine-grained visual categorization (FGVC) aims to discriminate similar subcategories, whose main challenge is the large intraclass diversities and subtle inter-class differences.
We present a novel approach for FGVC, which can simultaneously make use of partial yet sufficient discriminative information in environmental cues and also compress the redundant information in class-token with respect to the target.
- Score: 21.11038841356125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-grained visual categorization (FGVC) aims to discriminate similar
subcategories, whose main challenge is the large intraclass diversities and
subtle inter-class differences. Existing FGVC methods usually select
discriminant regions found by a trained model, which is prone to neglect other
potential discriminant information. On the other hand, the massive interactions
between the sequence of image patches in ViT make the resulting class-token
contain lots of redundant information, which may also impact FGVC performance.
In this paper, we present a novel approach for FGVC, which can simultaneously
make use of partial yet sufficient discriminative information in environmental
cues and also compress the redundant information in class-token with respect to
the target. Specifically, our model calculates the ratio of high-weight regions
in a batch, adaptively adjusts the masking threshold and achieves moderate
extraction of background information in the input space. Moreover, we also use
the Information Bottleneck (IB) approach to guide our network to learn a
minimal sufficient representation in the feature space. Experimental results
on three widely used benchmark datasets verify that our approach
outperforms other state-of-the-art approaches and baseline
models.
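The abstract's description of the input-space step (measure the ratio of high-weight regions in a batch, then adapt the masking threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `cutoff`, `target_ratio`, and `step` parameters, and the simple additive update rule are all assumptions made here for illustration, and attention weights are assumed to be plain per-patch floats in [0, 1].

```python
def high_weight_ratio(batch_weights, cutoff):
    """Fraction of patches in the whole batch whose weight exceeds `cutoff`.

    `batch_weights` is a list of per-image lists of patch attention weights
    (hypothetical input format).
    """
    total = sum(len(weights) for weights in batch_weights)
    if total == 0:
        raise ValueError("batch contains no patches")
    high = sum(1 for weights in batch_weights for w in weights if w > cutoff)
    return high / total


def adapt_threshold(threshold, ratio, target_ratio=0.5, step=0.05):
    """Nudge the masking threshold so the high-weight ratio moves toward
    `target_ratio` (an assumed update rule, not taken from the paper)."""
    if ratio > target_ratio:
        threshold += step  # too many patches count as high-weight: be stricter
    elif ratio < target_ratio:
        threshold -= step  # too few patches count as high-weight: be looser
    return min(max(threshold, 0.0), 1.0)  # keep the threshold in [0, 1]


def mask_patches(weights, threshold):
    """Binary mask for one image: 1 keeps a patch, 0 masks it out."""
    return [1 if w > threshold else 0 for w in weights]
```

In use, one would call `high_weight_ratio` on each batch, feed the result to `adapt_threshold`, and apply `mask_patches` with the updated threshold, so that the amount of retained background varies with the batch rather than being fixed in advance.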
Related papers
- Detail Reinforcement Diffusion Model: Augmentation Fine-Grained Visual Categorization in Few-Shot Conditions [11.121652649243119]
Diffusion models have been widely adopted in data augmentation due to their outstanding diversity in data generation.
We propose a novel approach termed the detail reinforcement diffusion model (DRDM).
It leverages the rich knowledge of large models for fine-grained data augmentation and comprises two key components: discriminative semantic recombination (DSR) and spatial knowledge reference (SKR).
arXiv Detail & Related papers (2023-09-15T01:28:59Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Salient Mask-Guided Vision Transformer for Fine-Grained Classification [48.1425692047256]
Fine-grained visual classification (FGVC) is a challenging computer vision problem.
One of its main difficulties is capturing the most discriminative inter-class variances.
We introduce a simple yet effective Salient Mask-Guided Vision Transformer (SM-ViT).
arXiv Detail & Related papers (2023-05-11T19:24:33Z)
- Class-Specific Variational Auto-Encoder for Content-Based Image Retrieval [95.42181254494287]
We propose a regularized loss for Variational Auto-Encoders (VAEs) forcing the model to focus on a given class of interest.
As a result, the model learns to discriminate the data belonging to the class of interest from any other possibility.
Experimental results show that the proposed method outperforms its competition in both in-domain and out-of-domain retrieval problems.
arXiv Detail & Related papers (2023-04-23T19:51:25Z)
- A Compositional Feature Embedding and Similarity Metric for Ultra-Fine-Grained Visual Categorization [16.843126268445726]
Fine-grained visual categorization (FGVC) aims at classifying objects with small inter-class variances.
This paper proposes a novel compositional feature embedding and similarity metric (CECS) for ultra-fine-grained visual categorization.
Experimental results on two ultra-FGVC datasets and one FGVC dataset with recent benchmark methods consistently demonstrate that the proposed CECS method achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-09-25T15:05:25Z)
- Cross-layer Navigation Convolutional Neural Network for Fine-grained Visual Classification [21.223130735592516]
Fine-grained visual classification (FGVC) aims to classify sub-classes of objects in the same super-class.
For the FGVC tasks, the essential solution is to find discriminative subtle information of the target from local regions.
We propose a cross-layer navigation convolutional neural network for feature fusion.
arXiv Detail & Related papers (2021-06-21T08:38:27Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- Interpretable Attention Guided Network for Fine-grained Visual Classification [36.657203916383594]
Fine-grained visual classification (FGVC) is challenging but more critical than traditional classification tasks.
We propose an Interpretable Attention Guided Network (IAGN) for fine-grained visual classification.
arXiv Detail & Related papers (2021-03-08T12:27:51Z)
- Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches [67.51747235117]
Fine-grained visual classification (FGVC) is much more challenging than traditional classification tasks.
Recent works mainly tackle this problem by focusing on how to locate the most discriminative parts.
We propose a novel framework for fine-grained visual classification to tackle these problems.
arXiv Detail & Related papers (2020-03-08T19:27:30Z)
- Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification [71.96618723152487]
We introduce the Attention Pyramid Convolutional Neural Network (AP-CNN).
AP-CNN learns both high-level semantic and low-level detailed feature representation.
It can be trained end-to-end, without the need for additional bounding box/part annotations.
arXiv Detail & Related papers (2020-02-09T12:33:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.