Fine-grained Visual Classification with High-temperature Refinement and
Background Suppression
- URL: http://arxiv.org/abs/2303.06442v2
- Date: Tue, 25 Apr 2023 00:51:55 GMT
- Title: Fine-grained Visual Classification with High-temperature Refinement and
Background Suppression
- Authors: Po-Yung Chou, Yu-Yung Kao, Cheng-Hung Lin
- Abstract summary: We propose a novel network called "High-temperaturE Refinement and Background Suppression" (HERBS).
HERBS fuses features of varying scales, suppresses background noise, and extracts discriminative features at appropriate scales for fine-grained visual classification.
The proposed method achieves state-of-the-art performance on the CUB-200-2011 and NABirds benchmarks, surpassing 93% accuracy on both datasets.
- Score: 0.19336815376402716
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-grained visual classification is a challenging task due to the high
similarity between categories and distinct differences among data within one
single category. To address the challenges, previous strategies have focused on
localizing subtle discrepancies between categories and enhancing the
discriminative features in them. However, the background also provides
important information that can tell the model which features are unnecessary or
even harmful for classification, and models that rely too heavily on subtle
features may overlook global features and contextual information. In this
paper, we propose a novel network called "High-temperaturE Refinement and
Background Suppression" (HERBS), which consists of two modules, namely, the
high-temperature refinement module and the background suppression module, for
extracting discriminative features and suppressing background noise,
respectively. The high-temperature refinement module allows the model to learn
the appropriate feature scales by refining the features map at different scales
and improving the learning of diverse features. And, the background suppression
module first splits the features map into foreground and background using
classification confidence scores and suppresses feature values in
low-confidence areas while enhancing discriminative features. The experimental
results show that the proposed HERBS effectively fuses features of varying
scales, suppresses background noise, and extracts discriminative features at
appropriate scales for fine-grained visual classification. The proposed method achieves
state-of-the-art performance on the CUB-200-2011 and NABirds benchmarks,
surpassing 93% accuracy on both datasets. Thus, HERBS presents a promising
solution for improving the performance of fine-grained visual classification
tasks. code: https://github.com/chou141253/FGVC-HERBS
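The abstract describes both modules only at a high level. The following is a minimal PyTorch sketch of one plausible reading: a confidence-based foreground/background split that damps low-confidence feature locations, and a high-temperature softened consistency loss between class predictions made at two feature scales. The module names, the top-k thresholding rule, the suppression factor, the KL-divergence form of the refinement loss, and the temperature value are all illustrative assumptions, not the authors' implementation (which is available at the repository linked above).

# Illustrative sketch only; names, shapes, and hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BackgroundSuppression(nn.Module):
    """Split a feature map into foreground/background by per-location
    classification confidence and damp the low-confidence locations."""

    def __init__(self, in_channels: int, num_classes: int,
                 keep_ratio: float = 0.5, suppress_factor: float = 0.1):
        super().__init__()
        # 1x1 classifier used only to score how class-discriminative a location is.
        self.pixel_classifier = nn.Conv2d(in_channels, num_classes, kernel_size=1)
        self.keep_ratio = keep_ratio            # fraction of locations kept as foreground
        self.suppress_factor = suppress_factor  # how strongly background is damped

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W)
        logits = self.pixel_classifier(feats)            # (B, K, H, W)
        confidence = logits.softmax(dim=1).amax(dim=1)   # (B, H, W): max class probability
        b, h, w = confidence.shape
        k = max(1, int(self.keep_ratio * h * w))
        # Per-image threshold = confidence of the k-th most confident location.
        thresh = confidence.flatten(1).topk(k, dim=1).values[:, -1]
        fg_mask = (confidence >= thresh.view(b, 1, 1)).float().unsqueeze(1)
        # Keep foreground features; scale down (suppress) background features.
        return feats * fg_mask + feats * (1.0 - fg_mask) * self.suppress_factor


def high_temperature_refinement(logits_a: torch.Tensor,
                                logits_b: torch.Tensor,
                                temperature: float = 64.0) -> torch.Tensor:
    """Softened KL consistency between class predictions obtained from two
    different feature scales; the high temperature keeps non-peak class
    evidence alive so diverse features are learned."""
    log_p = F.log_softmax(logits_a / temperature, dim=-1)
    q = F.softmax(logits_b / temperature, dim=-1).detach()  # one scale acts as the target
    return F.kl_div(log_p, q, reduction="batchmean") * temperature ** 2


if __name__ == "__main__":
    feats = torch.randn(2, 256, 14, 14)
    refined = BackgroundSuppression(in_channels=256, num_classes=200)(feats)
    print(refined.shape)  # torch.Size([2, 256, 14, 14])

The high temperature flattens the target distribution so that secondary, scale-specific class evidence is retained rather than collapsed onto the top-1 class; keep_ratio, suppress_factor, and temperature would all need tuning in practice.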
Related papers
- Semantic Feature Integration network for Fine-grained Visual
Classification [5.182627302449368]
We propose the Semantic Feature Integration network (SFI-Net) to address the above difficulties.
By eliminating unnecessary features and reconstructing the semantic relations among discriminative features, our SFI-Net has achieved satisfactory performance.
arXiv Detail & Related papers (2023-02-13T07:32:25Z)
- Boosting Few-shot Fine-grained Recognition with Background Suppression and Foreground Alignment [53.401889855278704]
Few-shot fine-grained recognition (FS-FGR) aims to recognize novel fine-grained categories with the help of limited available samples.
We propose a two-stage background suppression and foreground alignment framework, which is composed of a background activation suppression (BAS) module, a foreground object alignment (FOA) module, and a local to local (L2L) similarity metric.
Experiments conducted on multiple popular fine-grained benchmarks demonstrate that our method outperforms the existing state-of-the-art by a large margin.
arXiv Detail & Related papers (2022-10-04T07:54:40Z)
- Fine-grained Retrieval Prompt Tuning [149.9071858259279]
Fine-grained Retrieval Prompt Tuning steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompt and feature adaptation.
Our FRPT with fewer learnable parameters achieves the state-of-the-art performance on three widely-used fine-grained datasets.
arXiv Detail & Related papers (2022-07-29T04:10:04Z)
- R2-Trans: Fine-Grained Visual Categorization with Redundancy Reduction [21.11038841356125]
Fine-grained visual categorization (FGVC) aims to discriminate similar subcategories, whose main challenge lies in the large intra-class diversity and subtle inter-class differences.
We present a novel approach for FGVC, which can simultaneously make use of partial yet sufficient discriminative information in environmental cues and also compress the redundant information in class-token with respect to the target.
arXiv Detail & Related papers (2022-04-21T13:35:38Z)
- A Compositional Feature Embedding and Similarity Metric for Ultra-Fine-Grained Visual Categorization [16.843126268445726]
Fine-grained visual categorization (FGVC) aims at classifying objects with small inter-class variances.
This paper proposes a novel compositional feature embedding and similarity metric (CECS) for ultra-fine-grained visual categorization.
Experimental results on two ultra-FGVC datasets and one FGVC dataset with recent benchmark methods consistently demonstrate that the proposed CECS method achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-09-25T15:05:25Z)
- Enhancing Fine-Grained Classification for Low Resolution Images [97.82441158440527]
Low resolution images suffer from the inherent challenge of limited information content and the absence of fine details useful for sub-category classification.
This research proposes a novel attribute-assisted loss, which utilizes ancillary information to learn discriminative features for classification.
The proposed loss function enables a model to learn class-specific discriminative features, while incorporating attribute-level separability.
arXiv Detail & Related papers (2021-05-01T13:19:02Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to improve the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- Saliency-driven Class Impressions for Feature Visualization of Deep Neural Networks [55.11806035788036]
It is advantageous to visualize the features considered to be essential for classification.
Existing visualization methods develop high confidence images consisting of both background and foreground features.
In this work, we propose a saliency-driven approach to visualize discriminative features that are considered most important for a given task.
arXiv Detail & Related papers (2020-07-31T06:11:06Z)
- Capturing scattered discriminative information using a deep architecture in acoustic scene classification [49.86640645460706]
In this study, we investigate various methods to capture discriminative information and simultaneously mitigate the overfitting problem.
We adopt a max feature map method to replace conventional non-linear activations in a deep neural network (a brief sketch of this activation follows the list below).
Two data augmentation methods and two deep architecture modules are further explored to reduce overfitting and sustain the system's discriminative power.
arXiv Detail & Related papers (2020-07-09T08:32:06Z)
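The max feature map mentioned in the last entry is a competitive activation that splits the channels into two halves and keeps the element-wise maximum, halving the channel count instead of applying ReLU. A minimal PyTorch sketch, assuming the usual channel-halving formulation (the class name and shapes are illustrative, not taken from that paper's code):

import torch
import torch.nn as nn


class MaxFeatureMap(nn.Module):
    """Max-Feature-Map activation: element-wise max over two channel halves."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, 2*C, H, W) -> (B, C, H, W)
        a, b = x.chunk(2, dim=1)
        return torch.max(a, b)


if __name__ == "__main__":
    x = torch.randn(4, 64, 32, 32)
    print(MaxFeatureMap()(x).shape)  # torch.Size([4, 32, 32, 32])

Because only the stronger of each channel pair survives, the activation acts as a built-in feature selector rather than a fixed threshold like ReLU.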