On Learning Discriminative Features from Synthesized Data for Self-Supervised Fine-Grained Visual Recognition
- URL: http://arxiv.org/abs/2407.14676v1
- Date: Fri, 19 Jul 2024 21:43:19 GMT
- Title: On Learning Discriminative Features from Synthesized Data for Self-Supervised Fine-Grained Visual Recognition
- Authors: Zihu Wang, Lingqiao Liu, Scott Ricardo Figueroa Weston, Samuel Tian, Peng Li,
- Abstract summary: Self-Supervised Learning (SSL) has become a prominent approach for acquiring visual representations across various tasks.
We introduce a novel strategy that boosts SSL's ability to extract critical discriminative features vital for fine-grained visual recognition.
This approach creates synthesized data pairs to guide the model to focus on discriminative features critical for FGVR.
- Score: 21.137498023391178
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-Supervised Learning (SSL) has become a prominent approach for acquiring visual representations across various tasks, yet its application in fine-grained visual recognition (FGVR) is challenged by the intricate task of distinguishing subtle differences between categories. To overcome this, we introduce a novel strategy that boosts SSL's ability to extract critical discriminative features vital for FGVR. This approach creates synthesized data pairs to guide the model to focus on discriminative features critical for FGVR during SSL. We start by identifying non-discriminative features using two main criteria: features with low variance that fail to effectively separate data and those deemed less important by Grad-CAM induced from the SSL loss. We then introduce perturbations to these non-discriminative features while preserving discriminative ones. A decoder is employed to reconstruct images from both perturbed and original feature vectors to create data pairs. An encoder is trained on such generated data pairs to become invariant to variations in non-discriminative dimensions while focusing on discriminative features, thereby improving the model's performance in FGVR tasks. We demonstrate the promising FGVR performance of the proposed approach through extensive evaluation on a wide variety of datasets.
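The selection-and-perturbation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed thresholds, the Gaussian noise, and the choice to flag a dimension when either criterion fires are all assumptions made for illustration. The Grad-CAM importance scores are taken as a precomputed list, and the decoder that turns feature vectors back into images is omitted.

```python
import random
from statistics import pvariance


def non_discriminative_dims(features, importance, var_threshold, imp_threshold):
    """Return indices of feature dimensions flagged as non-discriminative.

    Assumption: a dimension is flagged if its variance across the batch is
    low OR its (precomputed, hypothetical) importance score is low.
    """
    dim = len(features[0])
    variances = [pvariance([row[d] for row in features]) for d in range(dim)]
    return [
        d for d in range(dim)
        if variances[d] < var_threshold or importance[d] < imp_threshold
    ]


def perturb(feature, dims, rng, scale=1.0):
    """Copy `feature` and add Gaussian noise only to the dimensions in `dims`."""
    perturbed = list(feature)
    for d in dims:
        perturbed[d] += rng.gauss(0.0, scale)
    return perturbed


# Toy batch: dimension 1 is constant (low variance) and dimension 2 has a
# low importance score, so both are treated as non-discriminative.
features = [[0.0, 1.0, 2.0], [4.0, 1.0, 5.0], [8.0, 1.0, 9.0]]
importance = [0.9, 0.8, 0.1]  # stand-in for Grad-CAM-derived scores

dims = non_discriminative_dims(
    features, importance, var_threshold=0.01, imp_threshold=0.2
)
rng = random.Random(0)
pair = (features[0], perturb(features[0], dims, rng))
```

In the approach the abstract describes, both vectors of such a pair would then be passed through a decoder to produce the image pair on which the encoder is trained; here `pair` only holds the original and perturbed feature vectors.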
Related papers
- High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning [54.86882315023791]
We propose an innovative approach called High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning (HDAFL).
HDAFL utilizes multiple convolutional kernels to automatically learn discriminative regions highly correlated with attributes in images.
We also introduce a Transformer-based attribute discrimination encoder to enhance the discriminative capability among attributes.
arXiv Detail & Related papers (2024-04-07T13:17:47Z) - Detail Reinforcement Diffusion Model: Augmentation Fine-Grained Visual Categorization in Few-Shot Conditions [11.121652649243119]
Diffusion models have been widely adopted in data augmentation due to their outstanding diversity in data generation.
We propose a novel approach termed the detail reinforcement diffusion model (DRDM).
It leverages the rich knowledge of large models for fine-grained data augmentation and comprises two key components: discriminative semantic recombination (DSR) and spatial knowledge reference (SKR).
arXiv Detail & Related papers (2023-09-15T01:28:59Z) - Learning Invariant Representation via Contrastive Feature Alignment for Clutter Robust SAR Target Recognition [10.993101256393679]
This letter proposes a solution called Contrastive Feature Alignment (CFA) to learn invariant representation for robust recognition.
The proposed CFA combines both classification and CWMSE losses to train the model jointly, which allows for the progressive learning of invariant target representation.
arXiv Detail & Related papers (2023-04-04T12:35:33Z) - Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704]
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes.
We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective.
arXiv Detail & Related papers (2023-03-03T02:07:40Z) - Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Data-Efficient Instance Generation from Instance Discrimination [40.71055888512495]
We propose a data-efficient Instance Generation (InsGen) method based on instance discrimination.
arXiv Detail & Related papers (2021-06-08T17:52:59Z) - Style Normalization and Restitution for Domain Generalization and Adaptation [88.86865069583149]
An effective domain generalizable model is expected to learn feature representations that are both generalizable and discriminative.
In this paper, we design a novel Style Normalization and Restitution module (SNR) to ensure both high generalization and discrimination capability of the networks.
arXiv Detail & Related papers (2021-01-03T09:01:39Z) - Discriminative feature generation for classification of imbalanced data [6.458496335718508]
We propose a novel supervised discriminative feature generation (DFG) method for a minority class dataset.
DFG is based on the modified structure of a generative adversarial network consisting of four independent networks.
The experimental results show that the DFG generator enhances the augmentation of the label-preserved and diverse features.
arXiv Detail & Related papers (2020-10-24T12:19:05Z) - Adversarial Feature Hallucination Networks for Few-Shot Learning [84.31660118264514]
Adversarial Feature Hallucination Networks (AFHN) is based on conditional Wasserstein Generative Adversarial networks (cWGAN).
Two novel regularizers are incorporated into AFHN to encourage discriminability and diversity of the synthesized features.
arXiv Detail & Related papers (2020-03-30T02:43:16Z) - When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that yields better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss can provide significant improvement on variable vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.