Feature Boosting, Suppression, and Diversification for Fine-Grained
Visual Classification
- URL: http://arxiv.org/abs/2103.02782v1
- Date: Thu, 4 Mar 2021 01:49:53 GMT
- Title: Feature Boosting, Suppression, and Diversification for Fine-Grained
Visual Classification
- Authors: Jianwei Song, Ruoyu Yang
- Abstract summary: Learning feature representation from discriminative local regions plays a key role in fine-grained visual classification.
We introduce two lightweight modules that can be easily plugged into existing convolutional neural networks.
Our method achieves state-of-the-art performances on several benchmark fine-grained datasets.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning feature representation from discriminative local regions plays a key
role in fine-grained visual classification. Employing attention mechanisms to
extract part features has become a trend. However, there are two major
limitations in these methods: First, they often focus on the most salient part
while neglecting other inconspicuous but distinguishable parts. Second, they
treat different part features in isolation while neglecting their
relationships. To handle these limitations, we propose to locate multiple
different distinguishable parts and explore their relationships in an explicit
way. In this pursuit, we introduce two lightweight modules that can be easily
plugged into existing convolutional neural networks. On one hand, we introduce
a feature boosting and suppression module that boosts the most salient part of
feature maps to obtain a part-specific representation and suppresses it to
force the following network to mine other potential parts. On the other hand,
we introduce a feature diversification module that learns semantically
complementary information from the correlated part-specific representations.
Our method does not need bounding boxes/part annotations and can be trained
end-to-end. Extensive experimental results show that our method achieves
state-of-the-art performances on several benchmark fine-grained datasets.
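The boost-then-suppress idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, that a feature map is a 2D list split into equal vertical stripes ("parts"), that a part's saliency is its mean activation (a stand-in for a learned scorer), and that `alpha` and `beta` are hypothetical boosting and suppression strengths. The function names are likewise invented for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def stripe_scores(feature_map, num_parts):
    """Mean activation per vertical stripe (assumed saliency proxy)."""
    stripe = len(feature_map[0]) // num_parts
    scores = []
    for p in range(num_parts):
        vals = [row[c] for row in feature_map
                for c in range(p * stripe, (p + 1) * stripe)]
        scores.append(sum(vals) / len(vals))
    return scores


def boost_and_suppress(feature_map, num_parts=2, alpha=0.5, beta=0.5):
    """Return (boosted, suppressed, index of the most salient part).

    boosted:    every stripe is amplified in proportion to its softmax weight,
                so the most salient stripe is amplified the most.
    suppressed: only the most salient stripe is scaled down by (1 - beta),
                leaving the other stripes for later layers to mine.
    """
    width = len(feature_map[0])
    if width % num_parts != 0:
        raise ValueError("width must be divisible by num_parts")
    stripe = width // num_parts

    weights = softmax(stripe_scores(feature_map, num_parts))
    top = weights.index(max(weights))

    boosted = [[v * (1 + alpha * weights[c // stripe])
                for c, v in enumerate(row)] for row in feature_map]
    suppressed = [[v * (1 - beta) if c // stripe == top else v
                   for c, v in enumerate(row)] for row in feature_map]
    return boosted, suppressed, top


if __name__ == "__main__":
    fmap = [[1.0, 1.0, 4.0, 4.0],
            [1.0, 1.0, 4.0, 4.0]]
    boosted, suppressed, top = boost_and_suppress(fmap)
    print("most salient part:", top)
    print("suppressed map:", suppressed)
```

In this sketch the boosted map would feed a part-specific classifier, while the suppressed map would be passed to the next stage so that a different part becomes the most salient one there. The diversification module that relates the part-specific representations is not covered here.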
Related papers
- DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and
Authentication [50.017055360261665]
We introduce DiffVein, a unified diffusion model-based framework which simultaneously addresses vein segmentation and authentication tasks.
For better feature interaction between these two branches, we introduce two specialized modules.
In this way, our framework allows for a dynamic interplay between diffusion and segmentation embeddings.
arXiv Detail & Related papers (2024-02-03T06:49:42Z)
- Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification [78.52704557647438]
We propose a novel FIne-grained Representation and Recomposition (FIRe$^{2}$) framework to tackle both limitations without any auxiliary annotation or data.
Experiments demonstrate that FIRe$^{2}$ can achieve state-of-the-art performance on five widely-used cloth-changing person Re-ID benchmarks.
arXiv Detail & Related papers (2023-08-21T12:59:48Z)
- Semantic Prompt for Few-Shot Image Recognition [76.68959583129335]
We propose a novel Semantic Prompt (SP) approach for few-shot learning.
The proposed approach achieves promising results, improving the 1-shot learning accuracy by 3.67% on average.
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- Semantic Feature Integration network for Fine-grained Visual
Classification [5.182627302449368]
We propose the Semantic Feature Integration network (SFI-Net) to address the above difficulties.
By eliminating unnecessary features and reconstructing the semantic relations among discriminative features, our SFI-Net has achieved satisfying performance.
arXiv Detail & Related papers (2023-02-13T07:32:25Z)
- Part-guided Relational Transformers for Fine-grained Visual Recognition [59.20531172172135]
We propose a framework to learn the discriminative part features and explore correlations with a feature transformation module.
Our proposed approach does not rely on additional part branches and reaches state-of-the-art performance on 3-of-the-level object recognition.
arXiv Detail & Related papers (2022-12-28T03:45:56Z)
- Part-aware Prototypical Graph Network for One-shot Skeleton-based Action
Recognition [57.86960990337986]
One-shot skeleton-based action recognition poses unique challenges in learning transferable representation from base classes to novel classes.
We propose a part-aware prototypical representation for one-shot skeleton-based action recognition.
We demonstrate the effectiveness of our method on two public skeleton-based action recognition datasets.
arXiv Detail & Related papers (2022-08-19T04:54:56Z)
- Learning Debiased and Disentangled Representations for Semantic
Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- FINet: Dual Branches Feature Interaction for Partial-to-Partial Point
Cloud Registration [31.014309817116175]
We present FINet, a feature interaction-based structure that enables and strengthens the association of information between the inputs at multiple stages.
Experiments demonstrate that our method achieves higher precision and robustness than state-of-the-art traditional and learning-based methods.
arXiv Detail & Related papers (2021-06-07T10:15:02Z)
- Unsupervised segmentation via semantic-apparent feature fusion [21.75371777263847]
This research proposes an unsupervised foreground segmentation method based on semantic-apparent feature fusion (SAFF).
Semantic features respond accurately to the key regions of the foreground object, while apparent features provide richer detail.
By fusing semantic and apparent features, as well as cascading the modules of intra-image adaptive feature weight learning and inter-image common feature learning, the research achieves performance that significantly exceeds baselines.
arXiv Detail & Related papers (2020-05-21T08:28:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.