LoDisc: Learning Global-Local Discriminative Features for
Self-Supervised Fine-Grained Visual Recognition
- URL: http://arxiv.org/abs/2403.04066v1
- Date: Wed, 6 Mar 2024 21:36:38 GMT
- Title: LoDisc: Learning Global-Local Discriminative Features for
Self-Supervised Fine-Grained Visual Recognition
- Authors: Jialu Shi, Zhiqiang Wei, Jie Nie, Lei Huang
- Abstract summary: We incorporate subtle local fine-grained feature learning into global self-supervised contrastive learning.
A novel pretext task called Local Discrimination (LoDisc) is proposed to explicitly steer the self-supervised model's focus toward pivotal local regions.
We show that the Local Discrimination pretext task can effectively enhance fine-grained clues in important local regions, and that the global-local framework further refines the fine-grained feature representations of images.
- Score: 18.442966979622717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised contrastive learning has attracted remarkable
attention due to its exceptional ability in representation learning. However,
current contrastive learning tends to learn global coarse-grained
representations of the image that benefit generic object recognition, whereas
such coarse-grained features are insufficient for fine-grained visual
recognition. In this paper, we incorporate subtle local
fine-grained feature learning into global self-supervised contrastive learning
through a purely self-supervised global-local fine-grained contrastive learning
framework. Specifically, a novel pretext task called Local Discrimination
(LoDisc) is proposed to explicitly steer the self-supervised model's focus
toward pivotal local regions, which are captured by a simple but effective
location-wise mask sampling strategy. We show that the Local Discrimination pretext
task can effectively enhance fine-grained clues in important local regions, and
that the global-local framework further refines the fine-grained feature
representations of images. Extensive experimental results on different
fine-grained object recognition tasks demonstrate that the proposed method can
lead to a decent improvement in different evaluation settings. Meanwhile, the
proposed method is also effective in general object recognition tasks.
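The abstract names two ingredients without giving their details: a location-wise mask sampling step that picks pivotal local regions, and a contrastive objective applied to the resulting views. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes each image patch already has a saliency score (how LoDisc obtains such scores is not stated here), keeps the top-scoring patch locations, and uses the standard InfoNCE loss as a stand-in contrastive objective. The names `location_wise_mask`, `info_nce`, `keep_ratio`, and `temperature` are hypothetical.

```python
import math

def location_wise_mask(patch_scores, keep_ratio=0.25):
    """Return a boolean list marking the highest-scoring patch locations.

    patch_scores is an assumed per-patch saliency score; keep_ratio is an
    illustrative hyperparameter, not a value taken from the paper.
    """
    n = len(patch_scores)
    k = max(1, round(keep_ratio * n))  # always keep at least one patch
    ranked = sorted(range(n), key=lambda i: patch_scores[i], reverse=True)
    keep = set(ranked[:k])
    return [i in keep for i in range(n)]

def _normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(queries, keys, temperature=0.2):
    """Standard InfoNCE loss: keys[i] is the positive for queries[i],
    all other keys in the batch act as negatives."""
    q = [_normalize(v) for v in queries]
    k = [_normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        m = max(logits)  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(q)
```

In a global-local setup of the kind the abstract describes, one would presumably compute such a loss once on embeddings of full views and once on embeddings of the masked local views, then combine the two terms; the weighting between them is left open here because the abstract does not specify it.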