Multi-View Active Fine-Grained Recognition
- URL: http://arxiv.org/abs/2206.01153v1
- Date: Thu, 2 Jun 2022 17:12:14 GMT
- Title: Multi-View Active Fine-Grained Recognition
- Authors: Ruoyi Du, Wenqing Yu, Heqing Wang, Dongliang Chang, Ting-En Lin,
Yongbin Li, Zhanyu Ma
- Abstract summary: Fine-grained visual classification (FGVC) has been developed for decades.
Discriminative information is not only present within seen local regions but also hidden in other unseen perspectives.
We propose a policy-gradient-based framework to achieve efficient recognition with active view selection.
- Score: 29.980409725777292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As fine-grained visual classification (FGVC) has been developed for decades,
related works have exposed a key direction -- finding discriminative local regions and
revealing subtle differences. However, unlike identifying visual content within static
images, for recognizing objects in the real physical world, discriminative information is
not only present within seen local regions but also hidden in other unseen perspectives.
In other words, in addition to focusing on the distinguishable parts of the whole,
efficient and accurate recognition requires inferring the key perspective within a few
glances, e.g., people may recognize a "Benz AMG GT" with a glance at its front and then
know that taking a look at its exhaust pipes can help to tell which year's model it is.
In this paper, turning back to the real world, we put forward the problem of active
fine-grained recognition (AFGR) and complete this study in three steps: (i) a
hierarchical, multi-view, fine-grained vehicle dataset is collected as the testbed;
(ii) a simple experiment is designed to verify that different perspectives contribute
differently to FGVC and that different categories have different discriminative
perspectives; (iii) a policy-gradient-based framework is adopted to achieve efficient
recognition with active view selection. Comprehensive experiments demonstrate that the
proposed method delivers a better performance-efficiency trade-off than previous FGVC
methods and advanced neural networks.
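To make step (iii) concrete, below is a minimal sketch of how a policy-gradient (REINFORCE-style) active view selection loop could look. All names here (ViewEncoder, ViewPolicy, episode_loss), the additive feature fusion, the 0/1 reward with a fixed 0.5 baseline, and the eight-view setup are illustrative assumptions for this page, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_VIEWS, NUM_CLASSES, FEAT_DIM = 8, 100, 256  # assumed sizes, not from the paper

class ViewEncoder(nn.Module):
    """Stand-in backbone: encodes one view image into a feature vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.LazyLinear(FEAT_DIM), nn.ReLU())

    def forward(self, x):
        return self.net(x)

class ViewPolicy(nn.Module):
    """Maps the current fused feature to a distribution over which view to glance at next."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(FEAT_DIM, NUM_VIEWS)

    def forward(self, state):
        return F.softmax(self.fc(state), dim=-1)

encoder, policy, classifier = ViewEncoder(), ViewPolicy(), nn.Linear(FEAT_DIM, NUM_CLASSES)

def episode_loss(views, label, max_glances=3):
    """views: (NUM_VIEWS, C, H, W) renderings of one object; label: shape-(1,) long tensor."""
    state = torch.zeros(1, FEAT_DIM)
    log_probs = []
    for _ in range(max_glances):
        probs = policy(state)                      # choose the next view to inspect
        dist = torch.distributions.Categorical(probs)
        action = dist.sample()                     # shape (1,)
        log_probs.append(dist.log_prob(action))
        state = state + encoder(views[action])     # simple additive fusion (assumption)
    logits = classifier(state)
    cls_loss = F.cross_entropy(logits, label)
    # REINFORCE: reward 1 for a correct prediction, 0 otherwise; 0.5 acts as a crude baseline
    reward = (logits.argmax(dim=-1) == label).float().detach()
    pg_loss = -(torch.stack(log_probs).sum() * (reward - 0.5)).mean()
    return cls_loss + pg_loss

# Dummy usage: eight 3x32x32 views of one vehicle, ground-truth class 7.
views, label = torch.randn(NUM_VIEWS, 3, 32, 32), torch.tensor([7])
loss = episode_loss(views, label)
loss.backward()
```

In this sketch the cross-entropy term trains the encoder and classifier, while the detached reward drives the policy toward glance sequences that lead to correct predictions, which is the general shape of an active view selection objective rather than the paper's exact formulation.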
Related papers
- Salient Mask-Guided Vision Transformer for Fine-Grained Classification [48.1425692047256]
Fine-grained visual classification (FGVC) is a challenging computer vision problem.
One of its main difficulties is capturing the most discriminative inter-class variances.
We introduce a simple yet effective Salient Mask-Guided Vision Transformer (SM-ViT).
arXiv Detail & Related papers (2023-05-11T19:24:33Z)
- Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704]
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes.
We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective.
arXiv Detail & Related papers (2023-03-03T02:07:40Z)
- Learning Diversified Feature Representations for Facial Expression Recognition in the Wild [97.14064057840089]
We propose a mechanism to diversify the features extracted by CNN layers of state-of-the-art facial expression recognition architectures.
Experimental results on three well-known facial expression recognition in-the-wild datasets, AffectNet, FER+, and RAF-DB, show the effectiveness of our method.
arXiv Detail & Related papers (2022-10-17T19:25:28Z)
- Deep Collaborative Multi-Modal Learning for Unsupervised Kinship Estimation [53.62256887837659]
Kinship verification is a long-standing research challenge in computer vision.
We propose a novel deep collaborative multi-modal learning (DCML) to integrate the underlying information presented in facial properties.
Our DCML method consistently outperforms several state-of-the-art kinship verification methods.
arXiv Detail & Related papers (2021-09-07T01:34:51Z)
- Silhouette based View embeddings for Gait Recognition under Multiple Views [46.087837374748005]
We propose a compatible framework that can embed view information into existing architectures of gait recognition.
Experimental results on two large public datasets show that the proposed framework is very effective.
arXiv Detail & Related papers (2021-08-12T04:19:04Z)
- Distribution Alignment: A Unified Framework for Long-tail Visual Recognition [52.36728157779307]
We propose a unified distribution alignment strategy for long-tail visual recognition.
We then introduce a generalized re-weight method in the two-stage learning to balance the class prior.
Our approach achieves the state-of-the-art results across all four recognition tasks with a simple and unified framework.
arXiv Detail & Related papers (2021-03-30T14:09:53Z)
- View-Invariant Gait Recognition with Attentive Recurrent Learning of Partial Representations [27.33579145744285]
We propose a network that first learns to extract gait convolutional energy maps (GCEM) from frame-level convolutional features.
It then adopts a bidirectional neural network to learn from split bins of the GCEM, thus exploiting the relations between learned partial recurrent representations.
Our proposed model has been extensively tested on two large-scale gait datasets, CASIA-B and OU-M.
arXiv Detail & Related papers (2020-10-18T20:20:43Z)
- Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches [67.51747235117]
Fine-grained visual classification (FGVC) is much more challenging than traditional classification tasks.
Recent works mainly tackle this problem by focusing on how to locate the most discriminative parts.
We propose a novel framework for fine-grained visual classification to tackle these problems.
arXiv Detail & Related papers (2020-03-08T19:27:30Z)