Interpretable and Accurate Fine-grained Recognition via Region Grouping
- URL: http://arxiv.org/abs/2005.10411v1
- Date: Thu, 21 May 2020 01:18:26 GMT
- Title: Interpretable and Accurate Fine-grained Recognition via Region Grouping
- Authors: Zixuan Huang, Yin Li
- Abstract summary: We present an interpretable deep model for fine-grained visual recognition.
At the core of our method lies the integration of region-based part discovery and attribution within a deep neural network.
Our results compare favorably to state-of-the-art methods on classification tasks.
- Score: 14.28113520947247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an interpretable deep model for fine-grained visual recognition.
At the core of our method lies the integration of region-based part discovery
and attribution within a deep neural network. Our model is trained using
image-level object labels, and provides an interpretation of its results via
the segmentation of object parts and the identification of their contributions
towards classification. To facilitate the learning of object parts without
direct supervision, we explore a simple prior of the occurrence of object
parts. We demonstrate that this prior, when combined with our region-based part
discovery and attribution, leads to an interpretable model that remains highly
accurate. Our model is evaluated on major fine-grained recognition datasets,
including CUB-200, CelebA and iNaturalist. Our results compare favorably to
state-of-the-art methods on classification tasks, and our method outperforms
previous approaches on the localization of object parts.
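To make the abstract's pipeline concrete, below is a minimal sketch of region-based part discovery and attribution with an occurrence prior, written in PyTorch. It is an assumption-based illustration, not the authors' implementation: the module names, the number of parts, the pooling and attribution choices, and the exact form of the occurrence-prior regularizer are all hypothetical.

```python
# Minimal sketch (not the authors' code) of region-based part discovery and
# attribution. Module names, K, and the occurrence prior are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartGroupingClassifier(nn.Module):
    def __init__(self, backbone_dim=2048, num_parts=8, num_classes=200):
        super().__init__()
        # 1x1 conv softly assigns each spatial location to one of K parts
        self.part_assign = nn.Conv2d(backbone_dim, num_parts, kernel_size=1)
        # per-part attribution weight and a shared classifier over part features
        self.attribution = nn.Linear(backbone_dim, 1)
        self.classifier = nn.Linear(backbone_dim, num_classes)

    def forward(self, feats):                      # feats: (B, C, H, W) backbone features
        assign = F.softmax(self.part_assign(feats), dim=1)   # (B, K, H, W) part segmentation
        flat_f = feats.flatten(2)                  # (B, C, H*W)
        flat_a = assign.flatten(2)                 # (B, K, H*W)
        # region pooling: average backbone features inside each soft part region
        part_feats = torch.einsum('bkn,bcn->bkc', flat_a, flat_f)
        part_feats = part_feats / (flat_a.sum(-1, keepdim=True) + 1e-6)
        # attribution: how much each part contributes to the prediction
        attr = torch.softmax(self.attribution(part_feats).squeeze(-1), dim=1)  # (B, K)
        logits = self.classifier((attr.unsqueeze(-1) * part_feats).sum(1))
        occurrence = assign.amax(dim=(2, 3))       # (B, K) per-image part occurrence
        return logits, assign, attr, occurrence

def occurrence_prior_loss(occurrence, target_rate=0.5):
    # Hypothetical regularizer standing in for the "simple prior of the
    # occurrence of object parts": each part should occur in roughly a fixed
    # fraction of the images in a batch.
    return ((occurrence.mean(0) - target_rate) ** 2).mean()
```

The design point this sketch highlights is that the soft assignment map doubles as the part segmentation used for interpretation, while the attribution weights expose each part's contribution to the final prediction.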
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
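The summary above mentions PCA for localizing object regions. A minimal sketch of one common realization, projecting per-location CNN features onto the first principal component and thresholding the resulting saliency map, is shown below; the function name, the sign-flip heuristic, and the threshold are illustrative assumptions rather than the paper's exact procedure.

```python
# Assumption-based illustration (not the paper's code) of PCA-based
# localization from CNN feature maps.
import numpy as np

def pca_localize(feature_map, threshold=0.0):
    """feature_map: (C, H, W) activations from a self-supervised backbone."""
    C, H, W = feature_map.shape
    X = feature_map.reshape(C, -1).T            # (H*W, C): one feature vector per location
    X = X - X.mean(axis=0, keepdims=True)       # center before PCA
    # first principal direction via SVD of the centered features
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    saliency = (X @ vt[0]).reshape(H, W)        # projection onto the first component
    # sign is arbitrary; flip so the (assumed smaller) foreground is positive
    if (saliency > 0).mean() > 0.5:
        saliency = -saliency
    return saliency > threshold                 # boolean foreground mask
```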
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- PARTICUL: Part Identification with Confidence measure using Unsupervised Learning [0.0]
PARTICUL is a novel algorithm for unsupervised learning of part detectors from datasets used in fine-grained recognition.
It exploits the macro-similarities of all images in the training set in order to mine for recurring patterns in the feature space of a pre-trained convolutional neural network.
We show that our detectors can consistently highlight parts of the object while providing a good measure of the confidence in their prediction.
arXiv Detail & Related papers (2022-06-27T13:44:49Z)
- Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
Our approach performs contrastive learning by directly sampling individual point pairs from different regions.
Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
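As a rough illustration of point-level region contrast (an assumption-based sketch, not the authors' implementation), the loss below treats sampled point features from two views as positives when they come from the same region and as negatives otherwise:

```python
# Illustrative point-level contrastive loss; names and signature are hypothetical.
import torch
import torch.nn.functional as F

def point_level_region_contrast(points_a, points_b, region_ids_a, region_ids_b, tau=0.1):
    """points_a, points_b: (N, D) point features from two augmented views;
    region_ids_*: (N,) region label of each sampled point."""
    za = F.normalize(points_a, dim=1)
    zb = F.normalize(points_b, dim=1)
    logits = za @ zb.t() / tau                       # (N, N) point-to-point similarities
    # positives: pairs of points drawn from the same region across the two views
    pos_mask = (region_ids_a.unsqueeze(1) == region_ids_b.unsqueeze(0)).float()
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # average log-likelihood over the positive pairs of each anchor point
    loss = -(pos_mask * log_prob).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()
```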
arXiv Detail & Related papers (2022-02-09T18:56:41Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition [111.87412719773889]
We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data.
Our method achieves the best or competitive performance with the state-of-the-art methods for human action recognition.
arXiv Detail & Related papers (2021-10-28T10:09:34Z)
- Self-supervised Segmentation via Background Inpainting [96.10971980098196]
We introduce a self-supervised detection and segmentation approach that can work with single images captured by a potentially moving camera.
We exploit a self-supervised loss function to train a proposal-based segmentation network.
We apply our method to human detection and segmentation in images that visually depart from those of standard benchmarks and outperform existing self-supervised methods.
arXiv Detail & Related papers (2020-11-11T08:34:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.