Unsupervised Part Discovery via Feature Alignment
- URL: http://arxiv.org/abs/2012.00313v1
- Date: Tue, 1 Dec 2020 07:25:00 GMT
- Title: Unsupervised Part Discovery via Feature Alignment
- Authors: Mengqi Guo, Yutong Bai, Zhishuai Zhang, Adam Kortylewski, Alan Yuille
- Abstract summary: We exploit the property that neural network features are largely invariant to nuisance variables.
We find a set of similar images that show instances of the same object category in the same pose, through an affine alignment of their corresponding feature maps.
During inference, part detection is simple and fast, without any extra modules or overheads other than a feed-forward neural network.
- Score: 15.67978793872039
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding objects in terms of their individual parts is important,
because it enables a precise understanding of the objects' geometrical
structure, and enhances object recognition when the object is seen in a novel
pose or under partial occlusion. However, the manual annotation of parts in
large scale datasets is time consuming and expensive. In this paper, we aim at
discovering object parts in an unsupervised manner, i.e., without ground-truth
part or keypoint annotations. Our approach builds on the intuition that objects
of the same class in a similar pose should have their parts aligned at similar
spatial locations. We exploit the property that neural network features are
largely invariant to nuisance variables and the main remaining source of
variations between images of the same object category is the object pose.
Specifically, given a training image, we find a set of similar images that show
instances of the same object category in the same pose, through an affine
alignment of their corresponding feature maps. The average of the aligned
feature maps serves as pseudo ground-truth annotation for a supervised training
of the deep network backbone. During inference, part detection is simple and
fast, without any extra modules or overheads other than a feed-forward neural
network. Our experiments on several datasets from different domains verify the
effectiveness of the proposed method. For example, we achieve 37.8 mAP on
VehiclePart, at least 4.2 points higher than previous methods.
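The pseudo-labeling step described in the abstract (affinely aligning the feature maps of similar-pose instances onto a training image and averaging them) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the similar images and their 2x3 affine parameters have already been found, and `pseudo_part_labels` is a hypothetical helper name.

```python
import numpy as np
from scipy.ndimage import affine_transform

def pseudo_part_labels(anchor, neighbors, affines):
    """Average affinely aligned feature maps to form a pseudo annotation.

    anchor:    (C, H, W) feature map of the training image
    neighbors: list of (C, H, W) feature maps of same-category,
               similar-pose instances
    affines:   list of 2x3 matrices; each maps a neighbor's spatial
               grid onto the anchor's grid
    """
    aligned = [anchor]
    for feat, A in zip(neighbors, affines):
        # Warp every channel of the neighbor map onto the anchor grid
        # with bilinear interpolation (order=1).
        warped = np.stack([
            affine_transform(ch, A[:, :2], offset=A[:, 2], order=1)
            for ch in feat
        ])
        aligned.append(warped)
    # The mean of the aligned maps serves as the pseudo ground truth
    # for supervised training of the backbone.
    return np.mean(aligned, axis=0)
```

In the paper's setting the averaged map would then supervise the backbone directly, so inference needs nothing beyond a single feed-forward pass.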
Related papers
- PDiscoNet: Semantically consistent part discovery for fine-grained
recognition [62.12602920807109]
We propose PDiscoNet to discover object parts by using only image-level class labels along with priors on the discovered parts.
Our results on CUB, CelebA, and PartImageNet show that the proposed method provides substantially better part discovery performance than previous methods.
arXiv Detail & Related papers (2023-09-06T17:19:29Z) - Variable Radiance Field for Real-Life Category-Specific Reconstruction
from Single Image [27.290232027686237]
We present a novel framework that can reconstruct category-specific objects from a single image without known camera parameters.
We parameterize the geometry and appearance of the object using a multi-scale global feature extractor.
We also propose a contrastive learning-based pretraining strategy to improve the feature extractor.
arXiv Detail & Related papers (2023-06-08T12:12:02Z) - Image Segmentation-based Unsupervised Multiple Objects Discovery [1.7674345486888503]
Unsupervised object discovery aims to localize objects in images.
We propose a fully unsupervised, bottom-up approach, for multiple objects discovery.
We provide state-of-the-art results for both unsupervised class-agnostic object detection and unsupervised image segmentation.
arXiv Detail & Related papers (2022-12-20T09:48:24Z) - ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z) - Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z) - Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z) - Multi-patch Feature Pyramid Network for Weakly Supervised Object
Detection in Optical Remote Sensing Images [39.25541709228373]
We propose a new architecture for object detection with a multiple patch feature pyramid network (MPFP-Net)
Unlike current models, which pursue only the most discriminative patches during training, MPFP-Net considers multiple patches.
We introduce an effective method to regularize the residual values and make the fusion transition layers strictly norm-preserving.
arXiv Detail & Related papers (2021-08-18T09:25:39Z) - A Simple and Effective Use of Object-Centric Images for Long-Tailed
Object Detection [56.82077636126353]
We take advantage of object-centric images to improve object detection in scene-centric images.
We present a simple yet surprisingly effective framework to do so.
Our approach can improve the object detection (and instance segmentation) accuracy of rare objects by 50% (and 33%) relatively.
arXiv Detail & Related papers (2021-02-17T17:27:21Z) - Look-into-Object: Self-supervised Structure Modeling for Object
Recognition [71.68524003173219]
We propose to "look into object" (explicitly yet intrinsically model the object structure) through incorporating self-supervisions.
We show the recognition backbone can be substantially enhanced for more robust representation learning.
Our approach achieves large performance gain on a number of benchmarks, including generic object recognition (ImageNet) and fine-grained object recognition tasks (CUB, Cars, Aircraft)
arXiv Detail & Related papers (2020-03-31T12:22:51Z) - OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features [14.115782214599015]
One-shot object detection consists in detecting objects defined by a single demonstration.
We build the one-stage system that performs localization and recognition jointly.
Experimental evaluation on several challenging domains shows that our method can detect unseen classes.
arXiv Detail & Related papers (2020-03-15T11:39:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.