CIGMO: Categorical invariant representations in a deep generative
framework
- URL: http://arxiv.org/abs/2205.13758v1
- Date: Fri, 27 May 2022 04:21:22 GMT
- Title: CIGMO: Categorical invariant representations in a deep generative
framework
- Authors: Haruo Hosoya
- Abstract summary: We introduce a novel deep generative model, called CIGMO, that can learn to represent category, shape, and view factors from image data.
By empirical investigation, we show that our model can effectively discover categories of object shapes despite large view variation.
- Score: 4.111899441919164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image data of general objects exhibit two common structures: (1) each
object of a given shape can be rendered in multiple views, and (2) object shapes
can be categorized such that the diversity of shapes is much larger across
categories than within a category. Existing deep generative models typically
capture one of these structures, but not both. In this work, we introduce a
novel deep generative model, called CIGMO, that learns to represent category,
shape, and view factors from image data. The model comprises multiple modules of
shape representations, each specialized to a particular category and
disentangled from the view representation, and can be learned with a group-based
weakly supervised learning method. Through empirical investigation, we show that
our model can effectively discover categories of object shapes despite large
view variation and quantitatively outperforms various previous methods,
including a state-of-the-art invariant clustering algorithm. Further, we show
that our category-specialization approach enhances the learned shape
representation, yielding better performance on downstream tasks such as one-shot
object identification and shape-view disentanglement.
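To make the described architecture concrete, the following is a minimal, hedged sketch of a CIGMO-style model in PyTorch. It is not the authors' implementation: the class name, the simple linear encoders and decoders, the dimensionalities, and the entropy regularizer are illustrative assumptions. The sketch only reflects the structural idea stated in the abstract: per-category shape modules, a single category-agnostic view encoder, and group-based weak supervision in which every image in a group depicts the same object (same shape, different views).

```python
# Hedged sketch of a CIGMO-style model: K category-specialized shape modules,
# one shared view encoder, and group-based weak supervision (each group is a
# set of views of a single object). Names and layer choices are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CigmoSketch(nn.Module):
    def __init__(self, image_dim=784, shape_dim=20, view_dim=3, n_categories=3):
        super().__init__()
        self.n_categories = n_categories
        # One shape encoder and one decoder per category (category-specialized modules).
        self.shape_encoders = nn.ModuleList(
            nn.Linear(image_dim, shape_dim) for _ in range(n_categories))
        self.decoders = nn.ModuleList(
            nn.Linear(shape_dim + view_dim, image_dim) for _ in range(n_categories))
        # A single view encoder shared across categories (view is category-agnostic).
        self.view_encoder = nn.Linear(image_dim, view_dim)
        # Categorical assignment of a whole group to one of the K modules.
        self.category_head = nn.Linear(image_dim, n_categories)

    def group_loss(self, group):
        # group: (G, image_dim) tensor holding G views of the same object.
        cat_probs = F.softmax(self.category_head(group).mean(dim=0), dim=0)  # (K,)
        views = self.view_encoder(group)                                     # (G, view_dim)
        recon_losses = []
        for k in range(self.n_categories):
            # Shape is shared within the group, so average the per-view codes.
            shape_k = self.shape_encoders[k](group).mean(dim=0, keepdim=True)
            shape_k = shape_k.expand(group.size(0), -1)                      # (G, shape_dim)
            recon_k = self.decoders[k](torch.cat([shape_k, views], dim=1))
            recon_losses.append(F.mse_loss(recon_k, group))
        recon = torch.stack(recon_losses)                                    # (K,)
        # Expected reconstruction loss under the categorical assignment, plus a
        # small entropy bonus to discourage collapsing onto a single module.
        entropy = -(cat_probs * (cat_probs + 1e-8).log()).sum()
        return (cat_probs * recon).sum() - 0.01 * entropy


# Usage on one weakly supervised group: five views of one object.
model = CigmoSketch()
group = torch.rand(5, 784)
loss = model.group_loss(group)
loss.backward()
```

In this sketch, averaging the shape code within a group and choosing the category module via an expected reconstruction loss are one plausible reading of "group-based weakly supervised learning"; the actual CIGMO objective is the variational one defined in the paper.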
Related papers
- Towards Category Unification of 3D Single Object Tracking on Point Clouds [10.64650098374183]
Category-specific models have proven valuable in 3D single object tracking (SOT), in both Siamese and motion-centric paradigms.
This paper first introduces unified models that can simultaneously track objects across all categories using a single network with shared model parameters.
arXiv Detail & Related papers (2024-01-20T10:38:28Z)
- Category-level Shape Estimation for Densely Cluttered Objects [94.64287790278887]
We propose a category-level shape estimation method for densely cluttered objects.
Our framework partitions each object in the clutter via multi-view visual information fusion.
Experiments in the simulated environment and real world show that our method achieves high shape estimation accuracy.
arXiv Detail & Related papers (2023-02-23T13:00:17Z)
- Geo-SIC: Learning Deformable Geometric Shapes in Deep Image Classifiers [8.781861951759948]
This paper presents Geo-SIC, the first deep learning model to learn deformable shapes in a deformation space for improved image classification performance.
We introduce a newly designed framework that simultaneously derives features from both image and latent shape spaces with large intra-class variations.
We develop a boosted classification network equipped with unsupervised learning of geometric shape representations.
arXiv Detail & Related papers (2022-10-25T01:55:17Z)
- Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency [59.427074701985795]
Single-view reconstruction typically relies on viewpoint annotations, silhouettes, the absence of background, multiple views of the same instance, a template shape, or symmetry.
We avoid all of these supervisions and hypotheses by explicitly leveraging the consistency between images of different object instances.
Our main contributions are two approaches to leverage cross-instance consistency: (i) progressive conditioning, a training strategy that gradually specializes the model from category to instances in a curriculum-learning fashion; (ii) swap reconstruction, a loss enforcing consistency between instances having similar shape or texture (a rough sketch of the swap idea follows this entry).
arXiv Detail & Related papers (2022-04-21T17:47:35Z)
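The swap-reconstruction idea mentioned in the entry above can be illustrated with a short, hedged sketch. This is not the authors' code: `encode` and `decode` are hypothetical callables standing in for a factored autoencoder that splits an image into shape and texture codes, and the nearest-neighbor pairing in shape-code space is one plausible way to pick "instances having similar shape".

```python
# Hedged sketch of a swap-reconstruction loss (illustrative, not the paper's code).
# Assumes encode(images) -> (shape, texture) codes and decode(shape, texture) -> images.
import torch
import torch.nn.functional as F

def swap_reconstruction_loss(images, encode, decode):
    """images: (B, C, H, W) batch of different object instances."""
    shape, texture = encode(images)              # (B, Ds), (B, Dt)
    with torch.no_grad():
        # Pair each instance with its nearest neighbor in shape-code space.
        dist = torch.cdist(shape, shape)         # (B, B)
        dist.fill_diagonal_(float("inf"))        # exclude trivial self-pairs
        neighbor = dist.argmin(dim=1)            # (B,)
    # Swap: decode each instance from its neighbor's shape code and its own
    # texture code; instances with similar shapes should still reconstruct well.
    swapped = decode(shape[neighbor], texture)
    return F.mse_loss(swapped, images)
```

Swapping texture codes analogously, or restricting swaps to sufficiently close pairs, are design choices not specified in the summary above.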
- Multi-Category Mesh Reconstruction From Image Collections [90.24365811344987]
We present an alternative approach that infers the textured mesh of objects by combining a series of deformable 3D models with a set of instance-specific deformations, poses, and textures.
Our method is trained with images of multiple object categories using only foreground masks and rough camera poses as supervision.
Experiments show that the proposed framework can distinguish between different object categories and learn category-specific shape priors in an unsupervised manner.
arXiv Detail & Related papers (2021-10-21T16:32:31Z)
- PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations [75.42959184226702]
We present a new mid-level patch-based surface representation for object-agnostic training.
We show several applications of our new representation, including shape and partial point cloud completion.
arXiv Detail & Related papers (2020-08-04T15:34:46Z)
- Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation [71.59275788106622]
We propose to learn the underlying class-agnostic commonalities that can be generalized from mask-annotated categories to novel categories.
Our model significantly outperforms the state-of-the-art methods in both the partially supervised and few-shot settings for instance segmentation on the COCO dataset.
arXiv Detail & Related papers (2020-07-24T07:23:44Z)
- Self-supervised Single-view 3D Reconstruction via Semantic Consistency [142.71430568330172]
We learn a self-supervised, single-view 3D reconstruction model that predicts the shape, texture and camera pose of a target object.
The proposed method does not necessitate 3D supervision, manually annotated keypoints, multi-view images of an object or a prior 3D template.
arXiv Detail & Related papers (2020-03-13T20:29:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.