Investigating Image Manifolds of 3D Objects: Learning, Shape Analysis, and Comparisons
- URL: http://arxiv.org/abs/2503.06773v1
- Date: Sun, 09 Mar 2025 21:00:33 GMT
- Title: Investigating Image Manifolds of 3D Objects: Learning, Shape Analysis, and Comparisons
- Authors: Benjamin Beaudett, Shenyuan Liang, Anuj Srivastava
- Abstract summary: Despite the high dimensionality of images, the sets of images of 3D objects have long been hypothesized to form low-dimensional manifolds. This paper revisits a classical problem of manifold learning from a novel geometrical perspective. The geometries of image manifolds can be exploited to simplify vision and image processing tasks, to predict performance, and to provide insights into learning methods.
- Score: 9.326260051834822
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite high-dimensionality of images, the sets of images of 3D objects have long been hypothesized to form low-dimensional manifolds. What is the nature of such manifolds? How do they differ across objects and object classes? Answering these questions can provide key insights in explaining and advancing success of machine learning algorithms in computer vision. This paper investigates dual tasks -- learning and analyzing shapes of image manifolds -- by revisiting a classical problem of manifold learning but from a novel geometrical perspective. It uses geometry-preserving transformations to map the pose image manifolds, sets of images formed by rotating 3D objects, to low-dimensional latent spaces. The pose manifolds of different objects in latent spaces are found to be nonlinear, smooth manifolds. The paper then compares shapes of these manifolds for different objects using Kendall's shape analysis, modulo rigid motions and global scaling, and clusters objects according to these shape metrics. Interestingly, pose manifolds for objects from the same classes are frequently clustered together. The geometries of image manifolds can be exploited to simplify vision and image processing tasks, to predict performances, and to provide insights into learning methods.
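The comparison step mentioned in the abstract (Kendall's shape analysis, modulo rigid motions and global scaling) can be illustrated with a minimal sketch. It is not the authors' code: the function names `preshape` and `kendall_distance` are hypothetical, and the sketch assumes each pose manifold has been sampled at the same k poses, so that row i of one configuration corresponds to row i of the other. Under that assumption, the shape distance is the Procrustes (geodesic) distance between centered, unit-norm configurations after optimal rotation.

```python
import numpy as np


def preshape(X):
    """Remove translation and global scale from a (k, m) point configuration."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0, keepdims=True)   # remove translation
    norm = np.linalg.norm(Xc)                # Frobenius norm
    if norm == 0.0:
        raise ValueError("configuration has all points coincident")
    return Xc / norm                         # remove global scale


def kendall_distance(X, Y):
    """Geodesic shape distance between two (k, m) configurations.

    Assumes rows of X and Y are in correspondence (hypothetical setup:
    the same sampled poses for both objects). Rotations only; reflections
    are excluded.
    """
    A, B = preshape(X), preshape(Y)
    U, s, Vt = np.linalg.svd(A.T @ B)
    # Restrict the optimal alignment to proper rotations: if the best
    # orthogonal alignment is a reflection, flip the smallest singular value.
    if np.linalg.det(U @ Vt) < 0:
        s = s.copy()
        s[-1] = -s[-1]
    cos_d = np.clip(s.sum(), -1.0, 1.0)      # guard against round-off
    return float(np.arccos(cos_d))


if __name__ == "__main__":
    t = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
    circle = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)
    line = np.stack([t, np.zeros_like(t), np.zeros_like(t)], axis=1)
    print("circle vs. line:", kendall_distance(circle, line))
```

A matrix of such pairwise distances could then be passed to any standard clustering routine; which clustering method the paper uses is not stated in this listing.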
Related papers
- Learning Pose Image Manifolds Using Geometry-Preserving GANs and Elasticae [13.202747831999414]
Geometric Style-GAN (Geom-SGAN) maps images to low-dimensional latent representations.
Euler's elasticae smoothly interpolate between directed points (points + tangent directions) in the low-dimensional latent space.
arXiv Detail & Related papers (2023-05-17T18:45:56Z)
- Geo-SIC: Learning Deformable Geometric Shapes in Deep Image Classifiers [8.781861951759948]
This paper presents Geo-SIC, the first deep learning model to learn deformable shapes in a deformation space for an improved performance of image classification.
We introduce a newly designed framework that (i) simultaneously derives features from both image and latent shape spaces with large intra-class variations.
We develop a boosted classification network, equipped with unsupervised learning of geometric shape representations.
arXiv Detail & Related papers (2022-10-25T01:55:17Z)
- Shadows Shed Light on 3D Objects [23.14510850163136]
We create a differentiable image formation model that allows us to infer the 3D shape of an object, its pose, and the position of a light source.
Our approach is robust to real-world images where the ground-truth shadow mask is unknown.
arXiv Detail & Related papers (2022-06-17T19:58:11Z)
- Discovering 3D Parts from Image Collections [98.16987919686709]
We tackle the problem of 3D part discovery from only 2D image collections.
Instead of relying on manually annotated parts for supervision, we propose a self-supervised approach.
Our key insight is to learn a novel part shape prior that allows each part to fit an object shape faithfully while constrained to have simple geometry.
arXiv Detail & Related papers (2021-07-28T20:29:16Z)
- Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation [44.8872454995923]
We present a novel approach for scalable 6D pose estimation, by self-supervised learning on synthetic data of multiple objects using a single autoencoder.
We test our method on two multi-object benchmarks with real data, T-LESS and NOCS REAL275, and show it outperforms existing RGB-based methods in terms of pose estimation accuracy and generalization.
arXiv Detail & Related papers (2021-07-27T01:55:30Z)
- Sparse Pose Trajectory Completion [87.31270669154452]
We propose a method to learn, even using a dataset where objects appear only in sparsely sampled views.
This is achieved with a cross-modal pose trajectory transfer mechanism.
Our method is evaluated on the Pix3D and ShapeNet datasets.
arXiv Detail & Related papers (2021-05-01T00:07:21Z)
- Continuous Surface Embeddings [76.86259029442624]
We focus on the task of learning and representing dense correspondences in deformable object categories.
We propose a new, learnable image-based representation of dense correspondences.
We demonstrate that the proposed approach performs on par with or better than state-of-the-art methods for dense pose estimation for humans.
arXiv Detail & Related papers (2020-11-24T22:52:15Z)
- Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction [79.98689027127855]
We propose a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects.
Our method builds in a novel way on concepts from parametric deformation models, non-parametric 3D reconstruction, and canonical embeddings.
It achieves state-of-the-art results in dense 3D reconstruction on public in-the-wild datasets of faces, cars, and birds.
arXiv Detail & Related papers (2020-08-28T15:44:05Z)
- Learning Pose-invariant 3D Object Reconstruction from Single-view Images [61.98279201609436]
In this paper, we explore a more realistic setup of learning 3D shape from only single-view images.
The major difficulty lies in the insufficient constraints that single-view images can provide.
We propose an effective adversarial domain confusion method to learn a pose-disentangled compact shape space.
arXiv Detail & Related papers (2020-04-03T02:47:35Z)
- Self-supervised Single-view 3D Reconstruction via Semantic Consistency [142.71430568330172]
We learn a self-supervised, single-view 3D reconstruction model that predicts the shape, texture and camera pose of a target object.
The proposed method does not require 3D supervision, manually annotated keypoints, multi-view images of an object, or a prior 3D template.
arXiv Detail & Related papers (2020-03-13T20:29:01Z)