Iterative Superquadric Recomposition of 3D Objects from Multiple Views
- URL: http://arxiv.org/abs/2309.02102v1
- Date: Tue, 5 Sep 2023 10:21:37 GMT
- Title: Iterative Superquadric Recomposition of 3D Objects from Multiple Views
- Authors: Stephan Alaniz, Massimiliano Mancini, Zeynep Akata
- Abstract summary: We propose a framework, ISCO, to recompose an object using 3D superquadrics as semantic parts directly from 2D views.
Our framework iteratively adds new superquadrics wherever the reconstruction error is high.
It provides consistently more accurate 3D reconstructions, even from images in the wild.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans are good at recomposing novel objects, i.e. they can identify
commonalities between unknown objects from general structure to finer detail,
an ability difficult to replicate by machines. We propose a framework, ISCO, to
recompose an object using 3D superquadrics as semantic parts directly from 2D
views without training a model that uses 3D supervision. To achieve this, we
optimize the superquadric parameters that compose a specific instance of the
object, comparing its rendered silhouettes against the target 2D image silhouettes. Our ISCO
framework iteratively adds new superquadrics wherever the reconstruction error
is high, abstracting first coarse regions and then finer details of the target
object. With this simple coarse-to-fine inductive bias, ISCO provides
consistent superquadrics for related object parts, despite not having any
semantic supervision. Since ISCO does not train any neural network, it is also
inherently robust to out-of-distribution objects. Experiments show that,
compared to recent single-instance superquadric reconstruction approaches,
ISCO provides consistently more accurate 3D reconstructions, even from images
in the wild. Code available at https://github.com/ExplainableML/ISCO .
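
The abstract describes a concrete loop: render the current superquadric assembly, find where the silhouette error is high, add a new superquadric there, and refine. The toy sketch below is only a reading of the abstract, not the authors' implementation: it mimics that loop with an axis-aligned occupancy-grid "renderer" and derivative-free refinement, whereas ISCO itself optimizes posed superquadrics with differentiable rendering across multiple views.

```python
# Toy sketch of ISCO's coarse-to-fine loop (based only on the abstract):
# superquadrics are added one at a time wherever the silhouette error is
# high, then refined against the target silhouette. The grid "renderer",
# Nelder-Mead refinement, and error-centroid initialization are stand-ins.
import numpy as np
from scipy.optimize import minimize

AXIS = np.linspace(-1.0, 1.0, 32)
GRID = np.stack(np.meshgrid(AXIS, AXIS, AXIS, indexing="ij"), axis=-1)

def inside(points, center, scale, eps1, eps2):
    """Standard superquadric inside-outside function; values < 1 lie inside."""
    eps1, eps2 = np.clip([eps1, eps2], 0.2, 2.0)   # keep exponents sane
    scale = np.abs(scale) + 1e-3                    # avoid degenerate axes
    x, y, z = np.moveaxis(np.abs((points - center) / scale) + 1e-8, -1, 0)
    return (x ** (2 / eps2) + y ** (2 / eps2)) ** (eps2 / eps1) + z ** (2 / eps1)

def silhouette(parts):
    """Orthographic stand-in renderer: project the union occupancy along z."""
    occ = np.zeros(GRID.shape[:3], dtype=bool)
    for p in parts:
        occ |= inside(GRID, p[:3], p[3:6], p[6], p[7]) < 1.0
    return occ.any(axis=2)

def fit(target, max_parts=4):
    """Iteratively add and refine superquadrics where the error is high."""
    parts = []
    for _ in range(max_parts):
        err = target ^ silhouette(parts)            # high-error pixels
        if not err.any():
            break
        i, j = np.argwhere(err).mean(axis=0)        # coarse init at error blob
        p0 = np.array([AXIS[int(i)], AXIS[int(j)], 0.0,
                       0.3, 0.3, 0.3, 1.0, 1.0])    # center, scales, eps1, eps2
        loss = lambda p: float((target ^ silhouette(parts + [p])).sum())
        parts.append(minimize(loss, p0, method="Nelder-Mead").x)
    return parts

# Example: fit parts to a circular target silhouette.
target = (GRID[..., 0] ** 2 + GRID[..., 1] ** 2)[:, :, 0] < 0.5
print(len(fit(target)), "superquadrics fitted")
```

Only the `inside` function is standard (the superquadric implicit surface test); grid resolution, the initialization heuristic, and the optimizer are illustrative choices.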
Related papers
- Anything-3D: Towards Single-view Anything Reconstruction in the Wild
We introduce Anything-3D, a framework that combines a series of visual-language models with the Segment-Anything object segmentation model.
Our approach employs a BLIP model to generate textual descriptions, utilizes the Segment-Anything model to extract the object of interest, and leverages a text-to-image diffusion model to lift the object into a neural radiance field.
arXiv Detail & Related papers (2023-04-19T16:39:51Z)
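
Taken at face value, the summary above describes a three-stage pipeline. The sketch below merely wires those stages together as opaque callables; the function and argument names are hypothetical, since the summary names the models (BLIP, Segment-Anything, a text-to-image diffusion model) without exposing any concrete API.

```python
# Hypothetical wiring of the three-stage Anything-3D pipeline described
# above; every name here is illustrative, not from the paper's code.
from typing import Any, Callable

def anything_3d(image: Any,
                caption: Callable[[Any], str],      # e.g. a BLIP wrapper
                segment: Callable[[Any], Any],      # e.g. a SAM wrapper
                lift: Callable[[Any, str], Any]):   # text-to-3D diffusion lift
    """Caption the image, segment the object, lift it to a radiance field."""
    text = caption(image)    # 1. textual description of the image
    obj = segment(image)     # 2. extract the object of interest
    return lift(obj, text)   # 3. text-conditioned lifting to a NeRF
```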
- Monocular 3D Object Reconstruction with GAN Inversion
MeshInversion is a novel framework to improve the reconstruction of textured 3D meshes.
It exploits the generative prior of a 3D GAN pre-trained for 3D textured mesh synthesis.
Our framework obtains faithful 3D reconstructions with consistent geometry and texture across both observed and unobserved parts.
arXiv Detail & Related papers (2022-07-20T17:47:22Z)
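
MeshInversion's core mechanism, as summarized above, is GAN inversion: a pretrained generator is frozen and only the latent code is optimized to reproduce the observation. Below is a generic sketch of that loop; the stand-in `Generator` is an assumption for illustration, whereas the paper inverts a pretrained textured 3D mesh GAN through a differentiable renderer, neither of which is shown here.

```python
# Generic GAN-inversion loop in the spirit of the MeshInversion summary.
import torch

class Generator(torch.nn.Module):
    """Stand-in for a pretrained generator (frozen during inversion)."""
    def __init__(self, z_dim=64, out_hw=32):
        super().__init__()
        self.out_hw = out_hw
        self.net = torch.nn.Sequential(
            torch.nn.Linear(z_dim, 256), torch.nn.ReLU(),
            torch.nn.Linear(256, out_hw * out_hw), torch.nn.Sigmoid())

    def forward(self, z):
        # Pretend the output is a rendered view of the generated 3D asset.
        return self.net(z).view(-1, self.out_hw, self.out_hw)

def invert(generator, target, z_dim=64, steps=200, lr=0.05):
    """Optimize only the latent code against the observed image."""
    for p in generator.parameters():
        p.requires_grad_(False)                     # generator stays frozen
    z = torch.randn(1, z_dim, requires_grad=True)   # latent code to recover
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(generator(z), target)
        loss.backward()
        opt.step()
    return z.detach()

# Example: invert a random "observation" with the stand-in generator.
z_hat = invert(Generator(), target=torch.rand(1, 32, 32))
```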
- DensePose 3D: Lifting Canonical Surface Maps of Articulated Objects to the Third Dimension
We contribute DensePose 3D, a method that can learn such reconstructions in a weakly supervised fashion from 2D image annotations only.
Because it does not require 3D scans, DensePose 3D can be used for learning a wide range of articulated categories such as different animal species.
We show significant improvements compared to state-of-the-art non-rigid structure-from-motion baselines on both synthetic and real data on categories of humans and animals.
arXiv Detail & Related papers (2021-08-31T18:33:55Z)
- AutoSweep: Recovering 3D Editable Objects from a Single Photograph
We aim to recover 3D objects with semantic parts that can be directly edited.
Our work attempts to recover two types of primitive-shaped objects, namely generalized cuboids and generalized cylinders.
Our algorithm can recover high quality 3D models and outperforms existing methods in both instance segmentation and 3D reconstruction.
arXiv Detail & Related papers (2020-05-27T12:16:24Z)
- CoReNet: Coherent 3D scene reconstruction from a single RGB image
We build on advances in deep learning to reconstruct the shape of a single object given only one RGB image as input.
We propose three extensions: (1) ray-traced skip connections that propagate local 2D information to the output 3D volume in a physically correct manner; (2) a hybrid 3D volume representation that enables building translation equivariant models; and (3) a reconstruction loss tailored to capture overall object geometry.
We reconstruct all objects jointly in one pass, producing a coherent reconstruction, where all objects live in a single consistent 3D coordinate frame relative to the camera and they do not intersect in 3D space.
arXiv Detail & Related papers (2020-04-27T17:53:07Z)
- Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image
We propose a novel formulation that jointly recovers the geometry of a 3D object as a set of primitives.
Our model recovers the higher-level structural decomposition of various objects in the form of a binary tree of primitives.
Our experiments on the ShapeNet and D-FAUST datasets demonstrate that considering the organization of parts indeed facilitates reasoning about 3D geometry.
arXiv Detail & Related papers (2020-04-02T17:58:05Z)
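
The "binary tree of primitives" in the last summary maps naturally onto a recursive data structure. A minimal sketch, assuming nothing beyond the summary itself (the field names are illustrative, not the paper's data layout):

```python
# Minimal illustration of a binary tree of shape primitives.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class PrimitiveNode:
    params: Tuple[float, ...]                   # parameters of this primitive
    left: Optional["PrimitiveNode"] = None      # finer-grained child part
    right: Optional["PrimitiveNode"] = None     # finer-grained child part

    def leaves(self) -> List["PrimitiveNode"]:
        """Leaf primitives form the finest-level part decomposition."""
        if self.left is None and self.right is None:
            return [self]
        children = [c for c in (self.left, self.right) if c is not None]
        return [leaf for c in children for leaf in c.leaves()]
```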