LIST: Learning Implicitly from Spatial Transformers for Single-View 3D
Reconstruction
- URL: http://arxiv.org/abs/2307.12194v1
- Date: Sun, 23 Jul 2023 01:01:27 GMT
- Title: LIST: Learning Implicitly from Spatial Transformers for Single-View 3D
Reconstruction
- Authors: Mohammad Samiul Arshad and William J. Beksi
- Abstract summary: LIST is a novel neural architecture that leverages local and global image features to reconstruct the geometric and topological structure of a 3D object from a single image.
We show the superiority of our model in reconstructing 3D objects from both synthetic and real-world images against the state of the art.
- Score: 5.107705550575662
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate reconstruction of both the geometric and topological details of a 3D
object from a single 2D image embodies a fundamental challenge in computer
vision. Existing explicit/implicit solutions to this problem struggle to
recover self-occluded geometry and/or faithfully reconstruct topological shape
structures. To resolve this dilemma, we introduce LIST, a novel neural
architecture that leverages local and global image features to accurately
reconstruct the geometric and topological structure of a 3D object from a
single image. We utilize global 2D features to predict a coarse shape of the
target object and then use it as a base for higher-resolution reconstruction.
By leveraging both local 2D features from the image and 3D features from the
coarse prediction, we can predict the signed distance between an arbitrary
point and the target surface via an implicit predictor with great accuracy.
Furthermore, our model does not require camera estimation or pixel alignment,
and its reconstruction is not influenced by the input-view direction.
Through qualitative and quantitative analysis, we show the superiority of our
model in reconstructing 3D objects from both synthetic and real-world images
against the state of the art.
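To make the two-stage idea above concrete, here is a minimal PyTorch sketch of a coarse-then-implicit pipeline: a global image feature decodes to a coarse voxel grid, and an implicit predictor maps a query point, its local 2D feature, and a feature trilinearly sampled from the coarse grid to a signed distance. The module names, dimensions, and sampling step are illustrative assumptions, not the authors' implementation (LIST obtains local features via spatial transformers rather than camera-based pixel alignment).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoarseShapePredictor(nn.Module):
    """Stage 1: decode a global image feature into a coarse voxel grid."""
    def __init__(self, feat_dim=512, grid=32):
        super().__init__()
        self.grid = grid
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, grid ** 3),
        )

    def forward(self, global_feat):                    # (B, feat_dim)
        logits = self.decoder(global_feat)
        return logits.view(-1, 1, self.grid, self.grid, self.grid)

class ImplicitSDFPredictor(nn.Module):
    """Stage 2: map a 3D query point, its local 2D feature, and a feature
    sampled from the coarse grid to a signed distance to the surface."""
    def __init__(self, local_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + local_dim + 1, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, points, local_feat, coarse_grid):
        # points: (B, N, 3) in [-1, 1]^3; local_feat: (B, N, local_dim);
        # coarse_grid: (B, 1, G, G, G) occupancy logits from stage 1.
        B, N, _ = points.shape
        vox = F.grid_sample(coarse_grid, points.view(B, N, 1, 1, 3),
                            align_corners=True)        # (B, 1, N, 1, 1)
        vox = vox.view(B, 1, N).permute(0, 2, 1)       # (B, N, 1)
        return self.mlp(torch.cat([points, local_feat, vox], dim=-1))

# Toy forward pass with random tensors (shapes are illustrative only):
coarse = CoarseShapePredictor()(torch.randn(2, 512))
sdf = ImplicitSDFPredictor()(torch.randn(2, 64, 3).clamp(-1, 1),
                             torch.randn(2, 64, 256), torch.sigmoid(coarse))
print(sdf.shape)  # torch.Size([2, 64, 1])
```

Conditioning the implicit predictor on the stage-one grid via `grid_sample` is one simple way to realize "3D features from the coarse prediction"; the paper's actual feature extraction may differ.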
Related papers
- GeoGS3D: Single-view 3D Reconstruction via Geometric-aware Diffusion Model and Gaussian Splatting [81.03553265684184]
We introduce GeoGS3D, a framework for reconstructing detailed 3D objects from single-view images.
We propose a novel metric, Gaussian Divergence Significance (GDS), to prune unnecessary operations during optimization.
Experiments demonstrate that GeoGS3D generates images with high consistency across views and reconstructs high-quality 3D objects.
arXiv Detail & Related papers (2024-03-15T12:24:36Z)
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles single-view 3D mesh reconstruction with a focus on model generalization to unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- 3D Magic Mirror: Clothing Reconstruction from a Single Image via a Causal Perspective [96.65476492200648]
This research studies a self-supervised 3D clothing reconstruction method.
It recovers the geometric shape and texture of human clothing from a single 2D image.
arXiv Detail & Related papers (2022-04-27T17:46:55Z)
- Capturing Shape Information with Multi-Scale Topological Loss Terms for 3D Reconstruction [7.323706635751351]
We propose to complement geometrical shape information by including multi-scale topological features, such as connected components, cycles, and voids, in the reconstruction loss.
Our method calculates topological features from 3D volumetric data based on cubical complexes and uses an optimal transport distance to guide the reconstruction process.
We demonstrate the utility of our loss by incorporating it into SHAPR, a model for predicting the 3D cell shape of individual cells based on 2D microscopy images.
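As a rough illustration of this kind of loss, the sketch below builds cubical complexes with the GUDHI library and compares persistence diagrams with its optimal-transport (Wasserstein) distance, which requires the optional POT dependency. It ignores SHAPR's integration details and, as written in plain NumPy/GUDHI, is not differentiable end to end.

```python
import numpy as np
import gudhi
from gudhi.wasserstein import wasserstein_distance  # needs the optional POT package

def persistence_diagrams(volume, dims=(0, 1, 2)):
    """Persistence diagrams of a 3D volume's cubical complex, one per
    homology dimension: 0 = connected components, 1 = cycles, 2 = voids."""
    cc = gudhi.CubicalComplex(top_dimensional_cells=volume)
    cc.compute_persistence()
    out = []
    for d in dims:
        dgm = cc.persistence_intervals_in_dimension(d)
        if len(dgm):
            dgm = dgm[np.isfinite(dgm[:, 1])]  # drop essential (infinite) bars
        out.append(dgm)
    return out

def topological_loss(pred_vol, target_vol, dims=(0, 1, 2)):
    """Sum of 1-Wasserstein distances between predicted and target diagrams,
    one optimal-transport term per homology dimension."""
    return sum(
        wasserstein_distance(p, t, order=1.0)
        for p, t in zip(persistence_diagrams(pred_vol, dims),
                        persistence_diagrams(target_vol, dims))
    )

# Toy example: a solid ball on a 16^3 grid versus a noisy copy of itself.
x, y, z = np.mgrid[-1:1:16j, -1:1:16j, -1:1:16j]
ball = (x**2 + y**2 + z**2 < 0.5).astype(float)
print(topological_loss(ball, ball + 0.1 * np.random.rand(16, 16, 16)))
```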
arXiv Detail & Related papers (2022-03-03T13:18:21Z)
- Learnable Triangulation for Deep Learning-based 3D Reconstruction of Objects of Arbitrary Topology from Single RGB Images [12.693545159861857]
We propose a novel deep reinforcement learning-based approach for 3D object reconstruction from monocular images.
The proposed method outperforms the state-of-the-art in terms of visual quality, reconstruction accuracy, and computational time.
arXiv Detail & Related papers (2021-09-24T09:44:22Z)
- Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, we devise a principled geometric formula that relates 2D and 3D depth predictions in the detection network through projective modeling.
Our method improves the detection performance of the state-of-the-art monocular method by 2.80% on the moderate test setting, without using extra data.
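For background, one widely used projective (pinhole-camera) relation in this line of work ties an object's 2D box height to its depth; the numbers below are illustrative, and this is not claimed to be the paper's exact formula.

```python
def depth_from_projection(focal_px, obj_height_m, box_height_px):
    """Pinhole relation: an object of physical height H metres spanning
    h pixels under focal length f (in pixels) lies at depth z = f * H / h."""
    return focal_px * obj_height_m / box_height_px

# A KITTI-like focal length of ~721 px and a 1.5 m tall car spanning 60 px:
print(depth_from_projection(721.0, 1.5, 60.0))  # ~18.0 m
```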
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
- Hybrid Approach for 3D Head Reconstruction: Using Neural Networks and Visual Geometry [3.970492757288025]
We present a novel method for reconstructing 3D heads from one or more images using a hybrid approach based on deep learning and geometric techniques.
We propose an encoder-decoder network based on the U-net architecture and trained on synthetic data only.
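A generic one-level U-net encoder-decoder in PyTorch, shown only to ground the architecture named above; the channel widths and depth are arbitrary, and none of this is taken from the paper.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """Encode, bottleneck, then decode with a skip connection (U-net style)."""
    def __init__(self, cin=3, cout=1, width=32):
        super().__init__()
        self.enc = conv_block(cin, width)
        self.down = nn.MaxPool2d(2)
        self.mid = conv_block(width, width * 2)
        self.up = nn.ConvTranspose2d(width * 2, width, 2, stride=2)
        self.dec = conv_block(width * 2, width)
        self.head = nn.Conv2d(width, cout, 1)

    def forward(self, x):                  # x: (B, cin, H, W), H and W even
        e = self.enc(x)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([self.up(m), e], dim=1))  # skip connection
        return self.head(d)

print(TinyUNet()(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 1, 64, 64])
```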
arXiv Detail & Related papers (2021-04-28T11:31:35Z)
- Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction [33.95791350070165]
Inferring the 3D structure of a generic object from a 2D image is a long-standing objective of computer vision.
We take an alternative, semi-supervised approach: given a 2D image of a generic object, we decompose it into latent representations of category, shape, and albedo.
We show that the complete shape and albedo modeling enables us to leverage real 2D images in both modeling and model fitting.
arXiv Detail & Related papers (2021-04-02T02:39:29Z)
- Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction [79.98689027127855]
We propose a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects.
Our method builds in a novel way on concepts from parametric deformation models, non-parametric 3D reconstruction, and canonical embeddings.
It achieves state-of-the-art results in dense 3D reconstruction on public in-the-wild datasets of faces, cars, and birds.
arXiv Detail & Related papers (2020-08-28T15:44:05Z)
- Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image [102.44347847154867]
We propose a novel formulation that jointly recovers the geometry of a 3D object as a set of primitives.
Our model recovers the higher level structural decomposition of various objects in the form of a binary tree of primitives.
Our experiments on the ShapeNet and D-FAUST datasets demonstrate that considering the organization of parts indeed facilitates reasoning about 3D geometry.
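A toy data structure for such a binary tree of primitives (the node fields are placeholders; the paper's actual primitives and their parameters are not reproduced here):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class PrimitiveNode:
    """Binary tree of shape primitives: the root covers the whole object and
    each split refines its parent's region into two finer parts."""
    center: Tuple[float, float, float]   # placeholder pose
    size: Tuple[float, float, float]     # placeholder extent
    left: Optional["PrimitiveNode"] = None
    right: Optional["PrimitiveNode"] = None

def leaves(node: PrimitiveNode):
    """The leaf primitives form the finest-level part decomposition."""
    if node.left is None and node.right is None:
        return [node]
    return leaves(node.left) + leaves(node.right)

# A chair-like toy: a root split into "seat" and "back" primitives.
root = PrimitiveNode((0, 0, 0), (1, 1, 1),
                     left=PrimitiveNode((0, -0.3, 0), (1, 0.4, 1)),
                     right=PrimitiveNode((0, 0.4, 0.4), (1, 0.6, 0.2)))
print(len(leaves(root)))  # 2
```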
arXiv Detail & Related papers (2020-04-02T17:58:05Z)