D$^2$IM-Net: Learning Detail Disentangled Implicit Fields from Single Images
- URL: http://arxiv.org/abs/2012.06650v2
- Date: Thu, 17 Dec 2020 13:16:00 GMT
- Title: D$^2$IM-Net: Learning Detail Disentangled Implicit Fields from Single Images
- Authors: Manyi Li, Hao Zhang
- Abstract summary: We present the first single-view 3D reconstruction network aimed at recovering geometric details from an input image.
Our key idea is to train the network to learn a detail disentangled reconstruction consisting of two functions.
- Score: 6.121310352120004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present the first single-view 3D reconstruction network aimed at
recovering geometric details from an input image which encompass both
topological shape structures and surface features. Our key idea is to train the
network to learn a detail disentangled reconstruction consisting of two
functions, one implicit field representing the coarse 3D shape and the other
capturing the details. Given an input image, our network, coined D$^2$IM-Net,
encodes it into global and local features which are respectively fed into two
decoders. The base decoder uses the global features to reconstruct a coarse
implicit field, while the detail decoder reconstructs, from the local features,
two displacement maps, defined over the front and back sides of the captured
object. The final 3D reconstruction is a fusion between the base shape and the
displacement maps, with three losses enforcing the recovery of coarse shape,
overall structure, and surface details via a novel Laplacian term.
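To make the decomposition concrete, here is a minimal, hypothetical PyTorch sketch of such a detail-disentangled network: a shared encoder yields a local feature map and a pooled global code, a base decoder maps the global code plus a 3D query point to a coarse implicit value, a detail decoder predicts two displacement maps, and a Laplacian term matches displacement-map detail to image detail. All layer sizes, the grayscale target, and the exact loss form are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class D2IMSketch(nn.Module):
    """Hypothetical sketch of a detail-disentangled reconstruction net."""

    def __init__(self, feat_dim=128):
        super().__init__()
        # Shared conv encoder: keeps a local feature map, pools a global code.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Base decoder: (global code, 3D query point) -> coarse implicit value.
        self.base = nn.Sequential(
            nn.Linear(feat_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )
        # Detail decoder: local features -> 2 displacement maps (front/back).
        self.detail = nn.Conv2d(feat_dim, 2, 3, padding=1)

    def forward(self, image, points):
        local = self.encoder(image)                     # (B, C, H', W')
        global_code = local.mean(dim=(2, 3))            # (B, C)
        B, N, _ = points.shape
        g = global_code.unsqueeze(1).expand(B, N, -1)
        coarse = self.base(torch.cat([g, points], -1))  # (B, N, 1)
        disp = self.detail(local)                       # (B, 2, H', W')
        return coarse.squeeze(-1), disp

def laplacian(x):
    """3x3 Laplacian filter, applied per channel."""
    k = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]],
                     device=x.device).view(1, 1, 3, 3)
    c = x.shape[1]
    return F.conv2d(x, k.expand(c, 1, 3, 3), padding=1, groups=c)

def laplacian_loss(front_disp, gray_image):
    """Assumed form of the detail loss: match Laplacians of the front
    displacement map and the (resized) grayscale input image."""
    target = F.interpolate(gray_image, size=front_disp.shape[-2:],
                           mode='bilinear', align_corners=False)
    return F.l1_loss(laplacian(front_disp), laplacian(target))
```

A forward pass would be `coarse, disp = model(img, pts)` with `img` of shape (B, 3, H, W) and `pts` of shape (B, N, 3); `disp[:, :1]` and `disp[:, 1:]` play the roles of the front and back displacement maps, with the front map feeding `laplacian_loss`.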
Related papers
- T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image [84.08705684778666]
We propose a novel Transformer-boosted architecture, named T-Pixel2Mesh, inspired by the coarse-to-fine approach of P2M.
Specifically, we use a global Transformer to control the holistic shape and a local Transformer to refine the local geometry details.
Our experiments on ShapeNet demonstrate state-of-the-art performance, while results on real-world data show its generalization capability.
arXiv Detail & Related papers (2024-03-20T15:14:22Z)
- LIST: Learning Implicitly from Spatial Transformers for Single-View 3D Reconstruction [5.107705550575662]
LIST is a novel neural architecture that leverages local and global image features to reconstruct the geometric and topological structure of a 3D object from a single image.
We show the superiority of our model over the state of the art in reconstructing 3D objects from both synthetic and real-world images.
arXiv Detail & Related papers (2023-07-23T01:01:27Z)
- Pixel-Aligned Non-parametric Hand Mesh Reconstruction [16.62199923065314]
Non-parametric mesh reconstruction has recently shown significant progress in 3D hand and body applications.
In this paper, we seek to establish and exploit a pixel-aligned mapping from image to mesh with a simple and compact architecture.
We propose an end-to-end pipeline for hand mesh recovery tasks which consists of three phases.
arXiv Detail & Related papers (2022-10-17T15:53:18Z)
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles single-view 3D mesh reconstruction with a focus on model generalization to unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- 3D Shape Reconstruction from 2D Images with Disentangled Attribute Flow [61.62796058294777]
Reconstructing 3D shape from a single 2D image is a challenging task.
Most previous methods still struggle to extract semantic attributes for the 3D reconstruction task.
We propose 3DAttriFlow to disentangle and extract semantic attributes through different semantic levels in the input images.
arXiv Detail & Related papers (2022-03-29T02:03:31Z)
- Capturing Shape Information with Multi-Scale Topological Loss Terms for 3D Reconstruction [7.323706635751351]
We propose to complement geometrical shape information by including multi-scale topological features, such as connected components, cycles, and voids, in the reconstruction loss.
Our method calculates topological features from 3D volumetric data based on cubical complexes and uses an optimal transport distance to guide the reconstruction process.
We demonstrate the utility of our loss by incorporating it into SHAPR, a model for predicting the 3D cell shape of individual cells based on 2D microscopy images.
arXiv Detail & Related papers (2022-03-03T13:18:21Z)
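As a rough illustration of the preceding entry's recipe (persistence features from cubical complexes compared with an optimal transport distance), the sketch below uses the GUDHI library; the exact filtration, homology dimensions, and weighting used in the SHAPR loss are assumptions, not taken from the paper.

```python
import numpy as np
import gudhi
from gudhi.wasserstein import wasserstein_distance  # needs the POT package

def finite_diagram(volume, dim):
    """Persistence diagram (finite part) of a 3D volume's cubical complex."""
    cc = gudhi.CubicalComplex(top_dimensional_cells=volume)
    cc.persistence()  # computes persistence in all dimensions
    dgm = cc.persistence_intervals_in_dimension(dim)
    # Drop essential classes with infinite death times before matching.
    return dgm[np.isfinite(dgm).all(axis=1)] if len(dgm) else dgm

def topological_loss(pred, target, dims=(0, 1, 2)):
    """Sum of 1-Wasserstein distances between diagrams, one per homology
    dimension: connected components (0), cycles (1), voids (2)."""
    return sum(
        wasserstein_distance(finite_diagram(pred, d),
                             finite_diagram(target, d),
                             order=1.0, internal_p=2.0)
        for d in dims
    )

# Toy usage with random volumes standing in for predicted / true cell shapes:
pred = np.random.rand(16, 16, 16)
target = np.random.rand(16, 16, 16)
print(topological_loss(pred, target))
```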
- Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud [50.56461318879761]
We propose the Geometry-Disentangled Attention Network (GDANet) for 3D point cloud processing.
GDANet disentangles point clouds into the contour and flat parts of 3D objects, denoted by sharp and gentle variation components respectively.
Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art results with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z)
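The core idea in the entry above, splitting a point cloud into sharp- and gentle-variation components, can be approximated with a simple graph high-pass filter. The following is an illustrative stand-in under that assumption, not GDANet's actual Geometry-Disentangle module (which learns the decomposition with attention).

```python
import numpy as np
from scipy.spatial import cKDTree

def split_sharp_gentle(points, k=16, quantile=0.7):
    """Crude geometry disentanglement: score each point by how far it sits
    from the centroid of its k nearest neighbors (a Laplacian-like,
    high-pass response), then threshold into sharp vs. gentle subsets."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k + 1)    # first neighbor is the point itself
    neigh_mean = points[idx[:, 1:]].mean(axis=1)
    score = np.linalg.norm(points - neigh_mean, axis=1)
    thresh = np.quantile(score, quantile)
    sharp = points[score >= thresh]         # contour-like, high variation
    gentle = points[score < thresh]         # flat-like, low variation
    return sharp, gentle

# Example on a random cloud:
pts = np.random.rand(2048, 3)
sharp, gentle = split_sharp_gentle(pts)
print(sharp.shape, gentle.shape)
```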
- STD-Net: Structure-preserving and Topology-adaptive Deformation Network for 3D Reconstruction from a Single Image [27.885717341244014]
3D reconstruction from a single-view image is a long-standing problem in computer vision.
In this paper, we propose a novel method called STD-Net to reconstruct 3D models using a mesh representation.
Experimental results on images from ShapeNet show that our proposed STD-Net outperforms other state-of-the-art methods at reconstructing 3D objects.
arXiv Detail & Related papers (2020-03-07T11:02:47Z)
- Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion [53.885984328273686]
Implicit Feature Networks (IF-Nets) deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data.
IF-Nets clearly outperform prior work on 3D object reconstruction on ShapeNet, and obtain significantly more accurate 3D human reconstructions.
arXiv Detail & Related papers (2020-03-03T11:14:29Z)
- Learning 3D Human Shape and Pose from Dense Body Parts [117.46290013548533]
We propose a Decompose-and-aggregate Network (DaNet) to learn 3D human shape and pose from dense correspondences of body parts.
Messages from the local streams are aggregated to enhance the robust prediction of rotation-based poses.
Our method is validated on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and 3DPW.
arXiv Detail & Related papers (2019-12-31T15:09:51Z)
- DISN: Deep Implicit Surface Network for High-quality Single-view 3D Reconstruction [24.903382114775283]
Reconstructing 3D shapes from single-view images has been a long-standing research problem.
We present DISN, a Deep Implicit Surface Network which can generate a high-quality, detail-rich 3D mesh from a 2D image.
To the best of our knowledge, DISN is the first method that consistently captures details such as holes and thin structures present in 3D shapes from single-view images.
arXiv Detail & Related papers (2019-05-26T01:58:28Z)
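DISN's key move, per the entry above, is to condition the implicit surface prediction on local image features sampled at each 3D query point's 2D projection. Below is a hypothetical PyTorch fragment of that sampling step, assuming a pinhole camera and a precomputed feature map; the tensor conventions and helper name are illustrative, not DISN's code.

```python
import torch
import torch.nn.functional as F

def sample_local_features(feat_map, points, K):
    """Project 3D query points with intrinsics K, then bilinearly sample
    the image feature map at the projected pixels (DISN-style conditioning).

    feat_map: (B, C, H, W) image features; points: (B, N, 3) in camera
    coordinates; K: (B, 3, 3) pinhole intrinsics. Shapes are assumptions.
    """
    B, C, H, W = feat_map.shape
    uvw = torch.bmm(points, K.transpose(1, 2))            # (B, N, 3)
    uv = uvw[..., :2] / uvw[..., 2:].clamp(min=1e-6)      # perspective divide
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    grid = torch.stack([uv[..., 0] / (W - 1) * 2 - 1,
                        uv[..., 1] / (H - 1) * 2 - 1], dim=-1)
    sampled = F.grid_sample(feat_map, grid.unsqueeze(2),  # (B, C, N, 1)
                            mode='bilinear', align_corners=True)
    return sampled.squeeze(-1).transpose(1, 2)            # (B, N, C)
```

An MLP conditioned on the concatenation of the query point, its sampled local feature, and a global image code would then regress the signed distance value at that point.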