Implicit Functions in Feature Space for 3D Shape Reconstruction and
Completion
- URL: http://arxiv.org/abs/2003.01456v2
- Date: Wed, 15 Apr 2020 14:47:27 GMT
- Title: Implicit Functions in Feature Space for 3D Shape Reconstruction and
Completion
- Authors: Julian Chibane, Thiemo Alldieck, Gerard Pons-Moll
- Abstract summary: Implicit Feature Networks (IF-Nets) deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data.
IF-Nets clearly outperform prior work in 3D object reconstruction in ShapeNet, and obtain significantly more accurate 3D human reconstructions.
- Score: 53.885984328273686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While many works focus on 3D reconstruction from images, in this paper, we
focus on 3D shape reconstruction and completion from a variety of 3D inputs,
which are deficient in some respect: low and high resolution voxels, sparse and
dense point clouds, complete or incomplete. Processing of such 3D inputs is an
increasingly important problem as they are the output of 3D scanners, which are
becoming more accessible, and are the intermediate output of 3D computer vision
algorithms. Recently, learned implicit functions have shown great promise as
they produce continuous reconstructions. However, we identified two limitations
in reconstruction from 3D inputs: 1) details present in the input data are not
retained, and 2) poor reconstruction of articulated humans. To solve this, we
propose Implicit Feature Networks (IF-Nets), which deliver continuous outputs,
can handle multiple topologies, and complete shapes for missing or sparse input
data, retaining the desirable properties of recent learned implicit functions;
critically, they can also retain detail when it is present in the input data
and can reconstruct articulated humans. Our work differs from prior work in two
crucial aspects. First, instead of using a single vector to encode a 3D shape,
we extract a learnable 3-dimensional multi-scale tensor of deep features, which
is aligned with the original Euclidean space embedding the shape. Second,
instead of classifying x-y-z point coordinates directly, we classify deep
features extracted from the tensor at a continuous query point. We show that
this forces our model to make decisions based on global and local shape
structure, as opposed to point coordinates, which are arbitrary under Euclidean
transformations. Experiments demonstrate that IF-Nets clearly outperform prior
work in 3D object reconstruction in ShapeNet, and obtain significantly more
accurate 3D human reconstructions.
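The query mechanism described in the abstract (sample a multi-scale feature tensor at a continuous point, then classify the sampled deep features rather than the raw x-y-z coordinates) can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation: the grid sizes, channel counts, and the two-layer decoder are arbitrary assumptions.

```python
import numpy as np

def trilinear_sample(grid, p):
    """Sample a (C, D, H, W) feature grid at a continuous point p in [0, 1]^3."""
    C, D, H, W = grid.shape
    # Map normalized coordinates into voxel space.
    x, y, z = p[0] * (W - 1), p[1] * (H - 1), p[2] * (D - 1)
    x0, y0, z0 = int(np.floor(x)), int(np.floor(y)), int(np.floor(z))
    x1, y1, z1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1), min(z0 + 1, D - 1)
    dx, dy, dz = x - x0, y - y0, z - z0
    # Blend the 8 neighboring feature vectors with trilinear weights.
    out = np.zeros(C)
    for zi, wz in ((z0, 1 - dz), (z1, dz)):
        for yi, wy in ((y0, 1 - dy), (y1, dy)):
            for xi, wx in ((x0, 1 - dx), (x1, dx)):
                out += wz * wy * wx * grid[:, zi, yi, xi]
    return out

def ifnet_query(grids, p, weights):
    """Occupancy at continuous point p from multi-scale grids (toy decoder)."""
    # Concatenate features sampled from every resolution level.
    feat = np.concatenate([trilinear_sample(g, p) for g in grids])
    # A tiny point-wise MLP stands in for the learned classifier.
    h = np.tanh(weights["W1"] @ feat + weights["b1"])
    logit = weights["W2"] @ h + weights["b2"]
    return 1.0 / (1.0 + np.exp(-logit))  # occupancy probability
```

Because the sampled features are aligned with the Euclidean space embedding the shape, the classifier's decision depends on local and global shape structure around the query point, not on the coordinates themselves, which are arbitrary under Euclidean transformations.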
Related papers
- Multi-View Representation is What You Need for Point-Cloud Pre-Training [22.55455166875263]
This paper proposes a novel approach to point-cloud pre-training that learns 3D representations by leveraging pre-trained 2D networks.
We train the 3D feature extraction network with the help of the novel 2D knowledge transfer loss.
Experimental results demonstrate that our pre-trained model can be successfully transferred to various downstream tasks.
arXiv Detail & Related papers (2023-06-05T03:14:54Z)
- Sampling is Matter: Point-guided 3D Human Mesh Reconstruction [0.0]
This paper presents a simple yet powerful method for 3D human mesh reconstruction from a single RGB image.
Experimental results on benchmark datasets show that the proposed method efficiently improves the performance of 3D human mesh reconstruction.
arXiv Detail & Related papers (2023-04-19T08:45:26Z)
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- SNAKE: Shape-aware Neural 3D Keypoint Field [62.91169625183118]
Detecting 3D keypoints from point clouds is important for shape reconstruction.
This work investigates the dual question: can shape reconstruction benefit 3D keypoint detection?
We propose a novel unsupervised paradigm named SNAKE, which is short for shape-aware neural 3D keypoint field.
arXiv Detail & Related papers (2022-06-03T17:58:43Z)
- 3D Shape Reconstruction from 2D Images with Disentangled Attribute Flow [61.62796058294777]
Reconstructing 3D shape from a single 2D image is a challenging task.
Most previous methods still struggle to extract semantic attributes for the 3D reconstruction task.
We propose 3DAttriFlow to disentangle and extract semantic attributes through different semantic levels in the input images.
arXiv Detail & Related papers (2022-03-29T02:03:31Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)
- KAPLAN: A 3D Point Descriptor for Shape Completion [80.15764700137383]
KAPLAN is a 3D point descriptor that aggregates local shape information via a series of 2D convolutions.
In each of those planes, point properties like normals or point-to-plane distances are aggregated into a 2D grid and abstracted into a feature representation with an efficient 2D convolutional encoder.
Experiments on public datasets show that KAPLAN achieves state-of-the-art performance for 3D shape completion.
arXiv Detail & Related papers (2020-07-31T21:56:08Z)
- Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images [56.652027072552606]
We propose a novel framework for single-view and multi-view 3D object reconstruction, named Pix2Vox++.
By using a well-designed encoder-decoder, it generates a coarse 3D volume from each input image.
A multi-scale context-aware fusion module is then introduced to adaptively select high-quality reconstructions for different parts from all coarse 3D volumes to obtain a fused 3D volume.
arXiv Detail & Related papers (2020-06-22T13:48:09Z)
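The fusion step described for Pix2Vox++ can be illustrated with a small NumPy sketch: per-voxel scores select, via a softmax over views, which coarse reconstruction contributes most to each part of the fused volume. This is a hypothetical simplification; in the paper the scores come from a learned context-aware scoring network, whereas here they are taken as given inputs.

```python
import numpy as np

def context_aware_fusion(volumes, scores):
    """Fuse N coarse volumes (N, D, H, W) using per-voxel scores (N, D, H, W).

    Each voxel of the fused volume is a softmax-weighted combination of the
    corresponding voxels across all coarse reconstructions, so higher-scoring
    views dominate the parts they reconstruct best.
    """
    # Numerically stable softmax over the view axis.
    e = np.exp(scores - scores.max(axis=0, keepdims=True))
    w = e / e.sum(axis=0, keepdims=True)
    return (w * volumes).sum(axis=0)
```

Soft per-voxel weighting rather than a hard argmax keeps the fusion differentiable, which is what allows a scoring module to be trained end-to-end with the reconstruction loss.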
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.