DmifNet: 3D Shape Reconstruction Based on Dynamic Multi-Branch Information Fusion
- URL: http://arxiv.org/abs/2011.10776v1
- Date: Sat, 21 Nov 2020 11:31:27 GMT
- Title: DmifNet: 3D Shape Reconstruction Based on Dynamic Multi-Branch Information Fusion
- Authors: Lei Li, Suping Wu
- Abstract summary: 3D object reconstruction from a single-view image is a long-standing, challenging problem.
Previous works struggled to accurately reconstruct 3D shapes with complex topology and rich details at the edges and corners.
We propose a Dynamic Multi-branch Information Fusion Network (DmifNet) which can recover a high-fidelity 3D shape of arbitrary topology from a 2D image.
- Score: 14.585272577456472
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D object reconstruction from a single-view image is a
long-standing, challenging problem. Previous works struggled to accurately
reconstruct 3D shapes with complex topology and rich details at the edges and
corners. Moreover, previous works trained their networks on synthetic data,
which led to domain adaptation problems when tested on real data. In this
paper, we propose a Dynamic Multi-branch Information Fusion Network (DmifNet)
that can recover a high-fidelity 3D shape of arbitrary topology from a 2D
image.
Specifically, we design several side branches from the intermediate layers so
that the network produces more diverse representations, improving its
generalization ability. In addition, we utilize DoG (Difference of Gaussians)
to extract edge geometry and corner information from the input images. Then,
we use a separate side-branch network to process the extracted data to better
capture edge geometry and corner features. Finally, we dynamically fuse the
information of all branches to obtain the final predicted probability.
Extensive qualitative and quantitative experiments on a large-scale publicly
available dataset demonstrate the validity and efficiency of our method. Code
and models are publicly available at
https://github.com/leilimaster/DmifNet.
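The abstract names two concrete mechanisms: DoG preprocessing to expose edge
and corner structure, and a dynamic, input-dependent fusion of the branch
predictions. Below is a minimal PyTorch sketch of both pieces, assuming
hypothetical tensor shapes and helper names (gaussian_kernel,
difference_of_gaussians, dynamic_fusion); it illustrates the idea and is not
the implementation from the DmifNet repository.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(sigma, size=7):
    """Centered 2D Gaussian kernel, normalized to sum to 1."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    return torch.outer(g, g)  # (size, size)

def difference_of_gaussians(img, sigma1=1.0, sigma2=1.6, size=7):
    """DoG response: blur with two sigmas and subtract, which highlights
    edges and corners. img: (B, C, H, W)."""
    c = img.shape[1]
    k1 = gaussian_kernel(sigma1, size).repeat(c, 1, 1, 1)  # depthwise kernels
    k2 = gaussian_kernel(sigma2, size).repeat(c, 1, 1, 1)
    blur1 = F.conv2d(img, k1, padding=size // 2, groups=c)
    blur2 = F.conv2d(img, k2, padding=size // 2, groups=c)
    return blur1 - blur2

def dynamic_fusion(branch_logits, gate_logits):
    """Fuse per-branch occupancy predictions with input-dependent weights.
    branch_logits: (B, K, N) logits from K branches for N query points.
    gate_logits:   (B, K) unnormalized scores from a small gating head."""
    weights = torch.softmax(gate_logits, dim=1).unsqueeze(-1)  # (B, K, 1)
    probs = torch.sigmoid(branch_logits)                       # (B, K, N)
    return (weights * probs).sum(dim=1)                        # (B, N)
```

In a setup like this, the DoG response of the input image would feed the
dedicated side branch, while the gating scores could come from a small MLP
over pooled encoder features, so the fusion weights change per input.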
Related papers
- Inverse Neural Rendering for Explainable Multi-Object Tracking [35.072142773300655]
We recast 3D multi-object tracking from RGB cameras as an Inverse Rendering (IR) problem.
We optimize an image loss over generative latent spaces that inherently disentangle shape and appearance properties.
We validate the generalization and scaling capabilities of our method by learning the generative prior exclusively from synthetic data.
arXiv Detail & Related papers (2024-04-18T17:37:53Z)
- PointMCD: Boosting Deep Point Cloud Encoders via Multi-view Cross-modal Distillation for 3D Shape Recognition [55.38462937452363]
We propose a unified multi-view cross-modal distillation architecture, including a pretrained deep image encoder as the teacher and a deep point encoder as the student.
By pair-wise aligning multi-view visual and geometric descriptors, we can obtain more powerful deep point encoders without exhaustive and complicated network modifications.
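The pair-wise alignment can be pictured as a per-view embedding match between
a frozen image teacher and a point-cloud student. A hedged sketch (the shapes
and the cosine loss are assumptions, not details from the paper):

```python
import torch
import torch.nn.functional as F

def view_wise_distillation_loss(teacher_feats, student_feats):
    """Align V per-view descriptors of the student with the teacher's.
    teacher_feats: (B, V, D) embeddings of V rendered views (frozen teacher).
    student_feats: (B, V, D) view-wise embeddings predicted from points."""
    t = F.normalize(teacher_feats, dim=-1)
    s = F.normalize(student_feats, dim=-1)
    # Cosine distance per (sample, view), averaged over the batch.
    return (1.0 - (t * s).sum(dim=-1)).mean()
```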
arXiv Detail & Related papers (2022-07-07T07:23:20Z)
- Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud [50.56461318879761]
We propose the Geometry-Disentangled Attention Network (GDANet) for 3D point cloud processing.
GDANet disentangles point clouds into the contour and flat parts of 3D objects, denoted by sharp and gentle variation components, respectively.
Experiments on 3D object classification and segmentation benchmarks demonstrate that GDANet achieves state-of-the-art results with fewer parameters.
arXiv Detail & Related papers (2020-12-20T13:35:00Z)
- DECOR-GAN: 3D Shape Detailization by Conditional Refinement [50.8801457082181]
We introduce a deep generative network for 3D shape detailization, akin to stylization with the style being geometric details.
We demonstrate that our method can refine a coarse shape into a variety of detailed shapes with different styles.
arXiv Detail & Related papers (2020-12-16T18:52:10Z)
- MeshMVS: Multi-View Stereo Guided Mesh Reconstruction [35.763452474239955]
Deep learning based 3D shape generation methods generally utilize latent features extracted from color images to encode the semantics of objects.
We propose a multi-view mesh generation method which incorporates geometry information explicitly by using the features from intermediate depth representations of multi-view stereo.
Our method outperforms state-of-the-art multi-view shape generation methods, with a 34% decrease in Chamfer distance to ground truth and a 14% increase in F1-score on the ShapeNet dataset.
arXiv Detail & Related papers (2020-10-17T00:51:21Z)
- Improving Deep Stereo Network Generalization with Geometric Priors [93.09496073476275]
Large datasets of diverse real-world scenes with dense ground truth are difficult to obtain.
Many algorithms rely on small real-world datasets of similar scenes or synthetic datasets.
We propose to incorporate prior knowledge of scene geometry into an end-to-end stereo network to help networks generalize better.
arXiv Detail & Related papers (2020-08-25T15:24:02Z)
- Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images [56.652027072552606]
We propose a novel framework for single-view and multi-view 3D object reconstruction, named Pix2Vox++.
By using a well-designed encoder-decoder, it generates a coarse 3D volume from each input image.
A multi-scale context-aware fusion module is then introduced to adaptively select high-quality reconstructions for different parts from all coarse 3D volumes to obtain a fused 3D volume.
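The adaptive selection can be thought of as a per-voxel softmax over the
input views. A rough sketch in that spirit (the per-voxel scorer and all
shapes are assumptions, not the paper's actual module):

```python
import torch

def fuse_coarse_volumes(volumes, scores):
    """volumes: (B, V, D, H, W) coarse occupancy volumes, one per view.
    scores:  (B, V, D, H, W) per-voxel quality scores from a context scorer."""
    weights = torch.softmax(scores, dim=1)  # normalize across views per voxel
    return (weights * volumes).sum(dim=1)   # (B, D, H, W) fused volume
```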
arXiv Detail & Related papers (2020-06-22T13:48:09Z)
- Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection [40.34710686994996]
3D object detection is an emerging task in autonomous driving scenarios.
Previous works process 3D point clouds using either projection-based or voxel-based models.
We propose the Stereo RGB and Deeper LIDAR framework which can utilize semantic and spatial information simultaneously.
arXiv Detail & Related papers (2020-06-09T11:19:24Z)
- Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion [53.885984328273686]
Implicit Feature Networks (IF-Nets) deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data.
IF-Nets clearly outperform prior work on 3D object reconstruction on ShapeNet and produce significantly more accurate 3D human reconstructions.
arXiv Detail & Related papers (2020-03-03T11:14:29Z)
- 6D Object Pose Regression via Supervised Learning on Point Clouds [42.21181542960924]
This paper addresses the task of estimating the 6 degrees of freedom pose of a known 3D object from depth information represented by a point cloud.
We use depth information represented by point clouds as the input to both deep networks and geometry-based pose refinement.
Our simple yet effective approach clearly outperforms state-of-the-art methods on the YCB-video dataset.
arXiv Detail & Related papers (2020-01-24T10:29:54Z)