PlankAssembly: Robust 3D Reconstruction from Three Orthographic Views with Learnt Shape Programs
- URL: http://arxiv.org/abs/2308.05744v1
- Date: Thu, 10 Aug 2023 17:59:34 GMT
- Title: PlankAssembly: Robust 3D Reconstruction from Three Orthographic Views with Learnt Shape Programs
- Authors: Wentao Hu and Jia Zheng and Zixin Zhang and Xiaojun Yuan and Jian Yin and Zihan Zhou
- Abstract summary: We develop a new method to automatically convert 2D line drawings from three orthographic views into 3D CAD models.
We leverage the attention mechanism in a Transformer-based sequence generation model to learn flexible mappings between the input and output.
Our method significantly outperforms existing ones when the inputs are noisy or incomplete.
- Score: 24.09764733540401
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we develop a new method to automatically convert 2D line
drawings from three orthographic views into 3D CAD models. Existing methods for
this problem reconstruct 3D models by back-projecting the 2D observations into
3D space while maintaining explicit correspondence between the input and
output. Such methods are sensitive to errors and noises in the input, thus
often fail in practice where the input drawings created by human designers are
imperfect. To overcome this difficulty, we leverage the attention mechanism in
a Transformer-based sequence generation model to learn flexible mappings
between the input and output. Further, we design shape programs which are
suitable for generating the objects of interest to boost the reconstruction
accuracy and facilitate CAD modeling applications. Experiments on a new
benchmark dataset show that our method significantly outperforms existing ones
when the inputs are noisy or incomplete.
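The abstract does not spell out the shape-program DSL, but the idea of a domain-specific program whose commands generate plank-style geometry can be sketched as follows. The `make_plank` command and the box-based interpreter below are hypothetical illustrations of the concept, not the paper's actual grammar:

```python
# Hypothetical sketch of a plank-style shape program: each command places
# an axis-aligned box ("plank"), and an interpreter turns the program into
# geometry a CAD kernel could consume. The DSL here is an assumption.

def make_plank(origin, size):
    """A plank command: (x, y, z) origin and (w, h, d) extents."""
    return {"origin": origin, "size": size}

def interpret(program):
    """Execute a shape program into a list of axis-aligned bounding
    boxes, each given as (min corner, max corner)."""
    boxes = []
    for cmd in program:
        (x, y, z), (w, h, d) = cmd["origin"], cmd["size"]
        boxes.append(((x, y, z), (x + w, y + h, z + d)))
    return boxes

# A toy two-plank program: a table top sitting on one leg.
program = [
    make_plank((0, 0, 10), (20, 20, 1)),  # top slab
    make_plank((1, 1, 0), (2, 2, 10)),    # leg
]
boxes = interpret(program)
```

A sequence model that emits such commands token by token constrains its output to valid, parameterized objects, which is the motivation the abstract gives for using shape programs.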
Related papers
- GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z)
- Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling [14.341099905684844]
This paper investigates a 2D to 3D image translation method with a straightforward technique, enabling correlated 2D X-ray to 3D CT-like reconstruction.
We observe that existing approaches, which integrate information across multiple 2D views in the latent space, lose valuable signal information during latent encoding. Instead, we simply repeat and concatenate the 2D views into higher-channel 3D volumes and approach the 3D reconstruction challenge as a straightforward 3D to 3D generative modeling problem.
This method enables the reconstructed 3D volume to retain valuable information from the 2D inputs, which are passed between channel states in a Swin U
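The "repeat" step described above can be sketched in a few lines; the exact resolutions and channel layout below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# Lift each 2D view to a 3D volume by repeating it along the depth axis,
# then stack the views as channels. Shapes here are toy values.

def views_to_volume(views, depth):
    """views: list of (H, W) arrays -> (C, D, H, W) volume,
    one channel per input view."""
    vols = [np.repeat(v[None, :, :], depth, axis=0) for v in views]  # (D, H, W)
    return np.stack(vols, axis=0)

views = [np.arange(64.0).reshape(8, 8), np.ones((8, 8))]
vol = views_to_volume(views, depth=8)
```

Because every depth slice of a channel is a copy of the original view, no information is discarded before the 3D-to-3D generative model sees the volume, which is the point the summary makes.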
arXiv Detail & Related papers (2024-06-26T15:18:20Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoints supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412]
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing.
A corresponding generic 3D GAN inversion framework is still missing, limiting applications such as 3D face reconstruction and semantic editing.
We study the challenging problem of 3D GAN inversion where a latent code is predicted given a single face image to faithfully recover its 3D shapes and detailed textures.
arXiv Detail & Related papers (2022-12-14T18:49:50Z)
- 3D-C2FT: Coarse-to-fine Transformer for Multi-view 3D Reconstruction [14.89364490991374]
This paper proposes a new model, namely 3D coarse-to-fine transformer (3D-C2FT), for encoding multi-view features and rectifying defective 3D objects.
The C2F attention mechanism enables the model to learn multi-view information flow and synthesize 3D surface corrections in a coarse-to-fine manner.
Experimental results show that 3D-C2FT achieves notable results and outperforms several competing models on these datasets.
arXiv Detail & Related papers (2022-05-29T06:01:42Z)
- RiCS: A 2D Self-Occlusion Map for Harmonizing Volumetric Objects [68.85305626324694]
Ray-marching in Camera Space (RiCS) is a new method that represents the self-occlusions of foreground 3D objects as a 2D self-occlusion map.
We show that our representation map not only allows us to enhance the image quality but also to model temporally coherent complex shadow effects.
arXiv Detail & Related papers (2022-05-14T05:35:35Z)
- DProST: 6-DoF Object Pose Estimation Using Space Carving and Dynamic Projective Spatial Transformer [20.291172201922084]
Most deep learning-based pose estimation methods require CAD data, either to build 3D intermediate representations or to project 2D appearance.
We propose a new pose estimation system consisting of a space carving module that reconstructs a reference 3D feature to replace the CAD data.
Also, we overcome the self-occlusion problem by a new Bidirectional Z-buffering (BiZ-buffer) method, which extracts both the front view and the self-occluded back view of the object.
arXiv Detail & Related papers (2021-12-16T10:39:09Z)
- Using Adaptive Gradient for Texture Learning in Single-View 3D Reconstruction [0.0]
Learning-based approaches for 3D model reconstruction have attracted attention owing to their modern applications.
We present a novel sampling algorithm that optimizes the gradient of predicted coordinates based on the variance of the sampling image.
We also adopt the Frechet Inception Distance (FID) to form a loss function in learning, which helps bridge the gap between rendered images and input images.
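FID itself has a standard closed form: the Frechet distance between two Gaussians fitted to feature sets. A minimal NumPy-only sketch (shown on toy features rather than real Inception activations) is:

```python
import numpy as np

def _sqrtm_psd(m):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)  # clip tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def fid(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (N, dim) feature sets.

    Uses Tr((C1 C2)^(1/2)) = Tr((C1^(1/2) C2 C1^(1/2))^(1/2)), so only
    symmetric PSD square roots are needed (no general sqrtm).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    half_a = _sqrtm_psd(cov_a)
    tr_mean = np.trace(_sqrtm_psd(half_a @ cov_b @ half_a))
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_mean)
```

Identical feature sets give a distance of zero, and the distance grows as the feature distributions drift apart, which is what makes it usable as a training loss between rendered and input images.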
arXiv Detail & Related papers (2021-04-29T07:52:54Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train 3D pose regressor networks from scratch that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
- Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion [53.885984328273686]
Implicit Feature Networks (IF-Nets) deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data.
IF-Nets clearly outperform prior work in 3D object reconstruction on ShapeNet, and obtain significantly more accurate 3D human reconstructions.
arXiv Detail & Related papers (2020-03-03T11:14:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.