Leveraging Monocular Disparity Estimation for Single-View Reconstruction
- URL: http://arxiv.org/abs/2207.00182v1
- Date: Fri, 1 Jul 2022 03:05:40 GMT
- Title: Leveraging Monocular Disparity Estimation for Single-View Reconstruction
- Authors: Marissa Ramirez de Chanlatte, Matheus Gadelha, Thibault Groueix, Radomir Mech
- Abstract summary: We leverage advances in monocular depth estimation to obtain disparity maps.
We transform 2D normalized disparity maps into 3D point clouds by solving an optimization on the relevant camera parameters.
- Score: 8.583436410810203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a fine-tuning method to improve the appearance of 3D geometries
reconstructed from single images. We leverage advances in monocular depth
estimation to obtain disparity maps and present a novel approach to
transforming 2D normalized disparity maps into 3D point clouds by solving an
optimization on the relevant camera parameters. After creating a 3D point cloud
from disparity, we introduce a method to combine the new point cloud with
existing information to form a more faithful and detailed final geometry. We
demonstrate the efficacy of our approach with multiple experiments on both
synthetic and real images.
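The core step in the abstract, solving for camera parameters that turn a normalized disparity map into a 3D point cloud, can be sketched as follows. This is a minimal illustration rather than the authors' exact formulation: the affine disparity-to-depth model, the pinhole intrinsics, the Chamfer loss, and the coarse reference cloud are all assumptions chosen to make the idea concrete.
```python
import torch

def unproject(disparity, scale, shift, focal):
    """Back-project an H x W normalized disparity map through a pinhole camera.
    Depth is recovered via an assumed affine correction: z = 1 / (scale*d + shift)."""
    H, W = disparity.shape
    z = 1.0 / (scale * disparity + shift).clamp(min=1e-6)  # disparity -> depth
    v, u = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    x = (u - W / 2) / focal * z  # pixel column -> camera-space X
    y = (v - H / 2) / focal * z  # pixel row -> camera-space Y
    return torch.stack([x, y, z], dim=-1).reshape(-1, 3)

def chamfer(a, b):
    """Symmetric Chamfer distance between point clouds a (N,3) and b (M,3)."""
    d = torch.cdist(a, b)  # pairwise Euclidean distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

# Placeholder inputs: a normalized disparity map from any monocular estimator
# and a coarse reference point cloud, e.g. from an initial reconstruction.
disparity = torch.rand(64, 64)
coarse_pts = torch.rand(2048, 3)

# Jointly optimize the disparity scale/shift and the focal length so that the
# unprojected cloud agrees with the reference.
scale = torch.tensor(1.0, requires_grad=True)
shift = torch.tensor(0.5, requires_grad=True)
focal = torch.tensor(128.0, requires_grad=True)
opt = torch.optim.Adam([scale, shift, focal], lr=1e-2)

for step in range(500):
    opt.zero_grad()
    loss = chamfer(unproject(disparity, scale, shift, focal), coarse_pts)
    loss.backward()
    opt.step()
```
Under these assumptions, the optimized parameters would then be used to unproject the full-resolution disparity map before fusing the resulting points with the existing geometry, as the abstract describes.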
Related papers
- PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices.
Our framework capitalizes on the fast speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z)
- LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image [64.94932577552458]
Large Reconstruction Models have made significant strides in the realm of automated 3D content generation from single or multiple input images.
Despite their success, these models often produce 3D meshes with geometric inaccuracies, stemming from the inherent challenges of deducing 3D shapes solely from image data.
We introduce a novel framework, the Large Image and Point Cloud Alignment Model (LAM3D), which utilizes 3D point cloud data to enhance the fidelity of generated 3D meshes.
arXiv Detail & Related papers (2024-05-24T15:09:12Z)
- GeoGS3D: Single-view 3D Reconstruction via Geometric-aware Diffusion Model and Gaussian Splatting [81.03553265684184]
We introduce GeoGS3D, a framework for reconstructing detailed 3D objects from single-view images.
We propose a novel metric, Gaussian Divergence Significance (GDS), to prune unnecessary operations during optimization.
Experiments demonstrate that GeoGS3D generates images with high consistency across views and reconstructs high-quality 3D objects.
arXiv Detail & Related papers (2024-03-15T12:24:36Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization [30.951405623906258]
Single image 3D reconstruction is an important but challenging task that requires extensive knowledge of our natural world.
We propose a novel method that takes a single image of any object as input and generates a full 360-degree 3D textured mesh in a single feed-forward pass.
arXiv Detail & Related papers (2023-06-29T13:28:16Z)
- Learning to Generate 3D Representations of Building Roofs Using Single-View Aerial Imagery [68.3565370706598]
We present a novel pipeline for learning the conditional distribution of a building roof mesh given pixels from an aerial image.
Unlike alternative methods that require multiple images of the same object, our approach estimates 3D roof meshes from only a single image.
arXiv Detail & Related papers (2023-03-20T15:47:05Z)
- Flow-based GAN for 3D Point Cloud Generation from a Single Image [16.04710129379503]
We introduce a hybrid explicit-implicit generative modeling scheme that inherits flow-based explicit generative models for sampling point clouds at arbitrary resolutions.
We evaluate on the large-scale synthetic dataset ShapeNet; the experimental results demonstrate the superior performance of the proposed method.
arXiv Detail & Related papers (2022-10-08T17:58:20Z)
- Learning A Locally Unified 3D Point Cloud for View Synthesis [45.757280092357355]
We propose a new deep learning-based view synthesis paradigm that learns a locally unified 3D point cloud from source views.
Experimental results on three benchmark datasets demonstrate that our method can improve the average PSNR by more than 4 dB.
arXiv Detail & Related papers (2022-09-12T04:07:34Z)
- Learning Stereopsis from Geometric Synthesis for 6D Object Pose Estimation [11.999630902627864]
Current monocular-based 6D object pose estimation methods generally achieve less competitive results than RGBD-based methods.
This paper proposes a 3D geometric volume based pose estimation method with a short baseline two-view setting.
Experiments show that our method outperforms state-of-the-art monocular-based methods and is robust across different objects and scenes.
arXiv Detail & Related papers (2021-09-25T02:55:05Z)
- Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry [12.511526058118143]
We propose a sampling scheme that theoretically encourages generalization and results in fast convergence for SGD-based optimization algorithms.
Based on the reflective symmetry of an object, we propose a feature fusion method that alleviates issues due to self-occlusions.
Our proposed system Ladybird is able to create high quality 3D object reconstructions from a single input image.
arXiv Detail & Related papers (2020-07-27T09:17:00Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.