M3D: Dual-Stream Selective State Spaces and Depth-Driven Framework for High-Fidelity Single-View 3D Reconstruction
- URL: http://arxiv.org/abs/2411.12635v2
- Date: Wed, 20 Nov 2024 12:54:52 GMT
- Title: M3D: Dual-Stream Selective State Spaces and Depth-Driven Framework for High-Fidelity Single-View 3D Reconstruction
- Authors: Luoxi Zhang, Pragyan Shrestha, Yu Zhou, Chun Xie, Itaru Kitahara
- Abstract summary: M3D is a novel single-view 3D reconstruction framework for complex scenes.
It balances the extraction of global and local features, thereby improving scene comprehension and representation precision.
Results indicate that the fusion of multi-scale features with depth information via the dual-branch feature extraction significantly boosts geometric consistency and fidelity.
- Score: 3.2228041579285978
- Abstract: The precise reconstruction of 3D objects from a single RGB image in complex scenes presents a critical challenge in virtual reality, autonomous driving, and robotics. Existing neural implicit 3D representation methods face significant difficulties in balancing the extraction of global and local features, particularly in diverse and complex environments, leading to insufficient reconstruction precision and quality. We propose M3D, a novel single-view 3D reconstruction framework, to tackle these challenges. This framework adopts a dual-stream feature extraction strategy based on Selective State Spaces to effectively balance the extraction of global and local features, thereby improving scene comprehension and representation precision. Additionally, a parallel branch extracts depth information, effectively integrating visual and geometric features to enhance reconstruction quality and preserve intricate details. Experimental results indicate that the fusion of multi-scale features with depth information via the dual-branch feature extraction significantly boosts geometric consistency and fidelity, achieving state-of-the-art reconstruction performance.
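The fusion idea in the abstract (a global stream, a local stream, and a parallel depth branch combined into one representation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fuse_features` is hypothetical, mean pooling stands in for the Selective State Space global stream, the raw per-pixel features stand in for the local stream, and channel concatenation is an assumed fusion rule that the abstract does not specify.

```python
import numpy as np


def fuse_features(rgb_feat: np.ndarray, depth_feat: np.ndarray) -> np.ndarray:
    """Toy fusion of global, local, and depth features (hypothetical helper).

    rgb_feat:   (C, H, W) image feature map
    depth_feat: (D, H, W) depth feature map on the same spatial grid
    returns:    (2*C + D, H, W) fused feature map
    """
    c, h, w = rgb_feat.shape
    # Global stream stand-in: one descriptor per channel, averaged over
    # all spatial positions and broadcast back to every pixel.
    global_desc = rgb_feat.mean(axis=(1, 2), keepdims=True)  # (C, 1, 1)
    global_map = np.broadcast_to(global_desc, (c, h, w))
    # Local stream stand-in: the per-pixel features themselves.
    local_map = rgb_feat
    # Assumed fusion rule: concatenate along the channel axis.
    return np.concatenate([global_map, local_map, depth_feat], axis=0)


# Usage: 2 image channels and 1 depth channel on a 2x2 grid.
rgb = np.arange(8, dtype=float).reshape(2, 2, 2)
depth = np.ones((1, 2, 2))
print(fuse_features(rgb, depth).shape)  # (5, 2, 2)
```

Each output pixel then carries scene-wide context, its own local appearance, and a geometric cue, which is the balance the abstract describes at a high level.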
Related papers
- Direct and Explicit 3D Generation from a Single Image [25.207277983430608]
We introduce a novel framework to directly generate explicit surface geometry and texture using multi-view 2D depth and RGB images.
We incorporate epipolar attention into the latent-to-pixel decoder for pixel-level multi-view consistency.
By back-projecting the generated depth pixels into 3D space, we create a structured 3D representation.
arXiv Detail & Related papers (2024-11-17T03:14:50Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment [23.733856513456]
We present REPARO, a novel approach for compositional 3D asset generation from single images.
REPARO employs a two-step process: first, it extracts individual objects from the scene and reconstructs their 3D meshes using off-the-shelf image-to-3D models.
It then optimizes the layout of these meshes through differentiable rendering techniques, ensuring coherent scene composition.
arXiv Detail & Related papers (2024-05-28T18:45:10Z)
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset with the lowest required image resolution and the most lightweight image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- 3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surface [8.824340350342512]
3DFIRES is a novel system for scene-level 3D reconstruction from posed images.
We show it matches the efficacy of single-view reconstruction methods with only one input.
arXiv Detail & Related papers (2024-03-13T17:59:50Z)
- UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections [92.38975002642455]
We propose UniSDF, a general purpose 3D reconstruction method that can reconstruct large complex scenes with reflections.
Our method is able to robustly reconstruct complex large-scale scenes with fine details and reflective surfaces.
arXiv Detail & Related papers (2023-12-20T18:59:42Z)
- Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture [47.44029968307207]
We propose a novel framework for simultaneous high-fidelity recovery of object shapes and textures from single-view images.
Our approach utilizes the proposed Single-view neural implicit Shape and Radiance field (SSR) representations to leverage both explicit 3D shape supervision and volume rendering.
A distinctive feature of our framework is its ability to generate fine-grained textured meshes while seamlessly integrating rendering capabilities into the single-view 3D reconstruction model.
arXiv Detail & Related papers (2023-11-01T11:46:15Z)
- R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras [106.52409577316389]
R3D3 is a multi-camera system for dense 3D reconstruction and ego-motion estimation.
Our approach exploits spatial-temporal information from multiple cameras together with monocular depth refinement.
We show that this design enables a dense, consistent 3D reconstruction of challenging, dynamic outdoor environments.
arXiv Detail & Related papers (2023-08-28T17:13:49Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)