PE3R: Perception-Efficient 3D Reconstruction
- URL: http://arxiv.org/abs/2503.07507v1
- Date: Mon, 10 Mar 2025 16:29:10 GMT
- Title: PE3R: Perception-Efficient 3D Reconstruction
- Authors: Jie Hu, Shizun Wang, Xinchao Wang
- Abstract summary: Perception-Efficient 3D Reconstruction (PE3R) is a novel framework designed to enhance both accuracy and efficiency. The framework achieves a minimum 9-fold speedup in 3D semantic field reconstruction, along with substantial gains in perception accuracy and reconstruction precision.
- Score: 54.730257992806116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in 2D-to-3D perception have significantly improved the understanding of 3D scenes from 2D images. However, existing methods face critical challenges, including limited generalization across scenes, suboptimal perception accuracy, and slow reconstruction speeds. To address these limitations, we propose Perception-Efficient 3D Reconstruction (PE3R), a novel framework designed to enhance both accuracy and efficiency. PE3R employs a feed-forward architecture to enable rapid 3D semantic field reconstruction. The framework demonstrates robust zero-shot generalization across diverse scenes and objects while significantly improving reconstruction speed. Extensive experiments on 2D-to-3D open-vocabulary segmentation and 3D reconstruction validate the effectiveness and versatility of PE3R. The framework achieves a minimum 9-fold speedup in 3D semantic field reconstruction, along with substantial gains in perception accuracy and reconstruction precision, setting new benchmarks in the field. The code is publicly available at: https://github.com/hujiecpp/PE3R.
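The abstract leaves the open-vocabulary querying step implicit. A common way to query a reconstructed 3D semantic field is to compare per-point semantic embeddings against encoded text prompts. The sketch below illustrates only that generic comparison, not PE3R's actual API: the function name, tensor shapes, and the assumption of CLIP-style aligned point/text embeddings are all hypothetical.

```python
import torch
import torch.nn.functional as F

def query_semantic_field(point_feats: torch.Tensor,
                         text_feats: torch.Tensor) -> torch.Tensor:
    """Assign each reconstructed 3D point its best-matching text prompt.

    point_feats: (N, D) per-point semantic features from the reconstructed
        field (hypothetical output of a PE3R-style feed-forward model).
    text_feats:  (K, D) embeddings of open-vocabulary prompts, e.g. from a
        CLIP-style text encoder aligned with the point features.
    Returns: (N,) index of the highest-similarity prompt for each point.
    """
    # After L2 normalization, cosine similarity is a plain matrix product.
    points = F.normalize(point_feats, dim=-1)
    texts = F.normalize(text_feats, dim=-1)
    similarity = points @ texts.T            # (N, K) similarity matrix
    return similarity.argmax(dim=-1)         # per-point prompt index

# Stand-in tensors in place of real model outputs:
labels = query_semantic_field(torch.randn(10_000, 512), torch.randn(4, 512))
print(labels.shape)  # torch.Size([10000])
```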
Related papers
- Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness [73.72335146374543]
We introduce reconstructive visual instruction tuning with 3D-awareness (Ross3D), which integrates 3D-aware visual supervision into the training procedure.
Ross3D achieves state-of-the-art performance across various 3D scene understanding benchmarks.
arXiv Detail & Related papers (2025-04-02T16:59:55Z)
- Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models [65.90387371072413]
We introduce Difix3D+, a novel pipeline designed to enhance 3D reconstruction and novel-view synthesis.
At the core of our approach is Difix, a single-step image diffusion model trained to enhance rendered novel views and remove their artifacts.
arXiv Detail & Related papers (2025-03-03T17:58:33Z)
- Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass [68.78222900840132]
We propose Fast 3D Reconstruction (Fast3R), a novel multi-view generalization of DUSt3R that achieves efficient and scalable 3D reconstruction by processing many views in parallel.
Fast3R demonstrates state-of-the-art performance, with significant improvements in inference speed and reduced error accumulation.
arXiv Detail & Related papers (2025-01-23T18:59:55Z)
- DuoLift-GAN: Reconstructing CT from Single-view and Biplanar X-Rays with Generative Adversarial Networks [1.3812010983144802]
We introduce DuoLift Generative Adversarial Networks (DuoLift-GAN), a novel architecture with dual branches that independently elevate 2D images and their features into 3D representations.
These 3D outputs are merged into a unified 3D feature map and decoded into a complete 3D chest volume, enabling richer 3D information capture.
arXiv Detail & Related papers (2024-11-12T17:11:18Z)
- G3R: Gradient Guided Generalizable Reconstruction [39.198327570559684]
We introduce G3R, a generalizable reconstruction approach that can efficiently predict high-quality 3D scene representations for large scenes.
Experiments on urban-driving and drone datasets show that G3R generalizes across diverse large scenes and accelerates the reconstruction process by at least 10x.
arXiv Detail & Related papers (2024-09-28T16:54:16Z)
- EPRecon: An Efficient Framework for Real-Time Panoptic 3D Reconstruction from Monocular Video [6.236130301507863]
We present EPRecon, an efficient real-time panoptic 3D reconstruction framework.
We propose a lightweight module to directly estimate scene depth priors in a 3D volume.
In addition, to infer richer panoptic features from occupied voxels, EPRecon extracts panoptic features from both voxel features and corresponding image features.
arXiv Detail & Related papers (2024-09-03T11:40:31Z)
- Event3DGS: Event-Based 3D Gaussian Splatting for High-Speed Robot Egomotion [54.197343533492486]
Event3DGS can reconstruct high-fidelity 3D structure and appearance under high-speed egomotion.
Experiments on multiple synthetic and real-world datasets demonstrate the superiority of Event3DGS compared with existing event-based dense 3D scene reconstruction frameworks.
Our framework also allows one to incorporate a few motion-blurred frame-based measurements into the reconstruction process to further improve appearance fidelity without loss of structural accuracy.
arXiv Detail & Related papers (2024-06-05T06:06:03Z)
- SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion [86.77318031029404]
We propose a framework that performs scene reconstruction and semantic scene completion jointly in an incremental and real-time manner.
Our framework relies on a novel neural architecture designed to process occupancy maps and leverages voxel states to accurately and efficiently fuse semantic completion with the 3D global model; a generic sketch of this style of incremental fusion follows the list.
arXiv Detail & Related papers (2020-10-26T15:31:52Z)
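As flagged in the SCFusion entry above, here is a minimal, self-contained sketch of incrementally fusing per-frame semantic predictions into a global voxel grid. It is a generic illustration under assumed interfaces (the SemanticVoxelGrid class, its integrate signature, and weighted score accumulation are inventions for exposition), not SCFusion's actual architecture.

```python
import numpy as np

class SemanticVoxelGrid:
    """Running fusion of per-frame semantic predictions into a global grid.

    Each frame contributes observation weight and class scores for the
    voxels it sees; accumulating both keeps the fusion order-independent.
    """

    def __init__(self, shape: tuple, num_classes: int):
        self.weight = np.zeros(shape, dtype=np.float32)
        self.class_scores = np.zeros(shape + (num_classes,), dtype=np.float32)

    def integrate(self, voxel_idx: np.ndarray, scores: np.ndarray,
                  frame_weight: float = 1.0) -> None:
        """Fuse one frame's observations.

        voxel_idx: (M, 3) integer indices of voxels observed this frame.
        scores:    (M, num_classes) per-voxel class scores from a network.
        """
        i, j, k = voxel_idx.T
        # np.add.at accumulates correctly even with duplicate indices.
        np.add.at(self.weight, (i, j, k), frame_weight)
        np.add.at(self.class_scores, (i, j, k), frame_weight * scores)

    def labels(self) -> np.ndarray:
        """Current per-voxel labels (unobserved voxels default to class 0)."""
        return self.class_scores.argmax(axis=-1)

# One fused frame with stand-in data:
grid = SemanticVoxelGrid(shape=(64, 64, 64), num_classes=12)
idx = np.random.randint(0, 64, size=(500, 3))
grid.integrate(idx, np.random.rand(500, 12).astype(np.float32))
print(grid.labels().shape)  # (64, 64, 64)
```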