Learning-based 3D Reconstruction in Autonomous Driving: A Comprehensive Survey
- URL: http://arxiv.org/abs/2503.14537v2
- Date: Sat, 22 Mar 2025 02:26:04 GMT
- Title: Learning-based 3D Reconstruction in Autonomous Driving: A Comprehensive Survey
- Authors: Liewen Liao, Weihao Yan, Ming Yang, Songan Zhang
- Abstract summary: Learning-based 3D reconstruction has emerged as a transformative technique in autonomous driving. We conduct a multi-perspective, in-depth analysis of recent advancements.
- Score: 6.653873138644471
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning-based 3D reconstruction has emerged as a transformative technique in autonomous driving, enabling precise modeling of both dynamic and static environments through advanced neural representations. Beyond data augmentation, 3D reconstruction inspires pioneering solutions for vital tasks in autonomous driving, such as scene understanding and closed-loop simulation. We investigate the details of 3D reconstruction and conduct a multi-perspective, in-depth analysis of recent advancements. Specifically, we first provide a systematic introduction of preliminaries, including data modalities, benchmarks, and technical foundations of learning-based 3D reconstruction, facilitating quick identification of suitable methods according to sensor suites. We then systematically review learning-based 3D reconstruction methods in autonomous driving, categorizing approaches by subtask and conducting a multi-dimensional analysis and summary to establish a comprehensive technical reference. Finally, we summarize development trends and existing challenges in the context of learning-based 3D reconstruction in autonomous driving. We hope that our review will inspire future research.
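As a brief illustration of the "advanced neural representations" mentioned above (a standard NeRF-style formulation given here for context, not an equation taken from this survey), radiance-field methods reconstruct a scene by optimizing per-sample densities and colors so that volume-rendered rays reproduce the captured images:

```latex
% Discrete volume rendering along a ray r sampled at points i = 1..N,
% with \delta_i the spacing between adjacent samples:
\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i,
\qquad
T_i = \exp\Big(-\sum_{j<i} \sigma_j \delta_j\Big).
```

The representation is then fit by minimizing the photometric error between \hat{C}(\mathbf{r}) and the observed pixel colors over all training rays, which is the optimization loop underlying many of the neural-representation methods this kind of survey covers.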
Related papers
- A Generative Approach to High Fidelity 3D Reconstruction from Text Data [0.0]
This research proposes a fully automated pipeline that seamlessly integrates text-to-image generation, various image processing techniques, and deep learning methods for reflection removal and 3D reconstruction. By leveraging state-of-the-art generative models like Stable Diffusion, the methodology translates natural language inputs into detailed 3D models through a multi-stage workflow. This approach addresses key challenges in generative reconstruction, such as maintaining semantic coherence, managing geometric complexity, and preserving detailed visual information.
arXiv Detail & Related papers (2025-03-05T16:54:15Z) - A Survey of World Models for Autonomous Driving [63.33363128964687]
Recent breakthroughs in autonomous driving have been propelled by advances in robust world modeling. This paper systematically reviews recent advances in world models for autonomous driving.
arXiv Detail & Related papers (2025-01-20T04:00:02Z) - A Comprehensive Review of 3D Object Detection in Autonomous Driving: Technological Advances and Future Directions [11.071271817366739]
3D object perception has become a crucial component in the development of autonomous driving systems.
This review extensively summarizes traditional 3D object detection methods, focusing on camera-based, LiDAR-based, and fusion detection techniques.
We discuss future directions, including methods to improve accuracy such as temporal perception, occupancy grids, and end-to-end learning frameworks.
arXiv Detail & Related papers (2024-08-28T01:08:33Z) - Survey on Fundamental Deep Learning 3D Reconstruction Techniques [0.0]
This survey aims to investigate fundamental deep learning (DL) based 3D reconstruction techniques that produce photo-realistic 3D models and scenes.
We dissect the underlying algorithms, evaluate their strengths and tradeoffs, and project future research trajectories in this rapidly evolving field.
arXiv Detail & Related papers (2024-07-11T02:30:05Z) - Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly demonstrates our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z) - HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for Autonomous Driving [95.42203932627102]
3D human pose estimation is an emerging technology that can enable autonomous vehicles to perceive and understand the subtle and complex behaviors of pedestrians.
Our method makes efficient use of the complementary camera and LiDAR signals in a semi-supervised fashion and outperforms existing methods by a large margin.
Specifically, we embed LiDAR points into pixel-aligned multi-modal features, which we pass through a sequence of Transformer refinement stages.
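A minimal sketch of this kind of pipeline is given below: LiDAR points are projected into the image to sample pixel-aligned features, concatenated with point coordinates, and refined by a Transformer encoder before regressing 3D joints. This is an illustrative reconstruction of the idea in the summary, not the authors' code; the feature dimensions, joint count, and helper names are assumptions.

```python
# Hedged sketch: pixel-aligned LiDAR-image fusion + Transformer refinement.
import torch
import torch.nn.functional as F
from torch import nn

def pixel_aligned_features(points_cam, image_feats, K):
    """Project 3D points (camera frame) onto the image and sample features.

    points_cam: (N, 3) LiDAR points in the camera frame.
    image_feats: (1, C, H, W) feature map from an image backbone.
    K: (3, 3) camera intrinsics.
    """
    uv = (K @ points_cam.T).T                     # (N, 3) homogeneous pixels
    uv = uv[:, :2] / uv[:, 2:3]                   # perspective division
    _, _, H, W = image_feats.shape
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,     # normalize to [-1, 1]
                        uv[:, 1] / (H - 1) * 2 - 1], dim=-1).view(1, 1, -1, 2)
    sampled = F.grid_sample(image_feats, grid, align_corners=True)  # (1, C, 1, N)
    return sampled.view(image_feats.shape[1], -1).T                 # (N, C)

class PointTransformerRefiner(nn.Module):
    """Embed per-point (geometry + image) features, refine with self-attention."""
    def __init__(self, img_dim=64, d_model=128, n_layers=4, n_joints=15):
        super().__init__()
        self.embed = nn.Linear(3 + img_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_joints * 3)    # regress 3D joints

    def forward(self, points_cam, image_feats, K):
        feats = pixel_aligned_features(points_cam, image_feats, K)
        tokens = self.embed(torch.cat([points_cam, feats], dim=-1)).unsqueeze(0)
        pooled = self.encoder(tokens).mean(dim=1)       # pool over points
        return self.head(pooled).view(-1, 3)            # (n_joints, 3)
```

The semi-supervised part of the original method (training on unlabeled sequences) is omitted here; the sketch only shows how the two modalities can be aligned and fused.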
arXiv Detail & Related papers (2022-12-15T11:15:14Z) - 3D Object Detection for Autonomous Driving: A Comprehensive Survey [48.30753402458884]
3D object detection, which intelligently predicts the locations, sizes, and categories of the critical 3D objects near an autonomous vehicle, is an important part of a perception system.
This paper reviews the advances in 3D object detection for autonomous driving.
arXiv Detail & Related papers (2022-06-19T19:43:11Z) - Self Context and Shape Prior for Sensorless Freehand 3D Ultrasound Reconstruction [61.62191904755521]
3D freehand US reconstruction is promising because it provides a broad scan range and freeform scanning.
Existing deep learning-based methods focus only on the basic cases of skill sequences.
We propose a novel approach to sensorless freehand 3D US reconstruction that accounts for complex skill sequences.
arXiv Detail & Related papers (2021-07-31T16:06:50Z) - Active 3D Shape Reconstruction from Vision and Touch [66.08432412497443]
Humans build 3D understandings of the world through active object exploration, jointly using their senses of vision and touch.
In 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings.
We introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile priors to guide the shape exploration; and 3) a set of data-driven solutions with either tactile or visuotactile priors.
arXiv Detail & Related papers (2021-07-20T15:56:52Z) - Deep Learning for Multi-View Stereo via Plane Sweep: A Survey [0.0]
3D reconstruction has lately attracted increasing attention due to its wide application in many areas, such as autonomous driving, robotics and virtual reality.
As a dominant technique in artificial intelligence, deep learning has been successfully adopted to solve various computer vision problems.
This paper presents a review of recent progress in deep learning methods for Multi-view Stereo (MVS), which is considered a crucial task in image-based 3D reconstruction.
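To make the plane-sweep idea concrete, the sketch below warps source-view features onto fronto-parallel depth planes in the reference view and stacks a simple matching cost per depth, in the spirit of MVSNet-style pipelines. It is an assumed, single-source illustration with a squared-difference cost, not code from the survey.

```python
# Hedged sketch: plane-sweep cost volume via per-depth homography warping.
import torch
import torch.nn.functional as F

def plane_homography(K_ref, K_src, R, t, depth):
    """Homography induced by the fronto-parallel plane z = depth
    in the reference camera; (R, t) maps reference to source coordinates."""
    n = torch.tensor([[0.0, 0.0, 1.0]])                       # plane normal
    return K_src @ (R - t.view(3, 1) @ n / depth) @ torch.linalg.inv(K_ref)

def plane_sweep_volume(ref_feat, src_feat, K_ref, K_src, R, t, depths):
    """ref_feat, src_feat: (C, H, W) features; returns a (D, H, W) cost volume."""
    C, H, W = ref_feat.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1).view(-1, 3)  # (HW, 3)
    costs = []
    for d in depths:
        Hmat = plane_homography(K_ref, K_src, R, t, d)
        warped = (Hmat @ pix.T).T                             # (HW, 3)
        uv = warped[:, :2] / warped[:, 2:3]
        grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,       # [-1, 1] coords
                            uv[:, 1] / (H - 1) * 2 - 1], dim=-1).view(1, H, W, 2)
        src_warped = F.grid_sample(src_feat.unsqueeze(0), grid,
                                   align_corners=True)[0]     # (C, H, W)
        costs.append(((ref_feat - src_warped) ** 2).mean(dim=0))
    return torch.stack(costs, dim=0)                          # (D, H, W)
```

Real MVS networks typically aggregate a variance-based cost over many source views and regularize the volume with a 3D CNN before depth regression; those stages are omitted for brevity.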
arXiv Detail & Related papers (2021-06-18T14:10:44Z) - On the generalization of learning-based 3D reconstruction [10.516860541554632]
We study the inductive biases encoded in the model architecture that impact the generalization of learning-based 3D reconstruction methods.
We find that three inductive biases impact performance: the spatial extent of the encoder, the use of the underlying geometry of the scene to describe point features, and the mechanism used to aggregate information from multiple views.
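As a hedged illustration of the third bias, the multi-view aggregation mechanism, the snippet below contrasts two common choices: unweighted averaging of per-view point features versus a learned attention weighting. It is an assumed example, not the paper's implementation.

```python
# Hedged sketch: two ways to aggregate per-view features of 3D query points.
import torch
from torch import nn

class MeanAggregator(nn.Module):
    """Permutation-invariant average over views; treats all views equally."""
    def forward(self, view_feats):                 # (V, N, C): views, points, dims
        return view_feats.mean(dim=0)              # (N, C)

class AttentionAggregator(nn.Module):
    """Learned per-point weighting of views via a softmax over view scores."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, view_feats):                 # (V, N, C)
        weights = torch.softmax(self.score(view_feats), dim=0)  # (V, N, 1)
        return (weights * view_feats).sum(dim=0)   # (N, C)
```

Swapping one aggregator for the other changes only the fusion step, which makes this kind of ablation a natural way to probe the generalization effects the paper reports.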
arXiv Detail & Related papers (2020-06-27T18:53:41Z)