Large-Scale 3D Semantic Reconstruction for Automated Driving Vehicles with Adaptive Truncated Signed Distance Function
- URL: http://arxiv.org/abs/2202.13855v1
- Date: Mon, 28 Feb 2022 15:11:25 GMT
- Title: Large-Scale 3D Semantic Reconstruction for Automated Driving Vehicles with Adaptive Truncated Signed Distance Function
- Authors: Haohao Hu, Hexing Yang, Jian Wu, Xiao Lei, Frank Bieder, Jan-Hendrik Pauls and Christoph Stiller
- Abstract summary: We propose a novel 3D reconstruction and semantic mapping system using LiDAR and camera sensors.
An Adaptive Truncated Signed Distance Function is introduced to describe surfaces implicitly, which can deal with different LiDAR point sparsities.
An optimal image patch selection strategy is proposed to estimate the optimal semantic class for each triangle mesh.
- Score: 9.414880946870916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale 3D reconstruction, texturing and semantic mapping are
nowadays widely used for automated driving vehicles, virtual reality and
automatic data generation. However, most approaches are developed for RGB-D
cameras with colored dense point clouds and are not suitable for large-scale
outdoor environments with sparse LiDAR point clouds. Since a 3D surface can
usually be observed from multiple camera images with different view poses,
selecting the optimal image patch for texturing and estimating the optimal
semantic class for semantic mapping remain challenging.
To address these problems, we propose a novel 3D reconstruction, texturing
and semantic mapping system using LiDAR and camera sensors. An Adaptive
Truncated Signed Distance Function is introduced to describe surfaces
implicitly; it can deal with different LiDAR point sparsities and improves
model quality. The triangle mesh map extracted from this implicit function is
then textured from a series of registered camera images by applying an optimal
image patch selection strategy. In addition, a Markov Random Field-based data
fusion approach is proposed to estimate the optimal semantic class for each
mesh triangle. Our approach is evaluated on a synthetic dataset, the KITTI
dataset and a dataset recorded with our experimental vehicle. The results show
that the 3D models generated with our approach are more accurate than those of
other state-of-the-art approaches. The texturing and semantic mapping also
achieve very promising results.
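The abstract describes the Adaptive Truncated Signed Distance Function only at a high level, so the following is a minimal Python sketch of the idea, assuming a range-dependent truncation band that widens where LiDAR returns are sparser. The function `adaptive_truncation` and its parameters `base_trunc` and `scale` are hypothetical stand-ins for the paper's adaptation rule, not its exact formulation.

```python
import numpy as np

def adaptive_truncation(point_range, base_trunc=0.3, scale=0.02):
    # Hypothetical adaptation rule: widen the truncation band with
    # sensor range, since distant LiDAR returns are sparser. The
    # paper's exact formulation may differ.
    return base_trunc + scale * point_range

def integrate_lidar_point(voxel_centers, tsdf, weights, sensor_origin, point):
    """Fuse one LiDAR return into a TSDF volume via a weighted running
    average (Curless-Levoy style) with an adaptive truncation band.

    voxel_centers: (N, 3) centers of the voxels traversed by this ray
    tsdf, weights: (N,) per-voxel TSDF values and fusion weights
    """
    ray = point - sensor_origin
    point_range = np.linalg.norm(ray)
    ray_dir = ray / point_range
    trunc = adaptive_truncation(point_range)

    # Projective signed distance along the ray: positive between the
    # sensor and the surface, negative behind the surface.
    t = (voxel_centers - sensor_origin) @ ray_dir
    sdf = point_range - t

    # Only update voxels inside the adaptive truncation band.
    mask = np.abs(sdf) <= trunc
    d = np.clip(sdf[mask] / trunc, -1.0, 1.0)

    w_new = 1.0
    tsdf[mask] = (weights[mask] * tsdf[mask] + w_new * d) / (weights[mask] + w_new)
    weights[mask] += w_new
```

The optimal image patch selection is likewise only named, not specified. A plausible heuristic, shown below purely as an assumption, scores each registered camera view of a triangle by viewing angle and distance and keeps the best-scoring patch:

```python
import numpy as np

def patch_score(tri_normal, tri_center, cam_center):
    # Hypothetical scoring heuristic: prefer head-on, close-up views.
    # The paper's actual selection criteria are not given in the abstract.
    view = cam_center - tri_center
    dist = np.linalg.norm(view)
    cos_angle = max(0.0, float(tri_normal @ (view / dist)))
    return cos_angle / dist**2
```

Finally, the Markov Random Field data fusion step can be illustrated with a simple Potts-model energy minimized by iterated conditional modes; the ICM solver and the smoothness weight `lam` are assumptions, not the paper's method. The per-triangle unary costs would come from the accumulated semantic scores of the image patches observing each triangle.

```python
import numpy as np

def fuse_labels_icm(unary, adjacency, lam=0.5, iters=10):
    """Minimal MRF label fusion over mesh triangles.

    unary:     (T, C) per-triangle, per-class costs, e.g. negative
               log-probabilities accumulated over all observations
    adjacency: list of neighbor triangle indices per triangle
    lam:       weight of the Potts smoothness prior
    """
    labels = np.argmin(unary, axis=1)
    num_classes = unary.shape[1]
    for _ in range(iters):
        changed = False
        for t, nbrs in enumerate(adjacency):
            cost = unary[t].astype(float).copy()
            for n in nbrs:
                # Potts pairwise term: penalize disagreeing neighbors.
                cost += lam * (np.arange(num_classes) != labels[n])
            best = int(np.argmin(cost))
            if best != labels[t]:
                labels[t] = best
                changed = True
        if not changed:  # converged to a local minimum
            break
    return labels
```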
Related papers
- Large Spatial Model: End-to-end Unposed Images to Semantic 3D [79.94479633598102]
Large Spatial Model (LSM) processes unposed RGB images directly into semantic radiance fields.
LSM simultaneously estimates geometry, appearance, and semantics in a single feed-forward operation.
It can generate versatile label maps by interacting with language at novel viewpoints.
arXiv Detail & Related papers (2024-10-24T17:54:42Z)
- Neural Rendering based Urban Scene Reconstruction for Autonomous Driving [8.007494499012624]
We propose a multimodal 3D scene reconstruction using a framework combining neural implicit surfaces and radiance fields.
Dense 3D reconstruction has many applications in automated driving, including automated annotation validation.
We demonstrate qualitative and quantitative results on challenging automotive scenes.
arXiv Detail & Related papers (2024-02-09T23:20:23Z)
- SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences.
It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping.
Our system achieves state-of-the-art semantic mapping quality among 2D-3D network-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z)
- AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection [46.03951171790736]
We propose AutoAlign, an automatic feature fusion strategy for 3D object detection.
We show that our approach can lead to 2.3 mAP and 7.0 mAP improvements on the KITTI and nuScenes datasets, respectively.
arXiv Detail & Related papers (2022-01-17T16:08:57Z)
- Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z)
- Using Adaptive Gradient for Texture Learning in Single-View 3D Reconstruction [0.0]
Learning-based approaches for 3D model reconstruction have attracted attention owing to their modern applications.
We present a novel sampling algorithm by optimizing the gradient of predicted coordinates based on the variance on the sampling image.
We also adopt the Fréchet Inception Distance (FID) to form a loss function for learning, which helps bridge the gap between rendered images and input images.
arXiv Detail & Related papers (2021-04-29T07:52:54Z)
- Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild [96.09941542587865]
We present a novel 3D pose refinement approach based on differentiable rendering for objects of arbitrary categories in the wild.
In this way, we precisely align 3D models to objects in RGB images, which results in significantly improved 3D pose estimates.
We evaluate our approach on the challenging Pix3D dataset and achieve up to 55% relative improvement compared to state-of-the-art refinement methods in multiple metrics.
arXiv Detail & Related papers (2020-07-17T12:34:38Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
- Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection [40.34710686994996]
3D object detection is an emerging task in autonomous driving scenarios.
Previous works process 3D point clouds using either projection-based or voxel-based models.
We propose the Stereo RGB and Deeper LIDAR framework which can utilize semantic and spatial information simultaneously.
arXiv Detail & Related papers (2020-06-09T11:19:24Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)