Semantic Scene Completion Based 3D Traversability Estimation for Off-Road Terrains
- URL: http://arxiv.org/abs/2412.08195v1
- Date: Wed, 11 Dec 2024 08:36:36 GMT
- Title: Semantic Scene Completion Based 3D Traversability Estimation for Off-Road Terrains
- Authors: Zitong Chen, Chao Sun, Shida Nie, Chen Min, Changjiu Ning, Haoyu Li, Bo Wang
- Abstract summary: Off-road environments present significant challenges for autonomous ground vehicles. Traditional perception algorithms, designed primarily for structured environments, often fail under these conditions. In this paper, ORDformer is proposed to generate dense traversable occupancy predictions from a forward-facing perspective.
- Score: 10.521569910467072
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Off-road environments present significant challenges for autonomous ground vehicles due to the absence of structured roads and the presence of complex obstacles, such as uneven terrain, vegetation, and occlusions. Traditional perception algorithms, designed primarily for structured environments, often fail under these conditions, leading to inaccurate traversability estimations. In this paper, ORDformer, a novel multimodal method that combines LiDAR point clouds with monocular images, is proposed to generate dense traversable occupancy predictions from a forward-facing perspective. By integrating multimodal data, environmental feature extraction is enhanced, which is crucial for accurate occupancy estimation in complex terrains. Furthermore, RELLIS-OCC, a dataset with 3D traversable occupancy annotations, is introduced, incorporating geometric features such as step height, slope, and unevenness. Through a comprehensive analysis of vehicle obstacle-crossing conditions and the incorporation of vehicle body structure constraints, four traversability cost labels are generated: lethal, medium-cost, low-cost, and free. Experimental results demonstrate that ORDformer outperforms existing approaches in 3D traversable area recognition, particularly in off-road environments with irregular geometries and partial occlusions. Specifically, ORDformer achieves over a 20% improvement in scene completion IoU compared to other models. The proposed framework is scalable and adaptable to various vehicle platforms, allowing for adjustments to occupancy grid parameters and the integration of advanced dynamic models for traversability cost estimation.
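To make the labeling scheme concrete: the abstract derives four traversability cost labels from per-cell geometric features (step height, slope, unevenness) and vehicle body constraints. Below is a minimal sketch of such a rule, assuming hypothetical vehicle limits; the 0.30 m step and 25° slope thresholds, and the halving heuristic for the medium-cost band, are illustrative stand-ins, not values from the paper.

```python
def label_cell(step_height: float, slope_deg: float, unevenness: float,
               max_step: float = 0.30,     # assumed max climbable step [m]
               max_slope: float = 25.0,    # assumed max climbable slope [deg]
               rough_thresh: float = 0.05  # assumed unevenness threshold [m]
               ) -> str:
    """Map one occupancy-grid cell's geometric features to a cost label."""
    # Beyond the vehicle's structural limits the cell is impassable.
    if step_height > max_step or slope_deg > max_slope:
        return "lethal"
    # Near the limits: crossable per the obstacle-crossing analysis, but costly.
    if step_height > 0.5 * max_step or slope_deg > 0.5 * max_slope:
        return "medium-cost"
    # Gentle but rough terrain carries a small penalty.
    if unevenness > rough_thresh:
        return "low-cost"
    return "free"
```

For example, under these assumed limits a 0.35 m step exceeds `max_step` and is labeled lethal regardless of slope, while flat but bumpy ground falls into the low-cost band.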
Related papers
- PFSD: A Multi-Modal Pedestrian-Focus Scene Dataset for Rich Tasks in Semi-Structured Environments [73.80718037070773]
We present the multi-modal Pedestrian-Focused Scene dataset, rigorously annotated in semi-structured scenes following the nuScenes format.
We also propose a novel Hybrid Multi-Scale Fusion Network (HMFN) to detect pedestrians in densely populated and occluded scenarios.
arXiv Detail & Related papers (2025-02-21T09:57:53Z)
- AdaOcc: Adaptive-Resolution Occupancy Prediction [20.0994984349065]
We introduce AdaOcc, a novel adaptive-resolution, multi-modal prediction approach.
Our method integrates object-centric 3D reconstruction and holistic occupancy prediction within a single framework.
In close-range scenarios, we surpass previous baselines by over 13% in IoU and over 40% in Hausdorff distance.
arXiv Detail & Related papers (2024-08-24T03:46:25Z)
- OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose an OccNeRF method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
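The parameterization for the cameras' infinite perceptive range mentioned above is reminiscent of the scene contraction commonly used for unbounded neural fields. A minimal sketch under that assumption follows; whether OccNeRF uses this exact form is not confirmed by the summary.

```python
import torch

def contract(x: torch.Tensor) -> torch.Tensor:
    """Map all of R^3 into a ball of radius 2: identity inside the unit
    ball, smoothly compressing everything outside it so a bounded grid
    can represent arbitrarily distant geometry."""
    norm = x.norm(dim=-1, keepdim=True).clamp(min=1e-6)
    contracted = (2.0 - 1.0 / norm) * (x / norm)
    return torch.where(norm <= 1.0, x, contracted)
```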
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
- MV-DeepSDF: Implicit Modeling with Multi-Sweep Point Clouds for 3D Vehicle Reconstruction in Autonomous Driving [25.088617195439344]
We propose a novel framework, dubbed MV-DeepSDF, which estimates the optimal Signed Distance Function (SDF) shape representation from multi-sweep point clouds.
We conduct thorough experiments on two real-world autonomous driving datasets.
arXiv Detail & Related papers (2023-08-21T15:48:15Z)
- METAVerse: Meta-Learning Traversability Cost Map for Off-Road Navigation [5.036362492608702]
This paper presents METAVerse, a meta-learning framework for learning a global model that accurately predicts terrain traversability.
We train the traversability prediction network to generate a dense and continuous terrain cost map from a sparse LiDAR point cloud.
Online adaptation is performed to rapidly adapt the network to the local environment by exploiting recent interaction experiences.
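A minimal sketch of what such an online adaptation loop could look like, assuming a plain regression loss over a buffer of recent (LiDAR input, cost target) pairs; the loss, optimizer, and step count are illustrative, not METAVerse's actual recipe.

```python
import torch

def adapt_online(model: torch.nn.Module,
                 recent_batch: tuple[torch.Tensor, torch.Tensor],
                 steps: int = 3, lr: float = 1e-4) -> None:
    """Fine-tune the traversability network in place on recent experience."""
    lidar_input, cost_target = recent_batch
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pred = model(lidar_input)  # predicted terrain cost map
        loss = torch.nn.functional.mse_loss(pred, cost_target)
        loss.backward()
        opt.step()
```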
arXiv Detail & Related papers (2023-07-26T06:58:19Z)
- Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
- Uncertainty-aware Perception Models for Off-road Autonomous Unmanned Ground Vehicles [6.2574402913714575]
Off-road autonomous unmanned ground vehicles (UGVs) are being developed for military and commercial use to deliver crucial supplies in remote locations.
Current datasets used to train perception models for off-road autonomous navigation lack diversity in seasons, locations, semantic classes, and times of day.
We investigate how to combine multiple datasets to train a semantic segmentation-based environment perception model.
We show that training the model to capture uncertainty can improve its performance by a significant margin.
arXiv Detail & Related papers (2022-09-22T15:59:33Z)
- Decentralized Vehicle Coordination: The Berkeley DeepDrive Drone Dataset and Consensus-Based Models [76.32775745488073]
We present a novel dataset and modeling framework designed to study motion planning in understructured environments.
We demonstrate that a consensus-based modeling approach can effectively explain the emergence of priority orders observed in our dataset.
arXiv Detail & Related papers (2022-09-19T05:06:57Z)
- Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
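As a rough illustration of attention-based image-LiDAR integration, the sketch below has image tokens cross-attend to LiDAR tokens. TransFuser itself interleaves such transformer modules throughout paired convolutional branches, so this single layer is a simplified stand-in, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """One cross-attention step fusing LiDAR features into image features."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens: torch.Tensor,
                lidar_tokens: torch.Tensor) -> torch.Tensor:
        # Image tokens query the LiDAR tokens; the residual keeps image content.
        fused, _ = self.attn(img_tokens, lidar_tokens, lidar_tokens)
        return self.norm(img_tokens + fused)
```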
arXiv Detail & Related papers (2021-04-19T11:48:13Z)
- Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as a more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z)
- Reconfigurable Voxels: A New Representation for LiDAR-Based Point Clouds [76.52448276587707]
We propose Reconfigurable Voxels, a new approach to constructing representations from 3D point clouds.
Specifically, we devise a biased random walk scheme, which adaptively covers each neighborhood with a fixed number of voxels.
We find that this approach effectively improves the stability of voxel features, especially for sparse regions (a sketch of such a walk follows below).
arXiv Detail & Related papers (2020-04-06T15:07:16Z)
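To illustrate the biased random walk idea from the Reconfigurable Voxels summary above: the sketch gathers a fixed number of voxels around a seed, stepping preferentially onto occupied neighbors so that sparse regions still yield a full-size neighborhood. The 6-connectivity, bias rule, and step cap are assumptions for illustration, not the paper's exact scheme.

```python
import random

Voxel = tuple[int, int, int]

def biased_walk(seed: Voxel, occupied: set[Voxel],
                num_voxels: int = 27, max_steps: int = 1000) -> list[Voxel]:
    """Collect up to num_voxels cells via a random walk biased toward
    occupied neighbors; max_steps bounds the walk in degenerate cases."""
    collected, current = [seed], seed
    for _ in range(max_steps):
        if len(collected) >= num_voxels:
            break
        # 6-connected neighbors of the current voxel.
        nbrs = [(current[0] + dx, current[1] + dy, current[2] + dz)
                for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                                   (0, -1, 0), (0, 0, 1), (0, 0, -1))]
        # Bias: step onto an occupied neighbor whenever one exists.
        occ = [n for n in nbrs if n in occupied]
        current = random.choice(occ if occ else nbrs)
        if current not in collected:
            collected.append(current)
    return collected
```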
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.