A Simple Framework for 3D Occupancy Estimation in Autonomous Driving
- URL: http://arxiv.org/abs/2303.10076v5
- Date: Fri, 17 Nov 2023 04:25:19 GMT
- Title: A Simple Framework for 3D Occupancy Estimation in Autonomous Driving
- Authors: Wanshui Gan, Ningkai Mo, Hongbin Xu, Naoto Yokoya
- Abstract summary: We present a CNN-based framework designed to reveal several key factors for 3D occupancy estimation.
We also explore the relationship between 3D occupancy estimation and other related tasks, such as monocular depth estimation and 3D reconstruction.
- Score: 16.605853706182696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of estimating 3D occupancy from surrounding-view images is an
exciting development in the field of autonomous driving, following the success
of Bird's Eye View (BEV) perception. This task provides crucial 3D attributes
of the driving environment, enhancing the overall understanding and perception
of the surrounding space. In this work, we present a simple framework for 3D
occupancy estimation, which is a CNN-based framework designed to reveal several
key factors for 3D occupancy estimation, such as network design, optimization,
and evaluation. In addition, we explore the relationship between 3D occupancy
estimation and other related tasks, such as monocular depth estimation and 3D
reconstruction, which could advance the study of 3D perception in autonomous
driving. For evaluation, we propose a simple sampling strategy to define the
metric for occupancy evaluation, which is flexible for current public datasets.
Moreover, we establish the benchmark in terms of the depth estimation metric,
where we compare our proposed method with monocular depth estimation methods on
the DDAD and nuScenes datasets and achieve competitive performance. The
relevant code will be updated at https://github.com/GANWANSHUI/SimpleOccupancy.
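The abstract does not spell out which depth estimation metrics the benchmark uses. As a rough illustration only, the sketch below computes the error measures commonly reported in the monocular depth literature (absolute relative error, squared relative error, RMSE, and the delta < 1.25 threshold accuracy). The function name `depth_metrics` and the `min_depth` / `max_depth` limits are assumptions made for this example; they are not taken from the paper or its repository.

```python
import math
from typing import Dict, Sequence


def depth_metrics(pred: Sequence[float], gt: Sequence[float],
                  min_depth: float = 1e-3,
                  max_depth: float = 80.0) -> Dict[str, float]:
    """Common monocular depth metrics over pixels with valid ground truth.

    min_depth and max_depth are hypothetical evaluation limits chosen for
    this sketch, not values taken from the paper.
    """
    # Keep pixels whose ground truth lies in the valid range, and clamp the
    # prediction into the same range so the ratio test below is well defined.
    pairs = [
        (min(max(p, min_depth), max_depth), g)
        for p, g in zip(pred, gt)
        if min_depth < g <= max_depth
    ]
    if not pairs:
        raise ValueError("no valid depth pairs to evaluate")
    n = len(pairs)

    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    sq_rel = sum((p - g) ** 2 / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Threshold accuracy: fraction of pixels with max(p/g, g/p) below 1.25.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n

    return {"abs_rel": abs_rel, "sq_rel": sq_rel, "rmse": rmse, "delta1": delta1}


# Toy example: three pixels with predicted and ground-truth depth in metres.
metrics = depth_metrics(pred=[10.0, 22.0, 30.0], gt=[10.0, 20.0, 40.0])
```

In an occupancy setting, the predicted depths would first have to be obtained from the predicted volume, for example by rendering along camera rays; that step is outside the scope of this sketch.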
Related papers
- Improving 3D Occupancy Prediction through Class-balancing Loss and Multi-scale Representation [7.651064601670273]
3D environment recognition is essential for autonomous driving systems.
Bird's-Eye-View (BEV)-based perception has achieved SOTA performance for this task.
We introduce a novel UNet-like Multi-scale Occupancy Head module to relieve this issue.
arXiv Detail & Related papers (2024-05-25T07:13:13Z)
- Vision-based 3D occupancy prediction in autonomous driving: a review and outlook [19.939380586314673]
We introduce the background of vision-based 3D occupancy prediction and discuss the challenges in this task.
We conduct a comprehensive survey of the progress in vision-based 3D occupancy prediction from three aspects.
We present a summary of prevailing research trends and propose some inspiring future outlooks.
arXiv Detail & Related papers (2024-05-04T07:39:25Z)
- RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z)
- OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose an OccNeRF method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
- Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning [93.71280187657831]
The camera-based bird's-eye-view (BEV) perception paradigm has made significant progress in the autonomous driving field.
We propose IA-BEV, which integrates image-plane instance awareness into the depth estimation process within a BEV-based detector.
arXiv Detail & Related papers (2023-12-13T09:24:42Z)
- SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP).
This indicates that the combination of 3D object detection and 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.
arXiv Detail & Related papers (2023-08-26T07:38:21Z)
- HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for Autonomous Driving [95.42203932627102]
3D human pose estimation is an emerging technology, which can enable the autonomous vehicle to perceive and understand the subtle and complex behaviors of pedestrians.
Our method efficiently makes use of these complementary signals in a semi-supervised fashion and outperforms existing methods by a large margin.
Specifically, we embed LiDAR points into pixel-aligned multi-modal features, which we pass through a sequence of Transformer refinement stages.
arXiv Detail & Related papers (2022-12-15T11:15:14Z)
- ONCE-3DLanes: Building Monocular 3D Lane Detection [41.46466150783367]
We present ONCE-3DLanes, a real-world autonomous driving dataset with lane layout annotation in 3D space.
By exploiting the explicit relationship between point clouds and image pixels, a dataset annotation pipeline is designed to automatically generate high-quality 3D lane locations.
arXiv Detail & Related papers (2022-04-30T16:35:25Z)
- From 2D to 3D: Re-thinking Benchmarking of Monocular Depth Prediction [80.67873933010783]
We argue that MDP is currently witnessing benchmark over-fitting and relying on metrics that are only partially helpful to gauge the usefulness of the predictions for 3D applications.
This limits the design and development of novel methods that are truly aware of - and improving towards estimating - the 3D structure of the scene rather than optimizing 2D-based distances.
We propose a set of metrics well suited to evaluate the 3D geometry of MDP approaches and a novel indoor benchmark, RIO-D3D, crucial for the proposed evaluation methodology.
arXiv Detail & Related papers (2022-03-15T17:50:54Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.