Related papers: Out-of-Distribution Semantic Occupancy Prediction

Out-of-Distribution Semantic Occupancy Prediction

URL: http://arxiv.org/abs/2506.21185v1
Date: Thu, 26 Jun 2025 12:44:29 GMT
Title: Out-of-Distribution Semantic Occupancy Prediction
Authors: Yuheng Zhang, Mengfei Duan, Kunyu Peng, Yuhang Wang, Ruiping Liu, Fei Teng, Kai Luo, Zhiyong Li, Kailun Yang,
Abstract summary: We introduce Out-of-Distribution Semantic Occupancy Prediction, targeting OoD detection in 3D voxel space.<n>We introduce OccOoD, a novel framework integrating OoD detection into 3D semantic occupancy prediction, with Voxel-BEV Progressive Fusion (VBPF)<n> Experimental results demonstrate that OccOoD achieves state-of-the-art OoD detection with an AuROC of 67.34% and an AuPRCr of 29.21% within a 1.2m region, while maintaining competitive occupancy prediction performance.
Score: 26.178096447889605
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: 3D Semantic Occupancy Prediction is crucial for autonomous driving, providing a dense, semantically rich environmental representation. However, existing methods focus on in-distribution scenes, making them susceptible to Out-of-Distribution (OoD) objects and long-tail distributions, which increases the risk of undetected anomalies and misinterpretations, posing safety hazards. To address these challenges, we introduce Out-of-Distribution Semantic Occupancy Prediction, targeting OoD detection in 3D voxel space. To fill the gaps in the dataset, we propose a Synthetic Anomaly Integration Pipeline that injects synthetic anomalies while preserving realistic spatial and occlusion patterns, enabling the creation of two datasets: VAA-KITTI and VAA-KITTI-360. We introduce OccOoD, a novel framework integrating OoD detection into 3D semantic occupancy prediction, with Voxel-BEV Progressive Fusion (VBPF) leveraging an RWKV-based branch to enhance OoD detection via geometry-semantic fusion. Experimental results demonstrate that OccOoD achieves state-of-the-art OoD detection with an AuROC of 67.34% and an AuPRCr of 29.21% within a 1.2m region, while maintaining competitive occupancy prediction performance. The established datasets and source code will be made publicly available at https://github.com/7uHeng/OccOoD.

Related papers

Stereo-based 3D Anomaly Object Detection for Autonomous Driving: A New Dataset and Baseline [10.933880236559604]
3D detection technology is widely used in the field of autonomous driving.<n>For rare anomaly categories that appear on the road, 3D detection models often misdetect or fail to detect anomalies.<n>This paper proposes a Stereo-based 3D Anomaly object Detection (S3AD) algorithm.
arXiv Detail & Related papers (2025-07-12T09:10:29Z)
An Experimental Study on Joint Modeling for Sound Event Localization and Detection with Source Distance Estimation [3.2637535969755858]
The 3D SELD task addresses the limitation by integrating source distance estimation.<n>We propose three approaches to tackle this challenge: a novel method with independent training and joint prediction.<n>Our proposed method ranked first in the DCASE 2024 Challenge Task 3, demonstrating the effectiveness of joint modeling.
arXiv Detail & Related papers (2025-01-18T12:57:21Z)
EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding [63.99937807085461]
3D occupancy prediction provides a comprehensive description of the surrounding scenes.<n>Most existing methods focus on offline perception from one or a few views.<n>We formulate an embodied 3D occupancy prediction task to target this practical scenario and propose a Gaussian-based EmbodiedOcc framework to accomplish it.
arXiv Detail & Related papers (2024-12-05T17:57:09Z)
ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Prediction [89.89610257714006]
Existing methods prioritize higher accuracy to cater to the demands of these tasks. We introduce a series of targeted improvements for 3D semantic occupancy prediction and flow estimation. Our purelytemporalal architecture framework, named ALOcc, achieves an optimal tradeoff between speed and accuracy.
arXiv Detail & Related papers (2024-11-12T11:32:56Z)
FuzzRisk: Online Collision Risk Estimation for Autonomous Vehicles based on Depth-Aware Object Detection via Fuzzy Inference [6.856508678236828]
The framework takes two sets of predictions from different algorithms and associates their inconsistencies with the collision risk via fuzzy inference.<n>We experimentally validate that, based on Intersection-over-Union (IoU) and a depth discrepancy measure, the inconsistencies between the two sets of predictions strongly correlate to the error of the 3D object detector.<n>We optimize the fuzzy inference system towards an existing offline metric that matches AV collision rates well.
arXiv Detail & Related papers (2024-11-09T20:20:36Z)
Syn-to-Real Unsupervised Domain Adaptation for Indoor 3D Object Detection [50.448520056844885]
We propose a novel framework for syn-to-real unsupervised domain adaptation in indoor 3D object detection. Our adaptation results from synthetic dataset 3D-FRONT to real-world datasets ScanNetV2 and SUN RGB-D demonstrate remarkable mAP25 improvements of 9.7% and 9.1% over Source-Only baselines.
arXiv Detail & Related papers (2024-06-17T08:18:41Z)
Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed as Homography Loss, is proposed to achieve the goal, which exploits both 2D and 3D information. Our method yields the best performance compared with the other state-of-the-arts by a large margin on KITTI 3D datasets.
arXiv Detail & Related papers (2022-04-02T03:48:03Z)
Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations. We derive suitable measures to quantify prediction uncertainty at both pose and joint level. We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation [11.180128679075716]
Range-Aware Attention Network (RAANet) is developed for 3D object detection from LiDAR data for autonomous driving. RAANet extracts more powerful BEV features and generates superior 3D object detections. Experiments on nuScenes dataset demonstrate that our proposed approach outperforms the state-of-the-art methods for LiDAR-based 3D object detection.
arXiv Detail & Related papers (2021-11-18T04:20:13Z)
SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance generalization of the network on unlabeled and new unseen data. Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.