3D Object Visibility Prediction in Autonomous Driving
- URL: http://arxiv.org/abs/2403.03681v1
- Date: Wed, 6 Mar 2024 13:07:42 GMT
- Title: 3D Object Visibility Prediction in Autonomous Driving
- Authors: Chuanyu Luo, Nuo Cheng, Ren Zhong, Haipeng Jiang, Wenyu Chen, Aoli Wang, Pu Li
- Abstract summary: We present a novel attribute and its corresponding algorithm: 3D object visibility.
Our proposal of this attribute and its computational strategy aims to expand the capabilities for downstream tasks.
- Score: 6.802572869909114
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: With the rapid advancement of hardware and software technologies, research in
autonomous driving has seen significant growth. The prevailing framework for
multi-sensor autonomous driving encompasses sensor installation, perception,
path planning, decision-making, and motion control. At the perception phase, a
common approach involves utilizing neural networks to infer 3D bounding box
(Bbox) attributes from raw sensor data, including classification, size, and
orientation. In this paper, we present a novel attribute and its corresponding
algorithm: 3D object visibility. Through multi-task learning, this attribute
is introduced with negligible impact on the model's effectiveness and
efficiency. Our proposal of this attribute and its
computational strategy aims to expand the capabilities for downstream tasks,
thereby enhancing the safety and reliability of real-time autonomous driving in
real-world scenarios.
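The abstract describes adding visibility as an extra output alongside the usual 3D bounding-box attributes via multi-task learning. The paper's actual architecture is not reproduced here; the following is only a minimal NumPy sketch of why one more task head carries negligible overhead, with all head names, dimensions, and the sigmoid visibility encoding being illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared backbone feature for one detected object.
FEAT_DIM = 256
features = rng.standard_normal(FEAT_DIM)

def linear_head(in_dim: int, out_dim: int):
    """Return (weights, bias) for a small task-specific linear head."""
    return rng.standard_normal((out_dim, in_dim)) * 0.01, np.zeros(out_dim)

# Standard bounding-box attribute heads (classification, size, orientation),
# plus the proposed visibility head as one more multi-task output.
heads = {
    "class_logits": linear_head(FEAT_DIM, 10),  # 10 object classes (assumed)
    "size": linear_head(FEAT_DIM, 3),           # length, width, height
    "orientation": linear_head(FEAT_DIM, 2),    # sin/cos yaw encoding
    "visibility": linear_head(FEAT_DIM, 1),     # scalar visible fraction
}

outputs = {}
for name, (w, b) in heads.items():
    y = w @ features + b
    if name == "visibility":
        y = 1.0 / (1.0 + np.exp(-y))  # squash to [0, 1]
    outputs[name] = y

# The visibility head adds only FEAT_DIM weights + 1 bias on top of the
# existing heads, which is why the runtime and parameter cost is negligible.
extra_params = FEAT_DIM * 1 + 1
print({k: v.shape for k, v in outputs.items()}, extra_params)
```

A downstream planner could then, for example, discount low-visibility detections when estimating collision risk; the sketch only illustrates the parameter-count argument, not the paper's training losses.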
Related papers
- ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving [44.174489160967056]
Offboard perception aims to automatically generate high-quality 3D labels for autonomous driving scenes.
We propose a novel Zero-shot Offboard Panoptic Perception (ZOPP) framework for autonomous driving scenes.
ZOPP integrates the powerful zero-shot recognition capabilities of vision foundation models and 3D representations derived from point clouds.
arXiv Detail & Related papers (2024-11-08T03:52:32Z) - A Comprehensive Review of 3D Object Detection in Autonomous Driving: Technological Advances and Future Directions [11.071271817366739]
3D object perception has become a crucial component in the development of autonomous driving systems.
This review extensively summarizes traditional 3D object detection methods, focusing on camera-based, LiDAR-based, and fusion detection techniques.
We discuss future directions, including methods to improve accuracy such as temporal perception, occupancy grids, and end-to-end learning frameworks.
arXiv Detail & Related papers (2024-08-28T01:08:33Z) - Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator [15.94714567272497]
We present a new real-time multi-task network adept at three vital autonomous driving tasks: monocular 3D object detection, semantic segmentation, and dense depth estimation.
To counter the challenge of negative transfer, which is the prevalent issue in multi-task learning, we introduce a task-adaptive attention generator.
Our rigorously optimized network, when tested on the Cityscapes-3D datasets, consistently outperforms various baseline models.
arXiv Detail & Related papers (2024-03-06T05:04:40Z) - End-to-end Autonomous Driving: Challenges and Frontiers [45.391430626264764]
We provide a comprehensive analysis of more than 270 papers, covering the motivation, roadmap, methodology, challenges, and future trends in end-to-end autonomous driving.
We delve into several critical challenges, including multi-modality, interpretability, causal confusion, robustness, and world models, amongst others.
We discuss current advancements in foundation models and visual pre-training, as well as how to incorporate these techniques within the end-to-end driving framework.
arXiv Detail & Related papers (2023-06-29T14:17:24Z) - Camera-Radar Perception for Autonomous Vehicles and ADAS: Concepts, Datasets and Metrics [77.34726150561087]
This work surveys the current state of camera- and radar-based perception for ADAS and autonomous vehicles.
Concepts and characteristics related to both sensors, as well as to their fusion, are presented.
We give an overview of the Deep Learning-based detection and segmentation tasks, and the main datasets, metrics, challenges, and open questions in vehicle perception.
arXiv Detail & Related papers (2023-03-08T00:48:32Z) - Visual Perception System for Autonomous Driving [9.659835301514288]
This work introduces a visual-based perception system for autonomous driving that integrates trajectory tracking and prediction of moving objects to prevent collisions.
The system leverages motion cues from pedestrians to monitor and forecast their movements and simultaneously maps the environment.
The performance, efficiency, and resilience of this approach are substantiated through comprehensive evaluations of both simulated and real-world datasets.
arXiv Detail & Related papers (2023-03-03T23:12:43Z) - HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for Autonomous Driving [95.42203932627102]
3D human pose estimation is an emerging technology, which can enable the autonomous vehicle to perceive and understand the subtle and complex behaviors of pedestrians.
Specifically, we embed LiDAR points into pixel-aligned multi-modal features, which we pass through a sequence of Transformer refinement stages.
Our method efficiently makes use of these complementary signals, in a semi-supervised fashion and outperforms existing methods with a large margin.
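The HUM3DIL summary mentions "pixel-aligned multi-modal features": each LiDAR point is projected into the image so it can pick up the 2D feature at its pixel. This is not the paper's exact pipeline; the following is a rough NumPy sketch of the general idea under an assumed pinhole camera model, with all intrinsics, shapes, and example points being illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative pinhole intrinsics for a 640x480 camera (fx, fy, cx, cy).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

# A dense image feature map (H x W x C), e.g. from a 2D CNN backbone.
H, W, C = 480, 640, 32
image_feats = rng.standard_normal((H, W, C))

# LiDAR points already transformed into the camera frame (N x 3), z > 0.
points = np.array([[ 1.0, 0.5,  5.0],
                   [-2.0, 0.2, 10.0],
                   [ 0.0, 0.0,  2.0]])

def pixel_aligned_features(points, K, image_feats):
    """Project 3D points into the image and gather per-pixel features."""
    uvw = (K @ points.T).T            # (N, 3) homogeneous pixel coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]     # perspective divide
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, image_feats.shape[1] - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, image_feats.shape[0] - 1)
    img = image_feats[v, u]           # (N, C) sampled image features
    # Concatenate geometry with appearance -> multi-modal point features.
    return np.concatenate([points, img], axis=1)

fused = pixel_aligned_features(points, K, image_feats)
print(fused.shape)  # (3, 35): 3 points, 3 coordinates + 32 image channels
```

In a real system these fused per-point features would then feed the Transformer refinement stages the summary mentions; bilinear sampling is usually preferred over the nearest-pixel rounding used here for simplicity.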
arXiv Detail & Related papers (2022-12-15T11:15:14Z) - 3D Object Detection for Autonomous Driving: A Comprehensive Survey [48.30753402458884]
3D object detection, which intelligently predicts the locations, sizes, and categories of the critical 3D objects near an autonomous vehicle, is an important part of a perception system.
This paper reviews the advances in 3D object detection for autonomous driving.
arXiv Detail & Related papers (2022-06-19T19:43:11Z) - Improving Robustness of Learning-based Autonomous Steering Using Adversarial Images [58.287120077778205]
We introduce a framework for analyzing the robustness of the learning algorithm with respect to varying image input quality for autonomous driving.
Using the results of the sensitivity analysis, we propose an algorithm to improve the overall performance of the task of "learning to steer".
arXiv Detail & Related papers (2021-02-26T02:08:07Z) - Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device [53.323878851563414]
We propose a compiler-aware unified framework incorporating network enhancement and pruning search with the reinforcement learning techniques.
Specifically, a generator Recurrent Neural Network (RNN) is employed to provide the unified scheme for both network enhancement and pruning search automatically.
The proposed framework achieves real-time 3D object detection on mobile devices with competitive detection performance.
arXiv Detail & Related papers (2020-12-26T19:41:15Z) - Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.