DB3D-L: Depth-aware BEV Feature Transformation for Accurate 3D Lane Detection
- URL: http://arxiv.org/abs/2505.13266v1
- Date: Mon, 19 May 2025 15:47:20 GMT
- Title: DB3D-L: Depth-aware BEV Feature Transformation for Accurate 3D Lane Detection
- Authors: Yehao Liu, Xiaosu Xu, Zijian Wang, Yiqing Yao
- Abstract summary: 3D lane detection plays an important role in autonomous driving. Recent advances primarily build Bird's-Eye-View (BEV) features from front-view (FV) images to perceive 3D lane information more effectively. However, constructing accurate BEV information from FV images is limited by the lack of depth information.
- Score: 3.7115000857388494
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D lane detection plays an important role in autonomous driving. Recent advances primarily build Bird's-Eye-View (BEV) features from front-view (FV) images to perceive 3D lane information more effectively. However, constructing accurate BEV information from FV images is limited by the lack of depth information, so previous works often rely heavily on the assumption of a flat ground plane. Leveraging monocular depth estimation to assist in constructing BEV features is less constrained, but existing methods struggle to effectively integrate the two tasks. To address this issue, this paper proposes an accurate 3D lane detection method based on depth-aware BEV feature transformation. In detail, an effective feature extraction module is designed, in which a Depth Net is integrated to obtain the depth information vital for 3D perception, thereby simplifying the complexity of view transformation. Subsequently, a feature reduction module is proposed to reduce the height dimension of FV features and depth features, enabling effective fusion of the crucial FV and depth features. A fusion module is then designed to build the BEV feature from the prime FV feature and depth information. The proposed method performs comparably with state-of-the-art methods on both the synthetic Apollo and realistic OpenLane datasets.
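The abstract's pipeline (a Depth Net predicting per-pixel depth, a reduction module collapsing the height dimension of the FV and depth features, and a fusion module scattering features along depth to form the BEV) can be sketched roughly as follows. The shapes, the mean-pooling reduction, and the per-column outer-product fusion are illustrative assumptions in the spirit of lift-splat-style methods, not the paper's exact modules:

```python
import numpy as np

def depth_aware_bev(fv_feat: np.ndarray, depth_logits: np.ndarray) -> np.ndarray:
    """Sketch of a depth-aware FV-to-BEV transformation.

    fv_feat:      (C, H, W) front-view feature map
    depth_logits: (D, H, W) per-pixel depth-bin logits (a stand-in
                  for the paper's Depth Net output)
    returns:      (C, D, W) BEV-like feature (channels x depth x width)
    """
    # Softmax over depth bins -> a depth distribution per pixel.
    e = np.exp(depth_logits - depth_logits.max(axis=0, keepdims=True))
    depth_prob = e / e.sum(axis=0, keepdims=True)

    # "Feature reduction": collapse the height dimension of both tensors
    # (mean pooling here; the paper's reduction module is not specified).
    fv_reduced = fv_feat.mean(axis=1)        # (C, W)
    depth_reduced = depth_prob.mean(axis=1)  # (D, W)

    # Fusion: scatter each image column's features along its depth
    # distribution via a per-column outer product.
    bev = np.einsum('cw,dw->cdw', fv_reduced, depth_reduced)
    return bev
```

With uniform depth logits the feature mass spreads evenly over the D depth bins, which makes the reduction-then-fusion order easy to sanity-check.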
Related papers
- Depth3DLane: Monocular 3D Lane Detection via Depth Prior Distillation [5.909083729156255]
We introduce a BEV-based framework to address these limitations and improve 3D lane detection accuracy. We leverage Depth Prior Distillation to transfer semantic depth knowledge from a teacher model. Our method achieves state-of-the-art performance in terms of z-axis error.
arXiv Detail & Related papers (2025-04-25T13:08:41Z)
- ROA-BEV: 2D Region-Oriented Attention for BEV-based 3D Object Detection [10.11767889540451]
We propose a BEV-based 3D Object Detection Network with 2D Region-Oriented Attention (ROA-BEV). Our method further enhances the feature learning ability of ROA through multi-scale structures. Experiments on nuScenes show that ROA-BEV improves performance over the BEVDepth baseline.
arXiv Detail & Related papers (2024-10-14T08:51:56Z)
- OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection [102.0744303467713]
We propose a new multi-view 3D object detector named OPEN.
Our main idea is to effectively inject object-wise depth information into the network through our proposed object-wise position embedding.
OPEN achieves a new state-of-the-art performance with 64.4% NDS and 56.7% mAP on the nuScenes test benchmark.
arXiv Detail & Related papers (2024-07-15T14:29:15Z)
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
- Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning [93.71280187657831]
The camera-based bird's-eye-view (BEV) perception paradigm has made significant progress in the autonomous driving field.
We propose IA-BEV, which integrates image-plane instance awareness into the depth estimation process within a BEV-based detector.
arXiv Detail & Related papers (2023-12-13T09:24:42Z)
- CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity [34.025530326420146]
We develop Complementary-BEV, a novel end-to-end monocular 3D object detection framework.
We conduct extensive experiments on the public 3D detection benchmarks of roadside camera-based DAIR-V2X-I and Rope3D.
For the first time, the vehicle AP score of a camera model reaches 80% on DAIR-V2X-I in terms of easy mode.
arXiv Detail & Related papers (2023-10-04T13:38:53Z)
- BEV-IO: Enhancing Bird's-Eye-View 3D Detection with Instance Occupancy [58.92659367605442]
We present BEV-IO, a new 3D detection paradigm to enhance BEV representation with instance occupancy information.
We show that BEV-IO can outperform state-of-the-art methods while only adding a negligible increase in parameters and computational overhead.
arXiv Detail & Related papers (2023-05-26T11:16:12Z)
- EA-LSS: Edge-aware Lift-splat-shot Framework for 3D BEV Object Detection [9.289537252177048]
We propose a novel Edge-aware Lift-splat-shot (EA-LSS) framework for 3D object detection.
Our EA-LSS framework is compatible with any LSS-based 3D object detection model.
arXiv Detail & Related papers (2023-03-31T08:56:29Z)
- OA-BEV: Bringing Object Awareness to Bird's-Eye-View Representation for Multi-Camera 3D Object Detection [78.38062015443195]
OA-BEV is a network that can be plugged into the BEV-based 3D object detection framework.
Our method achieves consistent improvements over the BEV-based baselines in terms of both average precision and nuScenes detection score.
arXiv Detail & Related papers (2023-01-13T06:02:31Z)
- Rethinking Dimensionality Reduction in Grid-based 3D Object Detection [24.249147412551768]
We propose a novel point cloud detection network based on a Multi-level feature dimensionality reduction strategy, called MDRNet.
In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during voxel-to-BEV feature transformation.
Experiments on nuScenes show that the proposed method outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2022-09-20T04:51:54Z)
- M$^2$-3DLaneNet: Exploring Multi-Modal 3D Lane Detection [30.250833348463633]
M$^2$-3DLaneNet lifts 2D features into 3D space by incorporating geometry information from LiDAR data through depth completion.
Experiments on the large-scale OpenLane dataset demonstrate the effectiveness of M$^2$-3DLaneNet, regardless of the range.
arXiv Detail & Related papers (2022-09-13T13:45:18Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.