PersFormer: 3D Lane Detection via Perspective Transformer and the
OpenLane Benchmark
- URL: http://arxiv.org/abs/2203.11089v1
- Date: Mon, 21 Mar 2022 16:12:53 GMT
- Title: PersFormer: 3D Lane Detection via Perspective Transformer and the
OpenLane Benchmark
- Authors: Li Chen, Chonghao Sima, Yang Li, Zehan Zheng, Jiajie Xu, Xiangwei
Geng, Hongyang Li, Conghui He, Jianping Shi, Yu Qiao, Junchi Yan
- Abstract summary: PersFormer is an end-to-end monocular 3D lane detector with a novel Transformer-based spatial feature transformation module.
We release one of the first large-scale real-world 3D lane datasets, called OpenLane, with high-quality annotation and scenario diversity.
- Score: 109.03773439461615
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Methods for 3D lane detection have been recently proposed to address the
issue of inaccurate lane layouts in many autonomous driving scenarios
(uphill/downhill, bump, etc.). Previous works struggled in complex cases due to their simple designs of the spatial transformation between the front view and bird's eye view (BEV) and the lack of a realistic dataset. To address these issues, we present PersFormer: an end-to-end monocular 3D lane detector with a
novel Transformer-based spatial feature transformation module. Our model
generates BEV features by attending to related front-view local regions with
camera parameters as a reference. PersFormer adopts a unified 2D/3D anchor
design and an auxiliary task to detect 2D/3D lanes simultaneously, enhancing
the feature consistency and sharing the benefits of multi-task learning.
Moreover, we release one of the first large-scale real-world 3D lane datasets,
which is called OpenLane, with high-quality annotation and scenario diversity.
OpenLane contains 200,000 frames, over 880,000 instance-level lanes, and 14 lane categories, along with scene tags and closest-in-path object annotations, to encourage the development of lane detection and other industry-relevant autonomous driving methods. We show that PersFormer significantly outperforms
competitive baselines in the 3D lane detection task on our new OpenLane dataset
as well as the Apollo 3D Lane Synthetic dataset, and is also on par with
state-of-the-art algorithms in the 2D task on OpenLane. The project page is
available at https://github.com/OpenPerceptionX/OpenLane.
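As context for the spatial feature transformation described above, the sketch below illustrates in PyTorch one way BEV queries can attend to local front-view regions around camera-projected reference points. It is a minimal illustration of the idea, not the authors' implementation: the module name, tensor shapes, the single-layer local cross-attention, and the grid_sample-based neighborhood sampling are all assumptions made for this example.

```python
# Minimal sketch (not the PersFormer code) of perspective-guided cross-attention:
# each BEV query attends to front-view features sampled around a reference point
# obtained by projecting its BEV location through the camera.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PerspectiveCrossAttention(nn.Module):
    def __init__(self, dim=64, num_heads=4, neighborhood=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.neighborhood = neighborhood  # local window sampled around each reference point

    def forward(self, bev_queries, fv_feat, ref_uv):
        """
        bev_queries: (B, Nq, C)   one query per BEV grid cell
        fv_feat:     (B, C, H, W) front-view feature map
        ref_uv:      (B, Nq, 2)   reference points in normalized image coords [-1, 1],
                                  e.g. from projecting BEV (x, y, z=0) with camera params
        """
        B, Nq, C = bev_queries.shape
        k = self.neighborhood
        # Small grid of sampling offsets around each reference point.
        offsets = torch.stack(torch.meshgrid(
            torch.linspace(-0.05, 0.05, k),
            torch.linspace(-0.05, 0.05, k), indexing="ij"), dim=-1).reshape(-1, 2)
        sample_uv = ref_uv[:, :, None, :] + offsets.to(ref_uv)         # (B, Nq, k*k, 2)
        # Bilinearly sample front-view features at those locations.
        sampled = F.grid_sample(
            fv_feat, sample_uv, align_corners=False)                   # (B, C, Nq, k*k)
        sampled = sampled.permute(0, 2, 3, 1).reshape(B * Nq, k * k, C)
        q = bev_queries.reshape(B * Nq, 1, C)
        out, _ = self.attn(q, sampled, sampled)                        # local cross-attention
        return out.reshape(B, Nq, C)


# Example (shapes only): a 32x16 BEV grid attending to a downsampled front-view map.
# bev_q = torch.randn(1, 32 * 16, 64)
# fv = torch.randn(1, 64, 45, 80)
# uv = torch.rand(1, 32 * 16, 2) * 2 - 1   # would normally come from camera projection
# out = PerspectiveCrossAttention()(bev_q, fv, uv)   # -> (1, 512, 64)
```

In PersFormer's setting, the reference points would come from projecting BEV grid locations onto the image plane with the camera intrinsics and extrinsics (an IPM-style mapping); here they are simply passed in as normalized coordinates.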
Related papers
- DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation [40.71071200694655]
We present DV-3DLane, a novel end-to-end Dual-View multi-modal 3D Lane detection framework.
It synergizes the strengths of both images and LiDAR points.
It achieves state-of-the-art performance, with a remarkable 11.2-point gain in F1 score and a substantial 53.5% reduction in errors.
arXiv Detail & Related papers (2024-06-23T10:48:42Z)
- Enhancing 3D Lane Detection and Topology Reasoning with 2D Lane Priors [40.92232275558338]
3D lane detection and topology reasoning are essential tasks in autonomous driving scenarios.
We propose Topo2D, a novel Transformer-based framework that leverages 2D lane instances to initialize 3D queries and 3D positional embeddings.
Topo2D achieves 44.5% OLS on the multi-view topology reasoning benchmark OpenLane-V2 and a 62.6% F1 score on the single-view 3D lane detection benchmark OpenLane.
arXiv Detail & Related papers (2024-06-05T09:48:56Z)
- Decoupling the Curve Modeling and Pavement Regression for Lane Detection [67.22629246312283]
Curve-based lane representation is a popular approach in many lane detection methods.
We propose a new approach to the lane detection task by decomposing it into two parts: curve modeling and ground height regression.
arXiv Detail & Related papers (2023-09-19T11:24:14Z)
- LATR: 3D Lane Detection from Monocular Images with Transformer [42.34193673590758]
3D lane detection from monocular images is a fundamental yet challenging task in autonomous driving.
Recent advances rely on structural 3D surrogates built from front-view image features and camera parameters.
We present LATR, a novel end-to-end 3D lane detector that uses 3D-aware front-view features without a transformed view representation.
arXiv Detail & Related papers (2023-08-08T21:08:42Z)
- An Efficient Transformer for Simultaneous Learning of BEV and Lane Representations in 3D Lane Detection [55.281369497158515]
We propose an efficient transformer for 3D lane detection.
Different from the vanilla transformer, our model contains a cross-attention mechanism to simultaneously learn lane and BEV representations.
Our method obtains 2D and 3D lane predictions by applying the lane features to the image-view and BEV features, respectively.
arXiv Detail & Related papers (2023-06-08T04:18:31Z)
- Fully Sparse Fusion for 3D Object Detection [69.32694845027927]
Currently prevalent multimodal 3D detection methods are built upon LiDAR-based detectors that usually use dense Bird's-Eye-View feature maps.
Fully sparse architectures are gaining attention as they are highly efficient in long-range perception.
In this paper, we study how to effectively leverage image modality in the emerging fully sparse architecture.
arXiv Detail & Related papers (2023-04-24T17:57:43Z)
- SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds [44.635939022626744]
3D object detection in point clouds is a core component for modern robotics and autonomous driving systems.
A key challenge in 3D object detection comes from the inherently sparse nature of point occupancy within the 3D scene.
We propose Sparse Window Transformer (SWFormer), a scalable and accurate model for 3D object detection.
arXiv Detail & Related papers (2022-10-13T21:37:53Z)
- ONCE-3DLanes: Building Monocular 3D Lane Detection [41.46466150783367]
We present ONCE-3DLanes, a real-world autonomous driving dataset with lane layout annotation in 3D space.
By exploiting the explicit relationship between point clouds and image pixels, a dataset annotation pipeline is designed to automatically generate high-quality 3D lane locations.
arXiv Detail & Related papers (2022-04-30T16:35:25Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)