RoadPainter: Points Are Ideal Navigators for Topology transformER
- URL: http://arxiv.org/abs/2407.15349v1
- Date: Mon, 22 Jul 2024 03:23:35 GMT
- Title: RoadPainter: Points Are Ideal Navigators for Topology transformER
- Authors: Zhongxing Ma, Shuang Liang, Yongkun Wen, Weixin Lu, Guowei Wan,
- Abstract summary: Topology reasoning aims to provide a precise understanding of road scenes, enabling autonomous systems to identify safe and efficient routes.
We present RoadPainter, an innovative approach for detecting lane centerlines and reasoning about their topology using multi-view images.
- Score: 10.179711440042123
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Topology reasoning aims to provide a precise understanding of road scenes, enabling autonomous systems to identify safe and efficient routes. In this paper, we present RoadPainter, an innovative approach for detecting lane centerlines and reasoning about their topology using multi-view images. The core concept behind RoadPainter is to extract a set of points from each centerline mask to improve the accuracy of centerline prediction. We start by implementing a transformer decoder that integrates a hybrid attention mechanism and a real-virtual separation strategy to predict coarse lane centerlines and establish topological associations. Then, we generate centerline instance masks guided by the centerline points from the transformer decoder. Moreover, we derive an additional set of points from each mask and combine them with previously detected centerline points for further refinement. Additionally, we introduce an optional module that incorporates a Standard Definition (SD) map to further optimize centerline detection and enhance topological reasoning performance. Experimental evaluations on the OpenLane-V2 dataset demonstrate the state-of-the-art performance of RoadPainter.
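The refinement step in the abstract (derive points from each centerline mask, then combine them with the points already predicted by the decoder) can be illustrated with a small geometric sketch. This is not the paper's method: the abstract does not say how points are sampled from a mask or how the two point sets are fused, so the nearest-pixel sampling, the fixed blending weight, and the function names `points_from_mask` and `refine` are all assumptions made for illustration.

```python
import numpy as np


def points_from_mask(mask: np.ndarray, coarse_points: np.ndarray) -> np.ndarray:
    """For each coarse point, return the nearest foreground pixel of the mask.

    mask: (H, W) boolean array, True on one predicted centerline instance.
    coarse_points: (N, 2) float array of (row, col) coordinates.
    Hypothetical stand-in for the mask-to-points step named in the abstract.
    """
    foreground = np.argwhere(mask).astype(float)  # (M, 2) pixel coordinates
    if foreground.shape[0] == 0:
        # Empty mask: nothing to sample, fall back to the coarse points.
        return coarse_points.copy()
    # Pairwise squared distances between coarse points and mask pixels: (N, M).
    diff = coarse_points[:, None, :] - foreground[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    return foreground[dist2.argmin(axis=1)]


def refine(coarse_points: np.ndarray, mask_points: np.ndarray,
           weight: float = 0.5) -> np.ndarray:
    """Blend the two point sets; weight=1.0 trusts the mask points entirely.

    A fixed linear blend is an assumption; a learned fusion is equally plausible.
    """
    return (1.0 - weight) * coarse_points + weight * mask_points
```

Calling `refine(coarse, points_from_mask(mask, coarse))` on a boolean mask and an `(N, 2)` array of coarse points pulls each coarse point halfway toward the mask. The all-pairs distance matrix keeps the sketch short but costs O(N·M) memory, so it only suits small masks.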
Related papers
- Homography Guided Temporal Fusion for Road Line and Marking Segmentation [73.47092021519245]
Road lines and markings are frequently occluded in the presence of moving vehicles, shadow, and glare.
We propose a Homography Guided Fusion (HomoFusion) module to exploit temporally-adjacent video frames for complementary cues.
We show that exploiting available camera intrinsic data and a ground plane assumption for cross-frame correspondence can lead to a lightweight network with significantly improved speed and accuracy.
arXiv Detail & Related papers (2024-04-11T10:26:40Z)
- LanePtrNet: Revisiting Lane Detection as Point Voting and Grouping on Curves [8.037214110171123]
Lane detection plays a critical role in the field of autonomous driving.
We propose a novel approach, LanePtrNet, which treats lane detection as a process of point voting and grouping on ordered sets.
We conduct comprehensive experiments to validate the effectiveness of our proposed approach, demonstrating its superior performance.
arXiv Detail & Related papers (2024-03-08T08:45:42Z)
- LineMarkNet: Line Landmark Detection for Valet Parking [13.563702256927135]
We develop a deep network (LineMarkNet) to detect line landmarks from surround-view cameras.
We then employ the multi-task decoder to detect multiple line landmarks.
Experimental results show that our framework achieves better performance than several line detection methods.
arXiv Detail & Related papers (2023-09-19T09:43:29Z)
- TopoMask: Instance-Mask-Based Formulation for the Road Topology Problem via Transformer-Based Architecture [4.970364068620607]
We introduce TopoMask for predicting centerlines in road topology.
TopoMask ranked 4th in the OpenLane-V2 Score (OLS) and 2nd in the F1 score of centerline prediction in the OpenLane Topology Challenge 2023.
arXiv Detail & Related papers (2023-06-08T17:58:57Z)
- Semantic Segmentation of Radar Detections using Convolutions on Point Clouds [59.45414406974091]
We introduce a deep-learning based method to convolve radar detections into point clouds.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
arXiv Detail & Related papers (2023-05-22T07:09:35Z)
- PaRK-Detect: Towards Efficient Multi-Task Satellite Imagery Road Extraction via Patch-Wise Keypoints Detection [12.145321599949236]
We propose a new scheme for multi-task satellite imagery road extraction, Patch-wise Road Keypoints Detection (PaRK-Detect).
Our framework predicts the position of patch-wise road keypoints and the adjacent relationships between them to construct road graphs in a single pass.
We evaluate our approach against the existing state-of-the-art methods on DeepGlobe, Massachusetts Roads, and RoadTracer datasets and achieve competitive or better results.
arXiv Detail & Related papers (2023-02-26T08:26:26Z)
- Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- Laneformer: Object-aware Row-Column Transformers for Lane Detection [96.62919884511287]
Laneformer is a transformer-based architecture tailored for lane detection in autonomous driving.
Inspired by recent advances in transformer encoder-decoder architectures for various vision tasks, we design a new end-to-end Laneformer architecture.
arXiv Detail & Related papers (2022-03-18T10:14:35Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics, and the methods can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
- Detecting Lane and Road Markings at A Distance with Perspective Transformer Layers [5.033948921121557]
In existing approaches, detection accuracy often degrades with increasing distance.
This is because distant lane and road markings occupy only a small number of pixels in the image.
Inverse Perspective Mapping can be used to eliminate the perspective distortion, but it can itself introduce artifacts.
arXiv Detail & Related papers (2020-03-19T03:22:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.