RoadPainter: Points Are Ideal Navigators for Topology transformER
- URL: http://arxiv.org/abs/2407.15349v1
- Date: Mon, 22 Jul 2024 03:23:35 GMT
- Title: RoadPainter: Points Are Ideal Navigators for Topology transformER
- Authors: Zhongxing Ma, Shuang Liang, Yongkun Wen, Weixin Lu, Guowei Wan,
- Abstract summary: Topology reasoning aims to provide a precise understanding of road scenes, enabling autonomous systems to identify safe and efficient routes.
We present RoadPainter, an innovative approach for detecting and reasoning about the topology of lane centerlines using multi-view images.
- Score: 10.179711440042123
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Topology reasoning aims to provide a precise understanding of road scenes, enabling autonomous systems to identify safe and efficient routes. In this paper, we present RoadPainter, an innovative approach for detecting and reasoning about the topology of lane centerlines using multi-view images. The core concept behind RoadPainter is to extract a set of points from each centerline mask to improve the accuracy of centerline prediction. We start by implementing a transformer decoder that integrates a hybrid attention mechanism and a real-virtual separation strategy to predict coarse lane centerlines and establish topological associations. Then, we generate centerline instance masks guided by the centerline points from the transformer decoder. Moreover, we derive an additional set of points from each mask and combine them with previously detected centerline points for further refinement. Additionally, we introduce an optional module that incorporates a Standard Definition (SD) map to further optimize centerline detection and enhance topological reasoning performance. Experimental evaluations on the OpenLane-V2 dataset demonstrate the state-of-the-art performance of RoadPainter.
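The refinement idea in the abstract (derive points from a centerline mask, then combine them with the coarse points from the decoder) can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names `resample_polyline`, `points_from_mask`, and `refine`, the per-row centroid rule, and the fixed blending `weight` are all assumptions made for illustration. It also assumes a non-empty binary mask whose centerline runs roughly along the row axis, with coarse points ordered by increasing row.

```python
import numpy as np

def resample_polyline(points, num_points):
    """Resample a polyline to num_points spaced evenly by arc length."""
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.linspace(0.0, cum[-1], num_points)
    x = np.interp(targets, cum, points[:, 0])
    y = np.interp(targets, cum, points[:, 1])
    return np.stack([x, y], axis=1)

def points_from_mask(mask, num_points):
    """Hypothetical point extraction: one (x, y) point per occupied row,
    placed at the mean column of that row, then resampled."""
    mask = np.asarray(mask, dtype=bool)
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.array([np.flatnonzero(mask[r]).mean() for r in rows])
    polyline = np.stack([cols, rows.astype(float)], axis=1)
    return resample_polyline(polyline, num_points)

def refine(coarse_points, mask, weight=0.5):
    """Blend coarse points with mask-derived points (assumed fixed weight)."""
    coarse = np.asarray(coarse_points, dtype=float)
    mask_points = points_from_mask(mask, len(coarse))
    return weight * coarse + (1.0 - weight) * mask_points
```

For example, with a mask that is a vertical line at column 2 and coarse points lying one pixel to the right at column 3, `refine` with the default weight returns points at column 2.5. A learned fusion would replace the fixed average in a real model.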
Related papers
- TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior [70.84644266024571]
We propose to train a perception model to "see" standard definition maps (SDMaps).
We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information.
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology.
arXiv Detail & Related papers (2024-11-22T06:13:42Z)
- Neural Semantic Map-Learning for Autonomous Vehicles [85.8425492858912]
We present a mapping system that fuses local submaps gathered from a fleet of vehicles at a central instance to produce a coherent map of the road environment.
Our method jointly aligns and merges the noisy and incomplete local submaps using a scene-specific Neural Signed Distance Field.
We leverage memory-efficient sparse feature-grids to scale to large areas and introduce a confidence score to model uncertainty in scene reconstruction.
arXiv Detail & Related papers (2024-10-10T10:10:03Z)
- LMT-Net: Lane Model Transformer Network for Automated HD Mapping from Sparse Vehicle Observations [11.395749549636868]
Lane Model Transformer Network (LMT-Net) is an encoder-decoder neural network architecture that performs polyline encoding and predicts lane pairs and their connectivity.
We evaluate the performance of LMT-Net on an internal dataset that consists of multiple vehicle observations as well as human annotations as Ground Truth (GT).
arXiv Detail & Related papers (2024-09-19T02:14:35Z)
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers that are well trained on massive image datasets.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- LineMarkNet: Line Landmark Detection for Valet Parking [13.563702256927135]
We develop a deep network (LineMarkNet) to detect line landmarks from surround-view cameras.
We then employ the multi-task decoder to detect multiple line landmarks.
Experimental results show that our framework achieves better performance than several line detection methods.
arXiv Detail & Related papers (2023-09-19T09:43:29Z)
- TopoMask: Instance-Mask-Based Formulation for the Road Topology Problem via Transformer-Based Architecture [4.970364068620607]
We introduce TopoMask for predicting centerlines in road topology.
TopoMask has ranked 4th in the OpenLane-V2 Score (OLS) and 2nd in the F1 score of centerline prediction in OpenLane Topology Challenge 2023.
arXiv Detail & Related papers (2023-06-08T17:58:57Z)
- Semantic Segmentation of Radar Detections using Convolutions on Point Clouds [59.45414406974091]
We introduce a deep-learning based method to convolve radar detections into point clouds.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
arXiv Detail & Related papers (2023-05-22T07:09:35Z)
- PaRK-Detect: Towards Efficient Multi-Task Satellite Imagery Road Extraction via Patch-Wise Keypoints Detection [12.145321599949236]
We propose a new scheme for multi-task satellite imagery road extraction, Patch-wise Road Keypoints Detection (PaRK-Detect).
Our framework predicts the position of patch-wise road keypoints and the adjacent relationships between them to construct road graphs in a single pass.
We evaluate our approach against the existing state-of-the-art methods on DeepGlobe, Massachusetts Roads, and RoadTracer datasets and achieve competitive or better results.
arXiv Detail & Related papers (2023-02-26T08:26:26Z)
- Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
- Laneformer: Object-aware Row-Column Transformers for Lane Detection [96.62919884511287]
Laneformer is a transformer-based architecture tailored for lane detection in autonomous driving.
Inspired by recent advances of the transformer encoder-decoder architecture in various vision tasks, we move forward to design a new end-to-end Laneformer architecture.
arXiv Detail & Related papers (2022-03-18T10:14:35Z)
- Detecting Lane and Road Markings at A Distance with Perspective Transformer Layers [5.033948921121557]
In existing approaches, the detection accuracy often degrades with increasing distance.
This is because distant lane and road markings occupy only a small number of pixels in the image.
Inverse Perspective Mapping can be used to eliminate the perspective distortion, but the inherent interpolation can lead to artifacts.
arXiv Detail & Related papers (2020-03-19T03:22:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.