Focus on Local: Detecting Lane Marker from Bottom Up via Key Point
- URL: http://arxiv.org/abs/2105.13680v1
- Date: Fri, 28 May 2021 08:59:14 GMT
- Title: Focus on Local: Detecting Lane Marker from Bottom Up via Key Point
- Authors: Zhan Qu, Huan Jin, Yang Zhou, Zhen Yang, Wei Zhang
- Abstract summary: We propose a novel lane marker detection solution, FOLOLane, that focuses on modeling local patterns and achieving prediction of global structures.
Specifically, the CNN models low-complexity local patterns with two separate heads: the first predicts the existence of key points, and the second refines the location of key points in the local range and correlates key points of the same lane line.
- Score: 10.617793053931964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mainstream lane marker detection methods are implemented by predicting the
overall structure and deriving parametric curves through post-processing.
Complex lane line shapes require high-dimensional output of CNNs to model
global structures, which further increases the demand for model capacity and
training data. In contrast, the locality of a lane marker has finite geometric
variations and spatial coverage. We propose a novel lane marker detection
solution, FOLOLane, that focuses on modeling local patterns and achieving
prediction of global structures in a bottom-up manner. Specifically, the CNN
models low-complexity local patterns with two separate heads: the first
predicts the existence of key points, and the second refines the location of
key points in the local range and correlates key points of the same lane line.
The locality of the task is consistent with the limited field of view (FOV) of
CNN features, which in turn leads to more stable training and better
generalization. In addition to a greedy decoding algorithm, an
efficiency-oriented decoder is proposed, which achieves a 36% runtime gain at
the cost of negligible performance degradation. Both decoders integrate local
information into the global geometry of lane markers. Without a complex network
architecture design, the proposed method greatly outperforms existing methods
on public datasets, achieving state-of-the-art results and real-time processing
simultaneously.
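The bottom-up idea in the abstract (detect key points locally, then link them into lanes) can be illustrated with a small sketch. This is not the authors' decoder: the abstract does not specify the head outputs or the decoding rules, so the function name `decode_lanes`, the `heatmap` and `up_offset` inputs, the `row_step` and `threshold` parameters, and the toy data are all assumptions made for illustration. The sketch assumes one head yields a key point probability per pixel and the other yields, per key point, the horizontal shift to the next key point of the same lane on the row above.

```python
def decode_lanes(heatmap, up_offset, row_step=1, threshold=0.5):
    """Greedily link key points into lanes, starting from the bottom rows.

    heatmap[r][c]   : assumed key point probability at pixel (r, c)
    up_offset[r][c] : assumed horizontal shift from (r, c) to the key point
                      of the same lane located row_step rows above
    """
    height, width = len(heatmap), len(heatmap[0])
    visited = set()
    lanes = []
    # Scan from the bottom row upward so each lane starts at its lowest point.
    for row in range(height - 1, -1, -1):
        for col in range(width):
            if heatmap[row][col] < threshold or (row, col) in visited:
                continue
            lane = []
            r, c = row, col
            # Keep stepping upward while the predicted neighbour is a key point.
            while (0 <= r < height and 0 <= c < width
                   and heatmap[r][c] >= threshold
                   and (r, c) not in visited):
                visited.add((r, c))
                lane.append((r, c))
                c = c + int(round(up_offset[r][c]))  # follow the local link
                r = r - row_step
            lanes.append(lane)
    return lanes


# Toy 4x6 example with two hand-made lanes (hypothetical data).
HEIGHT, WIDTH = 4, 6
lane_a = [(3, 0), (2, 1), (1, 2), (0, 2)]
lane_b = [(3, 5), (2, 5), (1, 4), (0, 4)]
heatmap = [[0.0] * WIDTH for _ in range(HEIGHT)]
up_offset = [[0.0] * WIDTH for _ in range(HEIGHT)]
for points in (lane_a, lane_b):
    for (r, c), (_, c_next) in zip(points, points[1:]):
        up_offset[r][c] = float(c_next - c)
    for r, c in points:
        heatmap[r][c] = 0.9

lanes = decode_lanes(heatmap, up_offset)
print(lanes)
```

In this toy setup, each lane is recovered purely from per-point links, with no global curve model; a real decoder would also need sub-pixel refinement and handling of missed or spurious key points.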
Related papers
- Flexible 3D Lane Detection by Hierarchical Shape Matching [29.038755629481035]
3D lane detection is still an open problem due to varying visual conditions, complex typologies, and strict demands for precision.
In this paper, an end-to-end flexible and hierarchical lane detector is proposed to precisely predict 3D lane lines from point clouds.
arXiv Detail & Related papers (2024-08-13T19:04:23Z)
- Double-Shot 3D Shape Measurement with a Dual-Branch Network [14.749887303860717]
We propose a dual-branch Convolutional Neural Network (CNN)-Transformer network (PDCNet) to process different structured light (SL) modalities.
Within PDCNet, a Transformer branch is used to capture global perception in the fringe images, while a CNN branch is designed to collect local details in the speckle images.
We show that our method can reduce fringe order ambiguity while producing high-accuracy results on a self-made dataset.
arXiv Detail & Related papers (2024-07-19T10:49:26Z)
- Mesh Denoising Transformer [104.5404564075393]
Mesh denoising is aimed at removing noise from input meshes while preserving their feature structures.
SurfaceFormer is a pioneering Transformer-based mesh denoising framework.
A new representation, the Local Surface Descriptor, captures local geometric intricacies.
A Denoising Transformer module receives the multimodal information and achieves efficient global feature aggregation.
arXiv Detail & Related papers (2024-05-10T15:27:43Z)
- 3D Lane Detection from Front or Surround-View using Joint-Modeling & Matching [27.588395086563978]
We propose a joint modeling approach that combines Bezier curves and methods.
We also introduce a novel 3D Spatial, representing an exploration of 3D surround-view lane detection research.
This innovative method establishes a new benchmark in front-view 3D lane detection on the Openlane dataset.
arXiv Detail & Related papers (2024-01-16T01:12:24Z)
- DETR Doesn't Need Multi-Scale or Locality Design [69.56292005230185]
This paper presents an improved DETR detector that maintains a "plain" nature.
It uses a single-scale feature map and global cross-attention calculations without specific locality constraints.
We show that two simple technologies are surprisingly effective within a plain design to compensate for the lack of multi-scale feature maps and locality constraints.
arXiv Detail & Related papers (2023-08-03T17:59:04Z)
- Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation [53.04781510348416]
Video-based 3D human pose and shape estimations are evaluated by intra-frame accuracy and inter-frame smoothness.
We propose to structurally decouple the modeling of long-term and short-term correlations in an end-to-end framework, Global-to-Local Transformer (GLoT).
Our GLoT surpasses previous state-of-the-art methods with the fewest model parameters on popular benchmarks, i.e., 3DPW, MPI-INF-3DHP, and Human3.6M.
arXiv Detail & Related papers (2023-03-26T14:57:49Z)
- Centralized Feature Pyramid for Object Detection [53.501796194901964]
Visual feature pyramid has shown its superiority in both effectiveness and efficiency in a wide range of applications.
In this paper, we propose a Centralized Feature Pyramid for object detection, which is based on a globally explicit centralized feature regulation.
arXiv Detail & Related papers (2022-10-05T08:32:54Z)
- Cross-modal Local Shortest Path and Global Enhancement for Visible-Thermal Person Re-Identification [2.294635424666456]
We propose the Cross-modal Local Shortest Path and Global Enhancement (CM-LSP-GE) modules, a two-stream network based on joint learning of local and global features.
Experimental results on two typical datasets show that our model clearly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T10:27:22Z)
- TC-Net: Triple Context Network for Automated Stroke Lesion Segmentation [0.5482532589225552]
We propose a new network, Triple Context Network (TC-Net), with the capture of spatial contextual information as the core.
Our network is evaluated on the open dataset ATLAS, achieving the highest score of 0.594, Hausdorff distance of 27.005 mm, and average symmetry surface distance of 7.137 mm.
arXiv Detail & Related papers (2022-02-28T11:12:16Z)
- PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data on different devices without moving it to the cloud, algorithms such as Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.