Focus on Local: Detecting Lane Marker from Bottom Up via Key Point
- URL: http://arxiv.org/abs/2105.13680v1
- Date: Fri, 28 May 2021 08:59:14 GMT
- Title: Focus on Local: Detecting Lane Marker from Bottom Up via Key Point
- Authors: Zhan Qu, Huan Jin, Yang Zhou, Zhen Yang, Wei Zhang
- Abstract summary: We propose a novel lane marker detection solution, FOLOLane, that focuses on modeling local patterns and achieving prediction of global structures.
Specifically, the CNN models low-complexity local patterns with two separate heads: the first predicts the existence of key points, and the second refines the location of key points in the local range and correlates key points of the same lane line.
- Score: 10.617793053931964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mainstream lane marker detection methods are implemented by predicting the
overall structure and deriving parametric curves through post-processing.
Complex lane line shapes require high-dimensional output of CNNs to model
global structures, which further increases the demand for model capacity and
training data. In contrast, the locality of a lane marker has finite geometric
variations and spatial coverage. We propose a novel lane marker detection
solution, FOLOLane, that focuses on modeling local patterns and achieving
prediction of global structures in a bottom-up manner. Specifically, the CNN
models low-complexity local patterns with two separate heads: the first
predicts the existence of key points, and the second refines the location of
key points in the local range and correlates key points of the same lane line.
The locality of the task is consistent with the limited FOV of the feature in
CNN, which in turn leads to more stable training and better generalization. In
addition, an efficiency-oriented decoding algorithm is proposed alongside a
greedy one, achieving 36% runtime gains at the cost of negligible
performance degradation. Both decoders integrate local information
into the global geometry of lane markers. In the absence of a complex network
architecture design, the proposed method greatly outperforms all existing
methods on public datasets while achieving state-of-the-art results
and real-time processing simultaneously.
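The bottom-up idea in the abstract can be illustrated with a small sketch. The abstract does not specify the decoder, so everything below is an assumption made for illustration: the name `decode_lanes`, the `heatmap` array (key point existence per pixel), the `dx_up` array (horizontal offset from a key point to the key point of the same lane one step higher in the image), and the threshold are all hypothetical, and this is not the authors' implementation.

```python
import numpy as np


def decode_lanes(heatmap, dx_up, threshold=0.5, step=1):
    """Hypothetical greedy bottom-up decoder (illustrative assumption only).

    heatmap: (H, W) array, key point existence score per pixel.
    dx_up:   (H, W) array, horizontal offset to the key point of the same
             lane located `step` rows higher in the image.
    Returns a list of lanes, each a list of (row, col) key points ordered
    from the bottom of the image upward.
    """
    height, width = heatmap.shape
    visited = np.zeros((height, width), dtype=bool)
    lanes = []
    # Scan rows from the bottom of the image to the top to find seeds.
    for row in range(height - 1, -1, -1):
        for col in np.flatnonzero(heatmap[row] >= threshold):
            if visited[row, col]:
                continue
            lane = []
            r, c = row, int(col)
            # Follow the predicted offsets upward until the chain breaks.
            while (0 <= r < height and 0 <= c < width
                   and heatmap[r, c] >= threshold and not visited[r, c]):
                visited[r, c] = True
                lane.append((r, c))
                c = int(round(c + dx_up[r, c]))
                r -= step
            if len(lane) >= 2:  # discard isolated responses
                lanes.append(lane)
    return lanes


# Toy input: one vertical lane and one lane drifting left by one column per row.
heatmap = np.zeros((4, 6))
dx_up = np.zeros((4, 6))
for r in range(4):
    heatmap[r, 1] = 1.0        # vertical lane at column 1
    heatmap[r, 2 + r] = 1.0    # slanted lane from (3, 5) up to (0, 2)
    dx_up[r, 2 + r] = -1.0     # next key point is one column to the left

lanes = decode_lanes(heatmap, dx_up)
```

In this sketch each step only looks at one key point and its local offset, and the global lane shape emerges from chaining those local predictions, which is the sense in which local information is integrated into global geometry.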
Related papers
- UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction [83.48950950780554]
Building extraction from remote sensing images is a challenging task due to the complex structure variations of buildings. Existing methods employ convolutional or self-attention blocks to capture the multi-scale features in the segmentation models. We present an Uncertainty-Aggregated Global-Local Fusion Network (UAGLNet) to exploit high-quality global-local visual semantics.
arXiv Detail & Related papers (2025-12-15T02:59:16Z) - Generative MIMO Beam Map Construction for Location Recovery and Beam Tracking [67.65578956523403]
This paper proposes a generative framework to recover location labels directly from sparse channel state information (CSI) measurements. Instead of directly storing raw CSI, we learn a compact low-dimensional radio map embedding and leverage a generative model to reconstruct the high-dimensional CSI. Numerical experiments demonstrate that the proposed model can improve localization accuracy by over 30% and achieve a 20% capacity gain in non-line-of-sight (NLOS) scenarios.
arXiv Detail & Related papers (2025-11-21T07:25:49Z) - InterKey: Cross-modal Intersection Keypoints for Global Localization on OpenStreetMap [7.975038003192725]
OpenStreetMap (OSM) offers a free and globally available alternative, but its coarse abstraction poses challenges for matching with sensor data. We propose InterKey, a cross-modal framework that leverages road intersections as distinctive landmarks for global localization. Our method constructs compact binary descriptors by jointly encoding road and building imprints from point clouds and OSM.
arXiv Detail & Related papers (2025-09-17T09:46:57Z) - A Novel Local Focusing Mechanism for Deepfake Detection Generalization [10.223643897131192]
Deepfake generation techniques have intensified the need for robust and generalizable detection methods. We propose a novel Local Focus Mechanism (LFM) that explicitly attends to discriminative local features for differentiating fake from real images. LFM achieves a 3.7 improvement in accuracy and a 2.8 increase in average precision over the state-of-the-art Neighboring Pixel Relationships (NPR) method.
arXiv Detail & Related papers (2025-08-23T14:06:30Z) - Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds. Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures. We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-06-26T11:53:59Z) - GLane3D : Detecting Lanes with Graph of 3D Keypoints [1.7751300245073598]
We propose a method that detects keypoints of lanes and subsequently predicts sequential connections between them to construct 3D lanes.
PointNMS is employed to eliminate overlapping proposal keypoints, reducing redundancy in the estimated BEV graph.
Our model surpasses previous state-of-the-art methods on both the Apollo and OpenLane datasets, demonstrating superior F1 scores and a strong generalization capacity.
arXiv Detail & Related papers (2025-03-31T09:33:26Z) - Flexible 3D Lane Detection by Hierarchical Shape Matching [29.038755629481035]
3D lane detection is still an open problem due to varying visual conditions, complex topologies, and strict demands for precision.
In this paper, an end-to-end flexible and hierarchical lane detector is proposed to precisely predict 3D lane lines from point clouds.
arXiv Detail & Related papers (2024-08-13T19:04:23Z) - Double-Shot 3D Shape Measurement with a Dual-Branch Network [14.749887303860717]
We propose a dual-branch Convolutional Neural Network (CNN)-Transformer network (PDCNet) to process different structured light (SL) modalities.
Within PDCNet, a Transformer branch is used to capture global perception in the fringe images, while a CNN branch is designed to collect local details in the speckle images.
We show that our method can reduce fringe order ambiguity while producing high-accuracy results on a self-made dataset.
arXiv Detail & Related papers (2024-07-19T10:49:26Z) - Mesh Denoising Transformer [104.5404564075393]
Mesh denoising is aimed at removing noise from input meshes while preserving their feature structures.
SurfaceFormer is a pioneering Transformer-based mesh denoising framework.
New representation known as Local Surface Descriptor captures local geometric intricacies.
Denoising Transformer module receives the multimodal information and achieves efficient global feature aggregation.
arXiv Detail & Related papers (2024-05-10T15:27:43Z) - 3D Lane Detection from Front or Surround-View using Joint-Modeling & Matching [27.588395086563978]
We propose a joint modeling approach that combines Bezier curves and methods.
We also introduce a novel 3D Spatial, representing an exploration of 3D surround-view lane detection research.
This innovative method establishes a new benchmark in front-view 3D lane detection on the Openlane dataset.
arXiv Detail & Related papers (2024-01-16T01:12:24Z) - DETR Doesn't Need Multi-Scale or Locality Design [69.56292005230185]
This paper presents an improved DETR detector that maintains a "plain" nature.
It uses a single-scale feature map and global cross-attention calculations without specific locality constraints.
We show that two simple technologies are surprisingly effective within a plain design to compensate for the lack of multi-scale feature maps and locality constraints.
arXiv Detail & Related papers (2023-08-03T17:59:04Z) - Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation [53.04781510348416]
Video-based 3D human pose and shape estimations are evaluated by intra-frame accuracy and inter-frame smoothness.
We propose to structurally decouple the modeling of long-term and short-term correlations in an end-to-end framework, Global-to-Local Transformer (GLoT).
Our GLoT surpasses previous state-of-the-art methods with the lowest model parameters on popular benchmarks, i.e., 3DPW, MPI-INF-3DHP, and Human3.6M.
arXiv Detail & Related papers (2023-03-26T14:57:49Z) - Centralized Feature Pyramid for Object Detection [53.501796194901964]
Visual feature pyramid has shown its superiority in both effectiveness and efficiency in a wide range of applications.
In this paper, we propose a Centralized Feature Pyramid for object detection, which is based on a globally explicit centralized feature regulation.
arXiv Detail & Related papers (2022-10-05T08:32:54Z) - Cross-modal Local Shortest Path and Global Enhancement for Visible-Thermal Person Re-Identification [2.294635424666456]
We propose the Cross-modal Local Shortest Path and Global Enhancement (CM-LSP-GE) modules, a two-stream network based on joint learning of local and global features.
The experimental results on two typical datasets show that our model is clearly superior to state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T10:27:22Z) - TC-Net: Triple Context Network for Automated Stroke Lesion Segmentation [0.5482532589225552]
We propose a new network, Triple Context Network (TC-Net), with the capture of spatial contextual information as the core.
Our network is evaluated on the open dataset ATLAS, achieving the highest score of 0.594, Hausdorff distance of 27.005 mm, and average symmetric surface distance of 7.137 mm.
arXiv Detail & Related papers (2022-02-28T11:12:16Z) - PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.