A Hybrid Spatial-temporal Deep Learning Architecture for Lane Detection
- URL: http://arxiv.org/abs/2110.04079v2
- Date: Thu, 14 Oct 2021 02:10:56 GMT
- Title: A Hybrid Spatial-temporal Deep Learning Architecture for Lane Detection
- Authors: Yongqi Dong, Sandeep Patil, Bart van Arem, Haneen Farah
- Abstract summary: This study proposes a novel hybrid spatial-temporal sequence-to-one deep learning architecture.
The proposed model effectively handles challenging driving scenes and outperforms available state-of-the-art methods by a large margin.
- Score: 1.653688760901944
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reliable and accurate lane detection is of vital importance for the safe
performance of Lane Keeping Assistance and Lane Departure Warning systems.
However, under certain challenging circumstances, it is difficult to achieve
satisfactory performance when detecting lanes from a single image, which is
the typical setting in the current literature. Since lane markings are
continuous lines, lanes that are difficult to detect accurately in the
current image can potentially be deduced more reliably if information from
previous frames is incorporated. This study proposes a novel hybrid
spatial-temporal sequence-to-one deep learning architecture making full use of
the spatial-temporal information in multiple continuous image frames to detect
lane markings in the very last frame of the sequence. Specifically, the hybrid
model integrates three components: a single-image feature extraction module
with an embedded spatial convolutional neural network (SCNN) that extracts
spatial features and relationships within one image; a spatial-temporal
feature integration module built on a spatial-temporal recurrent neural
network (ST-RNN) that captures the spatial-temporal correlations and time
dependencies across the image sequence; and an encoder-decoder structure that
casts this image segmentation problem as end-to-end supervised learning.
Extensive experiments reveal that the proposed model effectively handles
challenging driving scenes and outperforms available state-of-the-art methods
by a large margin.
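For concreteness, below is a minimal PyTorch sketch of the sequence-to-one pipeline the abstract describes: a per-frame encoder with a simplified SCNN-style message-passing layer, a ConvLSTM standing in for the ST-RNN, and a decoder applied to the last hidden state. The module names, channel sizes, single-direction SCNN pass, and binary output are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class SCNNDown(nn.Module):
    """Simplified SCNN-style message passing: each row of the feature
    map receives a convolved message from the row above (downward pass
    only; the full SCNN also propagates up, left, and right)."""
    def __init__(self, channels, k=9):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, k, padding=k // 2)

    def forward(self, x):                            # x: (B, C, H, W)
        rows = list(x.split(1, dim=2))               # H slices of (B, C, 1, W)
        for i in range(1, len(rows)):
            msg = self.conv(rows[i - 1].squeeze(2))  # message from row above
            rows[i] = rows[i] + torch.relu(msg).unsqueeze(2)
        return torch.cat(rows, dim=2)

class ConvLSTMCell(nn.Module):
    """Standard ConvLSTM cell, standing in for the ST-RNN."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, h, c):
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class HybridLaneNet(nn.Module):
    """Sequence-to-one: encoder + SCNN per frame, ConvLSTM across
    frames, decoder applied to the last hidden state only."""
    def __init__(self, ch=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.scnn = SCNNDown(ch)
        self.strnn = ConvLSTMCell(ch, ch)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 1, 4, stride=2, padding=1))  # lane logits

    def forward(self, frames):                       # frames: (B, T, 3, H, W)
        B, T, _, H, W = frames.shape
        h = frames.new_zeros(B, self.strnn.hid_ch, H // 4, W // 4)
        c = h.clone()
        for t in range(T):                           # oldest to newest frame
            feat = self.scnn(self.encoder(frames[:, t]))
            h, c = self.strnn(feat, h, c)
        return self.decoder(h)                       # mask for the last frame

mask = HybridLaneNet()(torch.rand(2, 5, 3, 128, 256))  # -> (2, 1, 128, 256)
```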
Related papers
- LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation [87.71768494466959]
LaneTCA bridges individual video frames and explores how to effectively aggregate the temporal context.
We develop an accumulative attention module and an adjacent attention module to abstract the long-term and short-term temporal context.
The two modules are meticulously designed based on the transformer architecture.
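The summary names the two attention modules but not their internals; the sketch below shows only the generic pattern one might infer: current-frame tokens cross-attend to an accumulated long-term memory and to the adjacent frame, and the two contexts are fused. All names and dimensions are assumptions, not LaneTCA's actual design.

```python
import torch
import torch.nn as nn

class TemporalContextAggregation(nn.Module):
    """Illustrative only: current-frame tokens query a running
    long-term memory (accumulative) and the previous frame's tokens
    (adjacent); the two contexts are then fused and added back."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.accumulative = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.adjacent = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, cur, memory, prev):
        # cur, prev: (B, N, D) frame tokens; memory: (B, M, D)
        long_ctx, _ = self.accumulative(cur, memory, memory)
        short_ctx, _ = self.adjacent(cur, prev, prev)
        fused = self.fuse(torch.cat([long_ctx, short_ctx], dim=-1))
        # a real accumulative module would also update `memory` here
        return cur + fused
```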
arXiv Detail & Related papers (2024-08-25T14:46:29Z)
- Linear Attention is Enough in Spatial-Temporal Forecasting [0.0]
We propose treating nodes in road networks at different time steps as independent spatial-temporal tokens.
We then feed them into a vanilla Transformer to learn complex spatial-temporal patterns.
Our method achieves state-of-the-art performance at an affordable computational cost.
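A hedged sketch of the tokenization idea as stated: every (node, time-step) pair becomes one token and a plain Transformer encoder mixes them all; the paper's linear-attention kernel, which this sketch does not implement, would reduce the quadratic cost over the T*N tokens. Dimensions and the forecasting head are illustrative.

```python
import torch
import torch.nn as nn

class STTokenForecaster(nn.Module):
    """Illustrative: every (node, time-step) pair is one token; a plain
    TransformerEncoder mixes all T*N tokens jointly."""
    def __init__(self, num_nodes, t_in, t_out, d_in=1, dim=64):
        super().__init__()
        self.num_nodes, self.t_in = num_nodes, t_in
        self.embed = nn.Linear(d_in, dim)
        self.pos = nn.Parameter(torch.zeros(1, t_in * num_nodes, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(t_in * dim, t_out)   # per-node forecast horizon

    def forward(self, x):                          # x: (B, T_in, N, d_in)
        B = x.size(0)
        tokens = self.embed(x).reshape(B, self.t_in * self.num_nodes, -1)
        z = self.encoder(tokens + self.pos)        # (B, T_in*N, dim)
        z = z.reshape(B, self.t_in, self.num_nodes, -1).permute(0, 2, 1, 3)
        return self.head(z.flatten(2))             # (B, N, T_out)

# e.g. 207 sensors (as in METR-LA), 12 steps in, 12 steps out
out = STTokenForecaster(207, 12, 12)(torch.rand(4, 12, 207, 1))
```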
arXiv Detail & Related papers (2024-08-17T10:06:50Z)
- TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation [80.13343299606146]
We propose a Temporal LiDAR Aggregation and Distillation (TLAD) algorithm, which leverages historical priors to assign different aggregation steps for different classes.
To make full use of temporal images, we design a Temporal Image Aggregation and Fusion (TIAF) module, which can greatly expand the camera FOV.
We also develop a Static-Moving Switch Augmentation (SMSA) algorithm, which utilizes sufficient temporal information to enable objects to switch their motion states freely.
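TLAD's class-wise step assignment is not spelled out in the summary; the sketch below shows only generic pose-based multi-scan aggregation, with the number of aggregated scans left as the knob TLAD would tune per class. Function names and conventions are assumptions.

```python
import numpy as np

def aggregate_sweeps(points_seq, poses, steps):
    """Generic pose-based multi-scan aggregation (an assumption, not
    TASeg's TLAD): transform the last `steps` scans into the newest
    scan's frame and stack them; TLAD would pick `steps` per class.
    points_seq: list of (N_i, 3) arrays, oldest first.
    poses: list of 4x4 sensor-to-world matrices, one per scan."""
    world_to_cur = np.linalg.inv(poses[-1])
    merged = []
    for pts, pose in zip(points_seq[-steps:], poses[-steps:]):
        hom = np.c_[pts, np.ones(len(pts))]        # (N, 4) homogeneous
        merged.append((hom @ pose.T @ world_to_cur.T)[:, :3])
    return np.vstack(merged)
```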
arXiv Detail & Related papers (2024-07-13T03:00:16Z)
- Homography Guided Temporal Fusion for Road Line and Marking Segmentation [73.47092021519245]
Road lines and markings are frequently occluded in the presence of moving vehicles, shadows, and glare.
We propose a Homography Guided Fusion (HomoFusion) module to exploit temporally-adjacent video frames for complementary cues.
We show that exploiting available camera intrinsic data and ground plane assumption for cross-frame correspondence can lead to a light-weight network with significantly improved performances in speed and accuracy.
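As a rough illustration of the cross-frame correspondence idea, the sketch below uses the textbook plane-induced homography H = K (R - t nᵀ / d) K⁻¹ to warp a previous frame onto the current one; HomoFusion's exact formulation and how it consumes the warped cues may differ.

```python
import numpy as np
import cv2

def warp_prev_to_cur(prev_img, K, R, t, n=np.array([0., -1., 0.]), d=1.5):
    """Plane-induced homography (textbook form; HomoFusion's exact
    formulation may differ). (R, t) is the previous-to-current camera
    motion, n is the ground normal in the previous camera frame
    (camera y-axis points down, hence (0, -1, 0)), d the camera height."""
    H = K @ (R - np.outer(t, n) / d) @ np.linalg.inv(K)
    h, w = prev_img.shape[:2]
    return cv2.warpPerspective(prev_img, H, (w, h))
```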
arXiv Detail & Related papers (2024-04-11T10:26:40Z)
- Temporal Embeddings: Scalable Self-Supervised Temporal Representation Learning from Spatiotemporal Data for Multimodal Computer Vision [1.4127889233510498]
A novel approach is proposed to stratify landscapes based on mobility activity time series.
The pixel-wise embeddings are converted to image-like channels that can be used for task-based, multimodal modeling.
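One plausible reading of that conversion, sketched below with PCA standing in for the paper's self-supervised temporal encoder (an assumption): compress each pixel's activity time series to k dimensions, then reshape the result into k image-like channels.

```python
import numpy as np
from sklearn.decomposition import PCA

def timeseries_to_channels(series, k=8):
    """Compress each pixel's activity time series (H, W, T) into a
    k-dim embedding and reshape to k image-like channels (k, H, W).
    PCA is a stand-in for the paper's self-supervised encoder."""
    H, W, T = series.shape
    emb = PCA(n_components=k).fit_transform(series.reshape(H * W, T))
    return emb.reshape(H, W, k).transpose(2, 0, 1)
```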
arXiv Detail & Related papers (2023-10-16T02:53:29Z)
- Spatio-Temporal Recurrent Networks for Event-Based Optical Flow Estimation [47.984368369734995]
We introduce a novel recurrent encoding-decoding neural network architecture for event-based optical flow estimation.
The network is end-to-end trained with self-supervised learning on the Multi-Vehicle Stereo Event Camera dataset.
We show that it outperforms all existing state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-09-10T13:37:37Z)
- Spatial-Temporal Fusion Graph Neural Networks for Traffic Flow Forecasting [35.072979313851235]
Spatial-temporal forecasting of traffic flow is a challenging task because of complicated spatial dependencies and dynamic temporal patterns across different roads.
Existing frameworks typically utilize given spatial adjacency graph and sophisticated mechanisms for modeling spatial and temporal correlations.
This paper proposes Spatial-Temporal Fusion Graph Neural Networks (STFGNN) for traffic flow forecasting.
arXiv Detail & Related papers (2020-12-15T14:03:17Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves superior performance compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints [61.46323213702369]
This work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion.
Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset.
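The summary suggests the standard view-synthesis objective; the sketch below shows that baseline loss (back-project with predicted depth, transform by predicted pose, reproject, compare photometrically). The paper's intrinsics learning and additional spatio-temporal constraints are not modeled here.

```python
import torch
import torch.nn.functional as F

def photometric_loss(tgt, src, depth, K, pose):
    """Baseline view-synthesis loss (the paper's intrinsics learning
    and extra spatio-temporal constraints are not modeled here).
    tgt, src: (B, 3, H, W); depth: (B, 1, H, W); K: (B, 3, 3);
    pose: (B, 4, 4) target-to-source camera transform."""
    B, _, H, W = tgt.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], 0).float()  # (3, H, W)
    pix = pix.reshape(1, 3, -1).expand(B, -1, -1)                # (B, 3, HW)
    cam = torch.linalg.inv(K) @ pix * depth.reshape(B, 1, -1)    # back-project
    cam = torch.cat([cam, torch.ones(B, 1, H * W)], dim=1)       # (B, 4, HW)
    proj = K @ (pose @ cam)[:, :3]                               # reproject
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)               # (B, 2, HW)
    u = uv[:, 0] / (W - 1) * 2 - 1                               # to [-1, 1]
    v = uv[:, 1] / (H - 1) * 2 - 1
    grid = torch.stack([u, v], dim=-1).reshape(B, H, W, 2)
    warped = F.grid_sample(src, grid, align_corners=True)
    return (tgt - warped).abs().mean()                           # L1 photometric
```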
arXiv Detail & Related papers (2020-11-02T22:26:58Z)
- Fast Video Salient Object Detection via Spatiotemporal Knowledge Distillation [20.196945571479002]
We present a lightweight network tailored for video salient object detection.
Specifically, we combine a saliency guidance embedding structure and spatial knowledge distillation to refine the spatial features.
In the temporal aspect, we propose a temporal knowledge distillation strategy, which allows the network to learn robust temporal features.
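A minimal sketch of what paired spatial and temporal distillation terms could look like; the paper's saliency guidance embedding and exact temporal strategy are more involved, so treat the formulation below as a generic assumption.

```python
import torch.nn.functional as F

def distillation_losses(stu_feats, tea_feats):
    """Paired spatial/temporal feature distillation (a generic sketch).
    stu_feats, tea_feats: lists of (B, C, H, W) features, one per frame.
    Spatial term: mimic the teacher frame by frame. Temporal term:
    mimic the teacher's frame-to-frame feature change."""
    spatial = sum(F.mse_loss(s, t) for s, t in zip(stu_feats, tea_feats))
    temporal = sum(F.mse_loss(s1 - s0, t1 - t0)
                   for (s0, s1), (t0, t1)
                   in zip(zip(stu_feats, stu_feats[1:]),
                          zip(tea_feats, tea_feats[1:])))
    return spatial / len(stu_feats), temporal / max(len(stu_feats) - 1, 1)
```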
arXiv Detail & Related papers (2020-10-20T04:48:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.