Efficient Sequential Neural Network with Spatial-Temporal Attention and Linear LSTM for Robust Lane Detection Using Multi-Frame Images
- URL: http://arxiv.org/abs/2602.03669v1
- Date: Tue, 03 Feb 2026 15:51:29 GMT
- Title: Efficient Sequential Neural Network with Spatial-Temporal Attention and Linear LSTM for Robust Lane Detection Using Multi-Frame Images
- Authors: Sandeep Patil, Yongqi Dong, Haneen Farah, Hans Hellendoorn
- Abstract summary: Lane detection is a crucial perception task for automated vehicles (AVs) and Advanced Driver Assistance Systems. Current methods lack versatility in delivering accurate, robust, and real-time compatible lane detection. This study introduces a novel sequential neural network model with a spatial-temporal attention mechanism to focus on key features of lane lines.
- Score: 3.8825198843426345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Lane detection is a crucial perception task for all levels of automated vehicles (AVs) and Advanced Driver Assistance Systems, particularly in mixed-traffic environments where AVs must interact with human-driven vehicles (HDVs) and challenging traffic scenarios. Current methods lack the versatility to deliver accurate, robust, real-time lane detection; in particular, vision-based methods often neglect critical regions of the image and their spatial-temporal (ST) salience, leading to poor performance in difficult circumstances such as severe occlusion and dazzling lighting. This study introduces a novel sequential neural network model with a spatial-temporal attention mechanism that focuses on key features of lane lines and exploits salient ST correlations among consecutive image frames. The proposed model, built on a standard encoder-decoder structure and common neural network backbones, is trained and evaluated on three large-scale open-source datasets. Extensive experiments demonstrate the strength and robustness of the proposed model, which outperforms state-of-the-art methods in various testing scenarios. Furthermore, with the ST attention mechanism, the developed sequential neural network models have fewer parameters and require fewer Multiply-Accumulate Operations (MACs) than baseline sequential models, highlighting their computational efficiency. Relevant data, code, and models are released at https://doi.org/10.4121/4619cab6-ae4a-40d5-af77-582a77f3d821.
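The abstract describes spatial-temporal attention only at a high level: attend to salient regions within each frame, then exploit correlations across consecutive frames. The sketch below is a minimal toy illustration of that idea, not the paper's actual architecture (the released code at the DOI above is authoritative); the `st_attention` function, its tensor shapes, and the sigmoid/softmax choices are all assumptions made for illustration.

```python
import numpy as np

def st_attention(frames):
    """Toy spatial-temporal attention over a stack of per-frame feature maps.

    frames: array of shape (T, H, W, C), one feature map per video frame.
    Returns a fused feature map of shape (H, W, C).
    """
    # Spatial attention: squeeze channels to a per-pixel salience score and
    # pass it through a sigmoid, so each frame emphasises lane-like regions.
    salience = 1.0 / (1.0 + np.exp(-frames.mean(axis=-1, keepdims=True)))  # (T, H, W, 1)
    attended = frames * salience
    # Temporal attention: score each frame by its global response, then
    # softmax the scores so the fusion favours the most salient frames.
    scores = attended.mean(axis=(1, 2, 3))               # (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Convex combination of the attended frames along the time axis.
    return np.tensordot(weights, attended, axes=(0, 0))  # (H, W, C)
```

In a trained model the salience map and frame scores would be produced by learned layers rather than fixed pooling, but the structure (spatial weighting per frame, then a softmax-weighted fusion over time) is the pattern the abstract refers to.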
Related papers
- Surveillance Video-Based Traffic Accident Detection Using Transformer Architecture [2.621034368312571]
Traffic accidents represent a leading cause of mortality globally, with incidence rates rising due to increasing population, urbanization, and motorization.
Traditional computer vision methods for accident detection struggle with limited understanding and poor cross-domain generalization.
We propose an accident detection model based on a transformer architecture using pre-extracted spatial video features.
arXiv Detail & Related papers (2025-12-12T07:57:36Z) - Visual Dominance and Emerging Multimodal Approaches in Distracted Driving Detection: A Review of Machine Learning Techniques [3.378738346115004]
Distracted driving continues to be a significant cause of road traffic injuries and fatalities worldwide.
Recent developments in machine learning (ML) and deep learning (DL) have primarily focused on visual data to detect distraction.
This systematic review assesses 74 studies that utilize ML/DL techniques for distracted driving detection across visual, sensor-based, multimodal, and emerging modalities.
arXiv Detail & Related papers (2025-05-04T02:51:00Z) - SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs.
We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions.
With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z) - Estimating Vehicle Speed on Roadways Using RNNs and Transformers: A Video-based Approach [0.0]
This project explores the application of advanced machine learning models, specifically Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Transformers, to the task of vehicle speed estimation using video data.
arXiv Detail & Related papers (2025-02-21T15:51:49Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation [3.355813093377501]
Event cameras encode temporal changes in light intensity as asynchronous binary spikes.
Their unconventional spiking output and the scarcity of labelled datasets pose significant challenges to traditional image-based depth estimation methods.
We propose a novel energy-efficient Spike-Driven Transformer Network (SDT) for depth estimation, leveraging the unique properties of spiking data.
arXiv Detail & Related papers (2024-04-26T11:32:53Z) - Self-STORM: Deep Unrolled Self-Supervised Learning for Super-Resolution Microscopy [55.2480439325792]
We introduce deep unrolled self-supervised learning, which alleviates the need for such data by training a sequence-specific, model-based autoencoder.
Our proposed method exceeds the performance of its supervised counterparts.
arXiv Detail & Related papers (2024-03-25T17:40:32Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - STC-IDS: Spatial-Temporal Correlation Feature Analyzing based Intrusion Detection System for Intelligent Connected Vehicles [7.301018758489822]
We present a novel model for automotive intrusion detection based on the spatial-temporal correlation features of in-vehicle communication traffic (STC-IDS).
Specifically, the proposed model exploits an encoding-detection architecture. In the encoder part, spatial and temporal relations are encoded simultaneously.
The encoded information is then passed to the detector for generating forceful spatial-temporal attention features and enabling anomaly classification.
arXiv Detail & Related papers (2022-04-23T04:22:58Z) - Spatio-Temporal Look-Ahead Trajectory Prediction using Memory Neural
Network [6.065344547161387]
This paper attempts to solve the problem of spatio-temporal look-ahead trajectory prediction using a novel recurrent neural network called the Memory Neuron Network.
The proposed model is computationally less intensive and has a simple architecture as compared to other deep learning models that utilize LSTMs and GRUs.
arXiv Detail & Related papers (2021-02-24T05:02:19Z) - Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for
Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z) - A Generative Learning Approach for Spatio-temporal Modeling in Connected
Vehicular Network [55.852401381113786]
This paper proposes LaMI (Latency Model Inpainting), a novel framework that generates a comprehensive spatio-temporal quality model of the wireless access latency of connected vehicles.
LaMI adopts ideas from image inpainting and synthesis and can reconstruct the missing latency samples via a two-step procedure.
In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach and then feeds the original and highly correlated samples into a Variational Autoencoder (VAE).
arXiv Detail & Related papers (2020-03-16T03:43:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.