LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation
- URL: http://arxiv.org/abs/2408.13852v1
- Date: Sun, 25 Aug 2024 14:46:29 GMT
- Title: LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation
- Authors: Keyi Zhou, Li Li, Wengang Zhou, Yonghui Wang, Hao Feng, Houqiang Li
- Abstract summary: LaneTCA bridges individual video frames and explores how to effectively aggregate the temporal context.
We develop an accumulative attention module and an adjacent attention module to abstract the long-term and short-term temporal context.
The two modules are meticulously designed based on the transformer architecture.
- Score: 87.71768494466959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In video lane detection, there are rich temporal contexts among successive frames, which are under-explored in existing lane detectors. In this work, we propose LaneTCA to bridge the individual video frames and explore how to effectively aggregate the temporal context. Technically, we develop an accumulative attention module and an adjacent attention module to abstract the long-term and short-term temporal context, respectively. The accumulative attention module continuously accumulates visual information during the journey of a vehicle, while the adjacent attention module propagates lane information from the previous frame to the current frame. The two modules are meticulously designed based on the transformer architecture. Finally, these long-short context features are fused with the current frame features to predict the lane lines in the current frame. Extensive quantitative and qualitative experiments are conducted on two prevalent benchmark datasets. The results demonstrate the effectiveness of our method, achieving several new state-of-the-art records. Code and models are available at https://github.com/Alex-1337/LaneTCA
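The long-short design lends itself to a compact sketch. Below is a minimal, hypothetical illustration of the idea, not the authors' implementation (module names, dimensions, and the memory-update rule are assumptions for exposition; see the linked repository for the real code): current-frame tokens cross-attend to an accumulated memory for long-term context and to the previous frame for short-term context, and the three streams are fused.

```python
import torch
import torch.nn as nn

class TemporalContextAggregator(nn.Module):
    """Toy long/short-term temporal context aggregation (illustrative only)."""

    def __init__(self, dim=256, heads=8):
        super().__init__()
        # Long-term: cross-attention from current tokens to an accumulated memory.
        self.accumulative_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Short-term: cross-attention from current tokens to the previous frame.
        self.adjacent_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Linear(3 * dim, dim)

    def forward(self, curr, prev, memory):
        # curr, prev: (B, N, dim) frame tokens; memory: (B, M, dim) accumulated tokens.
        long_ctx, _ = self.accumulative_attn(curr, memory, memory)
        short_ctx, _ = self.adjacent_attn(curr, prev, prev)
        fused = self.fuse(torch.cat([curr, long_ctx, short_ctx], dim=-1))
        # Naive memory update: append current tokens. A real system would need to
        # bound or compress the memory, since a vehicle's journey can grow arbitrarily.
        memory = torch.cat([memory, curr.detach()], dim=1)
        return fused, memory

# Example: process a stream of frame features.
agg = TemporalContextAggregator()
prev = torch.randn(1, 100, 256)
memory = prev.clone()
curr = torch.randn(1, 100, 256)
fused, memory = agg(curr, prev, memory)  # fused would feed a lane-prediction head
```

The fused output stands in for the "long-short context features" that the paper combines with current-frame features before lane prediction.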
Related papers
- STF: Spatio-Temporal Fusion Module for Improving Video Object Detection [7.213855322671065]
Consecutive frames in a video contain redundancy, but they may also contain complementary information for the detection task.
We propose a spatio-temporal fusion framework (STF) to leverage this complementary information.
The proposed spatio-temporal fusion module leads to improved detection performance compared to baseline object detectors.
arXiv Detail & Related papers (2024-02-16T15:19:39Z)
- Efficient Long-Short Temporal Attention Network for Unsupervised Video Object Segmentation [23.645412918420906]
Unsupervised Video Object Segmentation (VOS) aims at identifying the contours of primary foreground objects in videos without any prior knowledge.
Previous methods do not fully use spatial-temporal context and fail to tackle this challenging task in real-time.
This motivates us to develop an efficient Long-Short Temporal Attention network (termed LSTA) for the unsupervised VOS task from a holistic view.
arXiv Detail & Related papers (2023-09-21T01:09:46Z)
- Tracking by Associating Clips [110.08925274049409]
In this paper, we investigate an alternative by treating object association as clip-wise matching.
Our new perspective views a single long video sequence as multiple short clips, and then the tracking is performed both within and between the clips.
The benefits of this new approach are twofold. First, our method is robust to tracking error accumulation or propagation, as the video chunking allows bypassing interrupted frames.
Second, multi-frame information is aggregated during clip-wise matching, resulting in more accurate long-range track association than current frame-wise matching.
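As a toy (hypothetical) illustration of clip-wise matching, the sketch below splits a track-feature sequence into short clips, averages features within each clip, and greedily associates neighbouring clips; the actual method is more sophisticated (e.g., learned matching within and between clips).

```python
import torch

def associate_clips(frame_feats, clip_len=5):
    # frame_feats: (T, N, D) - T frames, N tracked objects, D-dim features.
    clips = frame_feats.split(clip_len, dim=0)
    clip_feats = [c.mean(dim=0) for c in clips]   # aggregate frames within each clip
    matches = []
    for a, b in zip(clip_feats[:-1], clip_feats[1:]):
        cost = torch.cdist(a, b)                  # (N, N) feature-distance matrix
        matches.append(cost.argmin(dim=1))        # greedy between-clip association
    return matches

matches = associate_clips(torch.randn(20, 4, 64))  # 3 inter-clip match sets
```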
arXiv Detail & Related papers (2022-12-20T10:33:17Z)
- FuTH-Net: Fusing Temporal Relations and Holistic Features for Aerial Video Classification [49.06447472006251]
We propose a novel deep neural network, termed FuTH-Net, to model not only holistic features, but also temporal relations for aerial video classification.
Our model is evaluated on two aerial video classification datasets, ERA and Drone-Action, and achieves state-of-the-art results.
arXiv Detail & Related papers (2022-09-22T21:15:58Z)
- PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection [28.879484515844375]
We introduce temporal information and spatial information in a progressive way for an integrated enhancement.
PTSEFormer follows an end-to-end fashion to avoid heavy post-processing procedures while achieving 88.1% mAP on the ImageNet VID dataset.
arXiv Detail & Related papers (2022-09-06T06:32:57Z)
- Unidirectional Video Denoising by Mimicking Backward Recurrent Modules with Look-ahead Forward Ones [72.68740880786312]
Bidirectional recurrent networks (BiRNN) have exhibited appealing performance in several video restoration tasks.
BiRNN is intrinsically offline because it uses backward recurrent modules to propagate from the last frame back to the current one.
We present a novel recurrent network consisting of forward and look-ahead recurrent modules for unidirectional video denoising.
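A minimal sketch of the look-ahead idea (all module names and shapes are assumptions, not the paper's code): each step denoises frame t using a forward hidden state plus a peek at frame t+1, keeping the pipeline unidirectional with only one frame of latency.

```python
import torch
import torch.nn as nn

class LookAheadDenoiser(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.ch = ch
        self.forward_cell = nn.Conv2d(3 + ch, ch, 3, padding=1)
        self.lookahead_cell = nn.Conv2d(3 + ch, ch, 3, padding=1)
        self.head = nn.Conv2d(2 * ch, 3, 3, padding=1)

    def forward(self, frames):
        # frames: (T, B, 3, H, W); denoise each frame with one frame of latency.
        T, B, _, H, W = frames.shape
        h = frames.new_zeros(B, self.ch, H, W)
        outputs = []
        for t in range(T - 1):
            # Forward recurrence over past frames...
            h = torch.relu(self.forward_cell(torch.cat([frames[t], h], dim=1)))
            # ...plus a look-ahead pass over only the next frame, which stands in
            # for the full backward recurrence of an offline BiRNN.
            la = torch.relu(self.lookahead_cell(torch.cat([frames[t + 1], h], dim=1)))
            outputs.append(self.head(torch.cat([h, la], dim=1)))
        return torch.stack(outputs)  # (T-1, B, 3, H, W)
```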
arXiv Detail & Related papers (2022-04-12T05:33:15Z)
- Laneformer: Object-aware Row-Column Transformers for Lane Detection [96.62919884511287]
Laneformer is a transformer-based architecture tailored for lane detection in autonomous driving.
Inspired by recent advances of the transformer encoder-decoder architecture in various vision tasks, we move forward to design a new end-to-end Laneformer architecture.
arXiv Detail & Related papers (2022-03-18T10:14:35Z)
- Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z)
- A Hybrid Spatial-temporal Deep Learning Architecture for Lane Detection [1.653688760901944]
This study proposes a novel hybrid spatial-temporal sequence-to-one deep learning architecture.
The proposed model can effectively handle challenging driving scenes and outperforms available state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-10-05T15:47:45Z)