A Dynamic Spatial-temporal Attention Network for Early Anticipation of
Traffic Accidents
- URL: http://arxiv.org/abs/2106.10197v1
- Date: Fri, 18 Jun 2021 15:58:53 GMT
- Title: A Dynamic Spatial-temporal Attention Network for Early Anticipation of
Traffic Accidents
- Authors: Muhammad Monjurul Karim, Yu Li, Ruwen Qin, Zhaozheng Yin
- Abstract summary: This paper presents a dynamic spatial-temporal attention (DSTA) network for early anticipation of traffic accidents from dashcam videos.
It learns to select discriminative temporal segments of a video sequence with a module named Dynamic Temporal Attention (DTA).
The spatial-temporal relational features of accidents, along with scene appearance features, are learned jointly with a Gated Recurrent Unit (GRU) network.
- Score: 12.881094474374231
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autonomous vehicles and vehicles equipped with an Advanced Driver
Assistance System (ADAS) are emerging. They share the road with conventional
vehicles operated entirely by human drivers. To ensure the safety of passengers
and other road users, autonomous vehicles and ADAS must anticipate traffic
accidents from natural driving scenes. The dynamic
spatial-temporal interaction of the traffic agents is complex, and visual cues
for predicting a future accident are embedded deeply in dashcam video data.
Therefore, early anticipation of traffic accidents remains a challenge. To this
end, the paper presents a dynamic spatial-temporal attention (DSTA) network for
early anticipation of traffic accidents from dashcam videos. The proposed
DSTA-network learns to select discriminative temporal segments of a video
sequence with a module named Dynamic Temporal Attention (DTA). It also learns
to focus on the informative spatial regions of frames with another module named
Dynamic Spatial Attention (DSA). The spatial-temporal relational features of
accidents, along with scene appearance features, are learned jointly with a
Gated Recurrent Unit (GRU) network. The experimental evaluation of the
DSTA-network on two benchmark datasets confirms that it exceeds
state-of-the-art performance. A thorough ablation study evaluates the
contributions of individual components of the DSTA-network, revealing how the
network achieves such performance. Furthermore, this paper proposes a new
strategy that fuses the prediction scores from two complementary models and
verifies its effectiveness in further boosting the performance of early
accident anticipation.
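To make the described architecture concrete, here is a minimal PyTorch-style sketch of the ideas named above. It is an illustration under assumed shapes and module names, not the authors' implementation: a spatial-attention layer stands in for DSA, a temporal-attention layer over past hidden states stands in for DTA, a GRU cell carries the recurrent state, and `fuse_scores` mirrors the late-fusion strategy of combining two complementary models' per-frame scores.

```python
import torch
import torch.nn as nn

class DSTASketch(nn.Module):
    """Minimal sketch of a dynamic spatial-temporal attention pipeline.

    Not the authors' implementation: shapes and module names are
    illustrative assumptions.
    """

    def __init__(self, feat_dim: int = 256, hidden_dim: int = 256):
        super().__init__()
        self.spatial_att = nn.Linear(feat_dim, 1)    # DSA stand-in: one weight per region
        self.temporal_att = nn.Linear(hidden_dim, 1) # DTA stand-in: one weight per past step
        self.gru = nn.GRUCell(feat_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, 1)   # frame-level accident score

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (T, N, feat_dim) = T frames, each with N spatial regions/objects.
        T, N, _ = frames.shape
        h = frames.new_zeros(1, self.gru.hidden_size)
        hiddens, scores = [], []
        for t in range(T):
            # Spatial attention: softmax over the N regions of frame t.
            a = torch.softmax(self.spatial_att(frames[t]), dim=0)  # (N, 1)
            frame_feat = (a * frames[t]).sum(dim=0, keepdim=True)  # (1, feat_dim)
            h = self.gru(frame_feat, h)
            hiddens.append(h)
            # Temporal attention: re-weight all hidden states seen so far.
            H = torch.cat(hiddens, dim=0)                          # (t+1, hidden)
            b = torch.softmax(self.temporal_att(H), dim=0)         # (t+1, 1)
            context = (b * H).sum(dim=0, keepdim=True)             # (1, hidden)
            scores.append(torch.sigmoid(self.classifier(context)))
        return torch.cat(scores).squeeze(-1)  # (T,) accident probability per frame


def fuse_scores(p_a: torch.Tensor, p_b: torch.Tensor, w: float = 0.5) -> torch.Tensor:
    """Late fusion of two complementary models' per-frame scores (the weight w is a free choice)."""
    return w * p_a + (1.0 - w) * p_b
```

The per-frame probabilities make the "early" aspect explicit: an alarm can be raised as soon as the score crosses a threshold, well before the accident frame.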
Related papers
- CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions [13.981748780317329]
Accurately and promptly predicting accidents among surrounding traffic agents from camera footage is crucial for the safety of autonomous vehicles (AVs).
This study introduces a novel accident anticipation framework for AVs, termed CRASH.
It seamlessly integrates five components: object detector, feature extractor, object-aware module, context-aware module, and multi-layer fusion.
Our model surpasses existing top baselines on critical evaluation metrics such as Average Precision (AP) and mean Time-To-Accident (mTTA).
arXiv Detail & Related papers (2024-07-25T04:12:49Z)
- Text-Driven Traffic Anomaly Detection with Temporal High-Frequency Modeling in Driving Videos [22.16190711818432]
We introduce TTHF, a novel single-stage method aligning video clips with text prompts, offering a new perspective on traffic anomaly detection.
Unlike previous approaches, the supervised signal of our method is derived from languages rather than one-hot vectors, providing a more comprehensive representation.
It is shown that our proposed TTHF achieves promising performance, outperforming state-of-the-art competitors by +5.4% AUC on the DoTA dataset.
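As a rough illustration of the text-driven idea, the sketch below scores a clip by its similarity to "normal" versus "anomalous" text prompts under pretrained CLIP-like encoders. The two-prompt-set design and all names are illustrative assumptions, not TTHF's actual architecture.

```python
import torch
import torch.nn.functional as F

def anomaly_score(clip_emb: torch.Tensor,
                  normal_emb: torch.Tensor,
                  anomaly_emb: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """Score a video clip by comparing it with 'normal' vs 'anomalous' prompts.

    clip_emb:    (D,)   embedding of the video clip
    normal_emb:  (K, D) embeddings of prompts describing normal driving
    anomaly_emb: (M, D) embeddings of prompts describing anomalies
    All embeddings are assumed to come from pretrained (e.g. CLIP-like)
    encoders; this scoring scheme is an assumption, not TTHF's method.
    """
    v = F.normalize(clip_emb, dim=-1)
    sims = torch.cat([
        F.normalize(normal_emb, dim=-1) @ v,   # (K,) similarity to normal prompts
        F.normalize(anomaly_emb, dim=-1) @ v,  # (M,) similarity to anomaly prompts
    ]) / temperature
    probs = torch.softmax(sims, dim=0)
    return probs[normal_emb.shape[0]:].sum()   # probability mass on anomaly prompts
```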
arXiv Detail & Related papers (2024-01-07T15:47:19Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction for the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
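The implicit-field idea can be sketched as a single network queried at continuous (x, y, t) points for occupancy and flow. The decoder below is a hedged toy version with assumed dimensions and conditioning, not the paper's model.

```python
import torch
import torch.nn as nn

class OccupancyFlowField(nn.Module):
    """Illustrative implicit field: map an (x, y, t) query plus a scene
    feature to an occupancy probability and a 2-D flow vector.

    A sketch of the general idea only; the real model conditions on
    learned scene features in a more elaborate way.
    """

    def __init__(self, scene_dim: int = 128, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(scene_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # 1 occupancy logit + 2 flow components
        )

    def forward(self, scene_feat: torch.Tensor, queries: torch.Tensor):
        # scene_feat: (scene_dim,) pooled scene context; queries: (Q, 3) points (x, y, t).
        x = torch.cat([scene_feat.expand(queries.shape[0], -1), queries], dim=-1)
        out = self.mlp(x)
        occupancy = torch.sigmoid(out[:, :1])  # (Q, 1) probability the point is occupied
        flow = out[:, 1:]                      # (Q, 2) motion vector at the query point
        return occupancy, flow
```

Because the field is continuous, the planner can query only the points it cares about instead of decoding a dense grid for the whole scene.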
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- Camera-Radar Perception for Autonomous Vehicles and ADAS: Concepts, Datasets and Metrics [77.34726150561087]
This work surveys the current state of camera- and radar-based perception for ADAS and autonomous vehicles.
Concepts and characteristics related to both sensors, as well as to their fusion, are presented.
We give an overview of the Deep Learning-based detection and segmentation tasks, and the main datasets, metrics, challenges, and open questions in vehicle perception.
arXiv Detail & Related papers (2023-03-08T00:48:32Z)
- Cognitive Accident Prediction in Driving Scenes: A Multimodality Benchmark [77.54411007883962]
We propose a Cognitive Accident Prediction (CAP) method that explicitly leverages human-inspired cognition, combining text descriptions of the visual observation with driver attention, to facilitate model training.
CAP is formulated by an attentive text-to-vision shift fusion module, an attentive scene context transfer module, and the driver attention guided accident prediction module.
We construct a new large-scale benchmark consisting of 11,727 in-the-wild accident videos with over 2.19 million frames.
arXiv Detail & Related papers (2022-12-19T11:43:02Z)
- An Attention-guided Multistream Feature Fusion Network for Localization of Risky Objects in Driving Videos [10.674638266121574]
This paper proposes an attention-guided multistream feature fusion network (AM-Net) to localize dangerous traffic agents from dashcam videos.
Two Gated Recurrent Unit (GRU) networks use object bounding box and optical flow features extracted from consecutive video frames to capture temporal cues for distinguishing dangerous traffic agents.
Fusing the two streams of features, AM-Net predicts the riskiness scores of traffic agents in the video.
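A minimal sketch of that two-stream design, with assumed feature dimensions rather than AM-Net's actual ones:

```python
import torch
import torch.nn as nn

class TwoStreamRiskSketch(nn.Module):
    """Sketch in the spirit of AM-Net: one GRU over bounding-box features,
    one over optical-flow features, concatenated to score each traffic
    agent. Dimensions and names are illustrative assumptions.
    """

    def __init__(self, box_dim: int = 4, flow_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.box_gru = nn.GRU(box_dim, hidden, batch_first=True)
        self.flow_gru = nn.GRU(flow_dim, hidden, batch_first=True)
        self.scorer = nn.Linear(2 * hidden, 1)

    def forward(self, boxes: torch.Tensor, flows: torch.Tensor) -> torch.Tensor:
        # boxes: (A, T, box_dim) and flows: (A, T, flow_dim) for A agents over T frames.
        _, h_box = self.box_gru(boxes)                         # h_box: (1, A, hidden)
        _, h_flow = self.flow_gru(flows)
        fused = torch.cat([h_box[-1], h_flow[-1]], dim=-1)     # (A, 2*hidden)
        return torch.sigmoid(self.scorer(fused)).squeeze(-1)   # (A,) riskiness per agent
```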
arXiv Detail & Related papers (2022-09-16T13:36:28Z)
- Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving [104.32241082170044]
We study a new task, safety-aware motion prediction with unseen vehicles for autonomous driving.
Unlike the existing trajectory prediction task for seen vehicles, we aim at predicting an occupancy map.
Our approach is the first one that can predict the existence of unseen vehicles in most cases.
arXiv Detail & Related papers (2021-09-03T13:33:33Z)
- DRIVE: Deep Reinforced Accident Anticipation with Visual Explanation [36.350348194248014]
Traffic accident anticipation aims to accurately and promptly predict the occurrence of a future accident from dashcam videos.
Existing approaches typically focus on capturing the cues of spatial and temporal context before a future accident occurs.
We propose Deep ReInforced accident anticipation with Visual Explanation, named DRIVE.
arXiv Detail & Related papers (2021-07-21T16:33:21Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method outperforms state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning [30.59728753059457]
Traffic accident anticipation aims to predict accidents from dashcam videos as early as possible.
Current deterministic deep neural networks could be overconfident in false predictions.
We propose an uncertainty-based accident anticipation model with spatio-temporal relational learning.
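One generic way to attach uncertainty to such predictions is Monte-Carlo dropout; the sketch below shows that generic recipe only, not the paper's specific Bayesian formulation.

```python
import torch

@torch.no_grad()
def mc_dropout_anticipation(model: torch.nn.Module,
                            video: torch.Tensor,
                            passes: int = 20):
    """Generic Monte-Carlo-dropout sketch of uncertainty-aware anticipation.

    An assumption-laden stand-in, not the paper's method: it keeps dropout
    active at test time and reads the spread of repeated predictions as a
    confidence signal. `model` is assumed to map a video tensor to
    per-frame accident probabilities of shape (T,).
    """
    model.train()  # keep dropout layers stochastic at inference time
    preds = torch.stack([model(video) for _ in range(passes)])  # (passes, T)
    mean = preds.mean(dim=0)  # averaged accident probability per frame
    std = preds.std(dim=0)    # high std = low confidence; useful to gate alarms
    return mean, std
```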
arXiv Detail & Related papers (2020-08-01T20:21:48Z)
- A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC).
First, a spatial-temporal attention mechanism is presented to explore the most useful and important information.
Second, we build a joint feature sequence from sequential and instantaneous state information so that the generated trajectories maintain spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)