A Memory-Augmented Multi-Task Collaborative Framework for Unsupervised
Traffic Accident Detection in Driving Videos
- URL: http://arxiv.org/abs/2307.14575v1
- Date: Thu, 27 Jul 2023 01:45:13 GMT
- Title: A Memory-Augmented Multi-Task Collaborative Framework for Unsupervised
Traffic Accident Detection in Driving Videos
- Authors: Rongqin Liang, Yuanman Li, Yingxin Yi, Jiantao Zhou, Xia Li
- Abstract summary: We propose a novel memory-augmented multi-task collaborative framework (MAMTCF) for unsupervised traffic accident detection in driving videos.
Our method can more accurately detect both ego-involved and non-ego accidents by simultaneously modeling appearance changes and object motions in video frames.
- Score: 22.553356096143734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Identifying traffic accidents in driving videos is crucial to ensuring the
safety of autonomous driving and driver assistance systems. To address the
potential danger caused by the long-tailed distribution of driving events,
existing traffic accident detection (TAD) methods mainly rely on unsupervised
learning. However, TAD is still challenging due to the rapid movement of
cameras and dynamic scenes in driving scenarios. Existing unsupervised TAD
methods mainly rely on a single pretext task, i.e., an appearance-based or
future object localization task, to detect accidents. However, appearance-based
approaches are easily disturbed by the rapid movement of the camera and changes
in illumination, which significantly reduce the performance of traffic accident
detection. Methods based on future object localization may fail to capture
appearance changes in video frames, making it difficult to detect ego-involved
accidents (e.g., loss of control of the ego-vehicle). In this paper, we propose
a novel memory-augmented multi-task collaborative framework (MAMTCF) for
unsupervised traffic accident detection in driving videos. Different from
previous approaches, our method can more accurately detect both ego-involved
and non-ego accidents by simultaneously modeling appearance changes and object
motions in video frames through the collaboration of optical flow
reconstruction and future object localization tasks. Further, we introduce a
memory-augmented motion representation mechanism to fully explore the
interrelation between different types of motion representations and exploit the
high-level features of normal traffic patterns stored in memory to augment
motion representations, thus enlarging the difference from anomalies.
Experimental results on a recently published large-scale dataset demonstrate that
our method achieves better performance compared to previous state-of-the-art
approaches.
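As a rough illustration (not the paper's implementation), the memory-augmented motion representation described above could be sketched as follows: a bank of learnable slots stores prototypes of normal traffic motion, and an input motion feature is re-expressed as an attention-weighted mixture of those prototypes, so features of anomalous motion are poorly matched by the memory. The slot count, feature size, and additive fusion are assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryAugmentedMotion(nn.Module):
    """Hypothetical sketch of a memory-augmented motion representation.

    A bank of learnable prototype slots is intended to capture high-level
    features of normal traffic patterns; an input motion feature is
    re-expressed as an attention-weighted mixture of those slots, so
    motions far from every prototype are augmented poorly. The slot count
    and feature size are illustrative, not the paper's settings.
    """

    def __init__(self, num_slots: int = 256, feat_dim: int = 512):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_slots, feat_dim))

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        # query: (batch, feat_dim) fused motion feature
        attn = F.softmax(query @ self.memory.t() / self.memory.shape[1] ** 0.5, dim=-1)
        augmented = attn @ self.memory  # (batch, feat_dim) retrieved normal pattern
        # Blend the retrieved normal-pattern feature back into the query.
        return query + augmented


if __name__ == "__main__":
    module = MemoryAugmentedMotion()
    feats = torch.randn(4, 512)
    print(module(feats).shape)  # torch.Size([4, 512])
```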
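Likewise, a minimal sketch of how the two collaborating pretext tasks could be turned into a frame-level accident score at inference: the optical-flow reconstruction error captures appearance/ego-motion anomalies, the future-object-localization error captures anomalous object motion, and the two are combined. The plain L2/L1 errors and the weighting are illustrative assumptions, not the scoring rule reported in the paper.

```python
import numpy as np

def frame_accident_score(flow_pred, flow_true, boxes_pred, boxes_true,
                         weight: float = 0.5) -> float:
    """Hypothetical per-frame anomaly score combining the two pretext tasks.

    flow_pred / flow_true  : (H, W, 2) reconstructed vs. observed optical flow
    boxes_pred / boxes_true: (N, 4) predicted vs. observed boxes (x1, y1, x2, y2)
    """
    # Appearance/ego-motion cue: how badly the flow field was reconstructed.
    flow_err = float(np.mean((flow_pred - flow_true) ** 2))

    # Object-motion cue: how far predicted future boxes drift from observed ones.
    if len(boxes_true) > 0:
        loc_err = float(np.mean(np.abs(boxes_pred - boxes_true)))
    else:
        loc_err = 0.0  # no tracked objects in the frame, e.g. an empty road

    # Higher score = more anomalous; the 0.5/0.5 weighting is an assumption.
    return weight * flow_err + (1.0 - weight) * loc_err
```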
Related papers
- Text-Driven Traffic Anomaly Detection with Temporal High-Frequency Modeling in Driving Videos [22.16190711818432]
We introduce TTHF, a novel single-stage method aligning video clips with text prompts, offering a new perspective on traffic anomaly detection.
Unlike previous approaches, the supervised signal of our method is derived from languages rather than one-hot vectors, providing a more comprehensive representation.
It is shown that our proposed TTHF achieves promising performance, outperforming state-of-the-art competitors by +5.4% AUC on the DoTA dataset.
arXiv Detail & Related papers (2024-01-07T15:47:19Z) - DRUformer: Enhancing the driving scene Important object detection with
driving relationship self-understanding [50.81809690183755]
Traffic accidents frequently lead to fatal injuries, contributing to over 50 million deaths by 2023.
Previous research primarily assessed the importance of individual participants, treating them as independent entities.
We introduce Driving scene Relationship self-Understanding transformer (DRUformer) to enhance the important object detection task.
arXiv Detail & Related papers (2023-11-11T07:26:47Z) - Unsupervised Domain Adaptation for Self-Driving from Past Traversal
Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatially quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z) - Cognitive Accident Prediction in Driving Scenes: A Multimodality
Benchmark [77.54411007883962]
We propose a Cognitive Accident Prediction (CAP) method that explicitly leverages human-inspired cognition of text description on the visual observation and the driver attention to facilitate model training.
CAP is formulated by an attentive text-to-vision shift fusion module, an attentive scene context transfer module, and a driver-attention-guided accident prediction module.
We construct a new large-scale benchmark consisting of 11,727 in-the-wild accident videos with over 2.19 million frames.
arXiv Detail & Related papers (2022-12-19T11:43:02Z) - TAD: A Large-Scale Benchmark for Traffic Accidents Detection from Video
Surveillance [2.1076255329439304]
Existing traffic accident datasets are either small-scale, not from surveillance cameras, not open-sourced, or not built for freeway scenes.
After integration and annotation along various dimensions, a large-scale traffic accident dataset named TAD is proposed in this work.
arXiv Detail & Related papers (2022-09-26T03:00:50Z) - Real-Time Accident Detection in Traffic Surveillance Using Deep Learning [0.8808993671472349]
This paper presents a new efficient framework for accident detection at intersections for traffic surveillance applications.
The proposed framework consists of three hierarchical steps, including efficient and accurate object detection based on the state-of-the-art YOLOv4 method.
The robustness of the proposed framework is evaluated using video sequences collected from YouTube with diverse illumination conditions.
arXiv Detail & Related papers (2022-08-12T19:07:20Z) - Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z) - DRIVE: Deep Reinforced Accident Anticipation with Visual Explanation [36.350348194248014]
Traffic accident anticipation aims to accurately and promptly predict the occurrence of a future accident from dashcam videos.
Existing approaches typically focus on capturing the cues of spatial and temporal context before a future accident occurs.
We propose Deep ReInforced accident anticipation with Visual Explanation, named DRIVE.
arXiv Detail & Related papers (2021-07-21T16:33:21Z) - Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
arXiv Detail & Related papers (2021-04-19T11:48:13Z) - Vehicle-Human Interactive Behaviors in Emergency: Data Extraction from
Traffic Accident Videos [0.0]
Currently, studying vehicle-human interactive behavior in emergencies requires large amounts of data from actual emergency situations, which are almost unavailable.
This paper provides a new yet convenient way to extract the interactive behavior data (i.e., the trajectories of vehicles and humans) from actual accident videos.
The main challenge in extracting data from real accident videos lies in the fact that the recording cameras are uncalibrated and the surveillance angles are unknown.
arXiv Detail & Related papers (2020-03-02T22:17:46Z) - Training-free Monocular 3D Event Detection System for Traffic
Surveillance [93.65240041833319]
Existing event detection systems are mostly learning-based and have achieved convincing performance when a large amount of training data is available.
In real-world scenarios, collecting sufficient labeled training data is expensive and sometimes impossible.
We propose a training-free monocular 3D event detection system for traffic surveillance.
arXiv Detail & Related papers (2020-02-01T04:42:57Z)