Related papers: Text-Driven Traffic Anomaly Detection with Temporal High-Frequency Modeling in Driving Videos

Text-Driven Traffic Anomaly Detection with Temporal High-Frequency Modeling in Driving Videos

URL: http://arxiv.org/abs/2401.03522v2
Date: Mon, 15 Apr 2024 07:59:03 GMT
Title: Text-Driven Traffic Anomaly Detection with Temporal High-Frequency Modeling in Driving Videos
Authors: Rongqin Liang, Yuanman Li, Jiantao Zhou, Xia Li,
Abstract summary: We introduce TTHF, a novel single-stage method aligning video clips with text prompts, offering a new perspective on traffic anomaly detection. Unlike previous approaches, the supervised signal of our method is derived from languages rather than one-hot vectors, providing a more comprehensive representation. It is shown that our proposed TTHF achieves promising performance, outperforming state-of-the-art competitors by +5.4% AUC on the DoTA dataset.
Score: 22.16190711818432
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Traffic anomaly detection (TAD) in driving videos is critical for ensuring the safety of autonomous driving and advanced driver assistance systems. Previous single-stage TAD methods primarily rely on frame prediction, making them vulnerable to interference from dynamic backgrounds induced by the rapid movement of the dashboard camera. While two-stage TAD methods appear to be a natural solution to mitigate such interference by pre-extracting background-independent features (such as bounding boxes and optical flow) using perceptual algorithms, they are susceptible to the performance of first-stage perceptual algorithms and may result in error propagation. In this paper, we introduce TTHF, a novel single-stage method aligning video clips with text prompts, offering a new perspective on traffic anomaly detection. Unlike previous approaches, the supervised signal of our method is derived from languages rather than orthogonal one-hot vectors, providing a more comprehensive representation. Further, concerning visual representation, we propose to model the high frequency of driving videos in the temporal domain. This modeling captures the dynamic changes of driving scenes, enhances the perception of driving behavior, and significantly improves the detection of traffic anomalies. In addition, to better perceive various types of traffic anomalies, we carefully design an attentive anomaly focusing mechanism that visually and linguistically guides the model to adaptively focus on the visual context of interest, thereby facilitating the detection of traffic anomalies. It is shown that our proposed TTHF achieves promising performance, outperforming state-of-the-art competitors by +5.4% AUC on the DoTA dataset and achieving high generalization on the DADA dataset.

Related papers

Tracking the Unstable: Appearance-Guided Motion Modeling for Robust Multi-Object Tracking in UAV-Captured Videos [58.156141601478794]
Multi-object tracking (UAVT) aims to track multiple objects while maintaining consistent identities across frames of a given video.<n>Existing methods typically model motion cues and appearance separately, overlooking their interplay and resulting in suboptimal tracking performance.<n>We propose AMOT, which exploits appearance and motion cues through two key components: an Appearance-Motion Consistency (AMC) matrix and a Motion-aware Track Continuation (MTC) module.
arXiv Detail & Related papers (2025-08-03T12:06:47Z)
A Driving Regime-Embedded Deep Learning Framework for Modeling Intra-Driver Heterogeneity in Multi-Scale Car-Following Dynamics [5.579243411257874]
We propose a novel data-driven car-following framework that embeds discrete driving regimes into vehicular motion predictions.<n>The proposed hybrid deep learning architecture combines Gated Recurrent Units for discrete driving regime classification with Long Short-Term Memory networks for continuous kinematic prediction.<n>The framework significantly reduces prediction errors for acceleration (maximum MSE improvement reached 58.47%), speed, and spacing metrics while reproducing critical traffic phenomena.
arXiv Detail & Related papers (2025-06-06T09:19:33Z)
SlowFastVAD: Video Anomaly Detection via Integrating Simple Detector and RAG-Enhanced Vision-Language Model [52.47816604709358]
Video anomaly detection (VAD) aims to identify unexpected events in videos and has wide applications in safety-critical domains. vision-language models (VLMs) have demonstrated strong multimodal reasoning capabilities, offering new opportunities for anomaly detection. We propose SlowFastVAD, a hybrid framework that integrates a fast anomaly detector with a slow anomaly detector.
arXiv Detail & Related papers (2025-04-14T15:30:03Z)
An object detection approach for lane change and overtake detection from motion profiles [3.545178658731506]
In this paper, we address the identification of overtake and lane change maneuvers with a novel object detection approach applied to motion profiles. To train and test our model we created an internal dataset of motion profile images obtained from a heterogeneous set of dashcam videos. In addition to a standard object-detection approach, we show how the inclusion of CoordConvolution layers further improves the model performance.
arXiv Detail & Related papers (2025-02-06T17:36:35Z)
Multi-Modality Driven LoRA for Adverse Condition Depth Estimation [61.525312117638116]
We propose Multi-Modality Driven LoRA (MMD-LoRA) for Adverse Condition Depth Estimation. It consists of two core components: Prompt Driven Domain Alignment (PDDA) and Visual-Text Consistent Contrastive Learning (VTCCL) It achieves state-of-the-art performance on the nuScenes and Oxford RobotCar datasets.
arXiv Detail & Related papers (2024-12-28T14:23:58Z)
FollowGen: A Scaled Noise Conditional Diffusion Model for Car-Following Trajectory Prediction [9.2729178775419]
This study introduces a scaled noise conditional diffusion model for car-following trajectory prediction. It integrates detailed inter-vehicular interactions and car-following dynamics into a generative framework, improving the accuracy and plausibility of predicted trajectories. Experimental results on diverse real-world driving scenarios demonstrate the state-of-the-art performance and robustness of the proposed method.
arXiv Detail & Related papers (2024-11-23T23:13:45Z)
Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning [13.613407983544427]
We introduce a robust model designed to withstand changes in camera position within the vehicle. Our Driver Behavior Monitoring Network (DBMNet) relies on a lightweight backbone and integrates a disentanglement module. Experiments conducted on the daytime and nighttime subsets of the 100-Driver dataset validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-11-20T10:27:12Z)
Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learnstemporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs) Our method achieves state-of-theart performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z)
Layout Sequence Prediction From Noisy Mobile Modality [53.49649231056857]
Trajectory prediction plays a vital role in understanding pedestrian movement for applications such as autonomous driving and robotics. Current trajectory prediction models depend on long, complete, and accurately observed sequences from visual modalities. We propose LTrajDiff, a novel approach that treats objects obstructed or out of sight as equally important as those with fully visible trajectories.
arXiv Detail & Related papers (2023-10-09T20:32:49Z)
Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments. Our approach enhances LiDAR-based detection models using spatial quantized historical features. Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
A Memory-Augmented Multi-Task Collaborative Framework for Unsupervised Traffic Accident Detection in Driving Videos [22.553356096143734]
We propose a novel memory-augmented multi-task collaborative framework (MAMTCF) for unsupervised traffic accident detection in driving videos. Our method can more accurately detect both ego-involved and non-ego accidents by simultaneously modeling appearance changes and object motions in video frames.
arXiv Detail & Related papers (2023-07-27T01:45:13Z)
FBLNet: FeedBack Loop Network for Driver Attention Prediction [75.83518507463226]
Nonobjective driving experience is difficult to model. In this paper, we propose a FeedBack Loop Network (FBLNet) which attempts to model the driving experience accumulation procedure. Under the guidance of the incremental knowledge, our model fuses the CNN feature and Transformer feature that are extracted from the input image to predict driver attention.
arXiv Detail & Related papers (2022-12-05T08:25:09Z)
Real-Time Driver Monitoring Systems through Modality and View Analysis [28.18784311981388]
Driver distractions are known to be the dominant cause of road accidents. State-of-the-art methods prioritize accuracy while ignoring latency. We propose time-effective detection models by neglecting the temporal relation between video frames.
arXiv Detail & Related papers (2022-10-17T21:22:41Z)
Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework. It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z)
Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design. Our proposed method obtains the frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
arXiv Detail & Related papers (2020-11-05T11:34:12Z)
Anomalous Motion Detection on Highway Using Deep Learning [14.617786106427834]
This paper presents a new anomaly detection dataset - the Highway Traffic Anomaly (HTA) dataset. We evaluate state-of-the-art deep learning anomaly detection models and propose novel variations to these methods.
arXiv Detail & Related papers (2020-06-15T05:40:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.