SAVeD: A First-Person Social Media Video Dataset for ADAS-equipped vehicle Near-Miss and Crash Event Analyses
- URL: http://arxiv.org/abs/2512.17724v1
- Date: Fri, 19 Dec 2025 15:58:52 GMT
- Title: SAVeD: A First-Person Social Media Video Dataset for ADAS-equipped vehicle Near-Miss and Crash Event Analyses
- Authors: Shaoyan Zhai, Mohamed Abdel-Aty, Chenzhu Wang, Rodrigo Vena Garcia,
- Abstract summary: This paper introduces SAVeD, a large-scale video dataset curated from publicly available social media content. SAVeD features 2,119 first-person videos, capturing ADAS vehicle operations in diverse locations, lighting conditions, and weather scenarios. The dataset includes video frame-level annotations for collisions, evasive maneuvers, and disengagements, enabling analysis of both perception and decision-making failures.
- Score: 0.7874708385247353
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advancement of safety-critical research in driving behavior in ADAS-equipped vehicles requires real-world datasets that not only include diverse traffic scenarios but also capture high-risk edge cases such as near-miss events and system failures. However, existing datasets are largely limited to either simulated environments or human-driven vehicle data, lacking authentic ADAS (Advanced Driver Assistance System) vehicle behavior under risk conditions. To address this gap, this paper introduces SAVeD, a large-scale video dataset curated from publicly available social media content, explicitly focused on ADAS vehicle-related crashes, near-miss incidents, and disengagements. SAVeD features 2,119 first-person videos, capturing ADAS vehicle operations in diverse locations, lighting conditions, and weather scenarios. The dataset includes video frame-level annotations for collisions, evasive maneuvers, and disengagements, enabling analysis of both perception and decision-making failures. We demonstrate SAVeD's utility through multiple analyses and contributions: (1) We propose a novel framework integrating semantic segmentation and monocular depth estimation to compute real-time Time-to-Collision (TTC) for dynamic objects. (2) We utilize the Generalized Extreme Value (GEV) distribution to model and quantify the extreme risk in crash and near-miss events across different roadway types. (3) We establish benchmarks for state-of-the-art VLLMs (VideoLLaMA2 and InternVL2.5 HiCo R16), showing that SAVeD's detailed annotations significantly enhance model performance through domain adaptation in complex near-miss scenarios.
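The abstract does not spell out how TTC is derived from the depth estimates, so the following is only a minimal sketch of the common constant-closing-speed definition (TTC = current distance / closing speed). It assumes a per-frame distance to one tracked object is already available, for example the median estimated depth inside that object's segmentation mask. The function name `time_to_collision` and its parameters are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence


def time_to_collision(depths_m: Sequence[float], fps: float) -> Optional[float]:
    """Estimate time-to-collision (seconds) for one tracked object.

    depths_m: estimated distance to the object (meters) in consecutive
              frames, oldest first. Assumed to come from a depth model.
    fps:      frame rate of the video.

    Returns None when the object is not getting closer, since TTC is
    undefined (effectively infinite) in that case.
    """
    if len(depths_m) < 2:
        raise ValueError("need at least two depth samples")
    if fps <= 0:
        raise ValueError("fps must be positive")

    # Time elapsed between the first and last sample.
    elapsed_s = (len(depths_m) - 1) / fps

    # Positive closing speed means the gap is shrinking.
    closing_speed = (depths_m[0] - depths_m[-1]) / elapsed_s
    if closing_speed <= 0:
        return None

    # Constant-speed assumption: remaining distance / closing speed.
    return depths_m[-1] / closing_speed


if __name__ == "__main__":
    # Object approaches from 20 m to 18 m over three frames at 10 fps.
    print(time_to_collision([20.0, 19.0, 18.0], fps=10.0))
```

In this example the gap shrinks by 2 m in 0.2 s (10 m/s), so the remaining 18 m gives a TTC of roughly 1.8 s. Real monocular depth is noisy, so a practical pipeline would likely smooth the depth series before differencing.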
Related papers
- Data-Driven Analysis of Crash Patterns in SAE Level 2 and Level 4 Automated Vehicles Using K-means Clustering and Association Rule Mining [0.17205106391379021]
Automated Vehicles (AV) hold potential to reduce or eliminate human driving errors, enhance traffic safety, and support sustainable mobility. Recently, crash data has increasingly revealed that AV behavior can deviate from expected safety outcomes, raising concerns about the technology's safety and operational reliability in mixed traffic environments. This study analyzes over 2,500 AV crash records from the United States National Highway Traffic Safety Administration (NHTSA), covering SAE Levels 2 and 4 to uncover underlying crash dynamics.
arXiv Detail & Related papers (2025-12-27T13:30:07Z)
- From Narratives to Probabilistic Reasoning: Predicting and Interpreting Drivers' Hazardous Actions in Crashes Using Large Language Model [3.3457493284891338]
Two-vehicle crashes account for approximately 70% of roadway crashes. Driver Hazardous Action (DHA) data is limited by inconsistent and labor-intensive manual coding practices. Here, we present an innovative framework that leverages a fine-tuned large language model to automatically infer DHAs from textual crash narratives.
arXiv Detail & Related papers (2025-10-14T21:35:47Z)
- AccidentBench: Benchmarking Multimodal Understanding and Reasoning in Vehicle Accidents and Beyond [101.20320617562321]
AccidentBench is a large-scale benchmark that combines vehicle accident scenarios with Beyond domains. The benchmark contains approximately 2000 videos and over 19000 human-annotated question-answer pairs.
arXiv Detail & Related papers (2025-09-30T17:59:13Z)
- CoReVLA: A Dual-Stage End-to-End Autonomous Driving Framework for Long-Tail Scenarios via Collect-and-Refine [73.74077186298523]
CoReVLA is a continual learning framework for autonomous driving. It improves the performance in long-tail scenarios through a dual-stage process of data Collection and behavior Refinement. CoReVLA achieves a Driving Score (DS) of 72.18 and a Success Rate (SR) of 50%, outperforming state-of-the-art methods by 7.96 DS and 15% SR under long-tail, safety-critical scenarios.
arXiv Detail & Related papers (2025-09-19T13:25:56Z)
- Towards Intelligent Transportation with Pedestrians and Vehicles In-the-Loop: A Surveillance Video-Assisted Federated Digital Twin Framework [62.47416496137193]
We propose a surveillance video assisted federated digital twin (SV-FDT) framework to empower ITSs with pedestrians and vehicles in-the-loop. The architecture consists of three layers: (i) the end layer, which collects traffic surveillance videos from multiple sources; (ii) the edge layer, responsible for semantic segmentation-based visual understanding, twin agent-based interaction modeling, and local digital twin system (LDTS) creation in local regions; and (iii) the cloud layer, which integrates LDTSs across different regions to construct a global DT model in real time.
arXiv Detail & Related papers (2025-03-06T07:36:06Z)
- CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions [13.981748780317329]
Accurately and promptly predicting accidents among surrounding traffic agents from camera footage is crucial for the safety of autonomous vehicles (AVs).
This study introduces a novel accident anticipation framework for AVs, termed CRASH.
It seamlessly integrates five components: object detector, feature extractor, object-aware module, context-aware module, and multi-layer fusion.
Our model surpasses existing top baselines in critical evaluation metrics like Average Precision (AP) and mean Time-To-Accident (mTTA).
arXiv Detail & Related papers (2024-07-25T04:12:49Z)
- AccidentBlip: Agent of Accident Warning based on MA-former [24.81148840857782]
AccidentBlip is a vision-only framework that employs our self-designed Motion Accident Transformer (MA-former) to process each frame of video. AccidentBlip achieves performance in both accident detection and prediction tasks on the DeepAccident dataset. It also outperforms current SOTA methods in V2V and V2X scenarios, demonstrating a superior capability to understand complex real-world environments.
arXiv Detail & Related papers (2024-04-18T12:54:25Z)
- Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view. To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
- DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving [76.29141888408265]
We propose a large-scale dataset containing diverse accident scenarios that frequently occur in real-world driving.
The proposed DeepAccident dataset includes 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset.
arXiv Detail & Related papers (2023-04-03T17:37:00Z)
- Unsupervised Driving Event Discovery Based on Vehicle CAN-data [62.997667081978825]
This work presents a simultaneous clustering and segmentation approach for vehicle CAN-data that identifies common driving events in an unsupervised manner.
We evaluate our approach with a dataset of real Tesla Model 3 vehicle CAN-data and a two-hour driving session that we annotated with different driving events.
arXiv Detail & Related papers (2023-01-12T13:10:47Z)
- Augmenting Ego-Vehicle for Traffic Near-Miss and Accident Classification Dataset using Manipulating Conditional Style Translation [0.3441021278275805]
There is no difference between accident and near-miss at the time before the accident happened.
Our contribution is to redefine the accident definition and re-annotate the accident inconsistency on the DADA-2000 dataset together with near-miss.
The proposed method integrates two different components: conditional style translation (CST) and separable 3-dimensional convolutional neural network (S3D).
arXiv Detail & Related papers (2023-01-06T22:04:47Z)
- An Attention-guided Multistream Feature Fusion Network for Localization of Risky Objects in Driving Videos [10.674638266121574]
This paper proposes an attention-guided multistream feature fusion network (AM-Net) to localize dangerous traffic agents from dashcam videos.
Two Gated Recurrent Unit (GRU) networks use object bounding box and optical flow features extracted from consecutive video frames to capture temporal cues for distinguishing dangerous traffic agents.
Fusing the two streams of features, AM-Net predicts the riskiness scores of traffic agents in the video.
arXiv Detail & Related papers (2022-09-16T13:36:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.