Spatiotemporal Semantic V2X Framework for Cooperative Collision Prediction
- URL: http://arxiv.org/abs/2601.17216v2
- Date: Tue, 27 Jan 2026 21:44:41 GMT
- Title: Spatiotemporal Semantic V2X Framework for Cooperative Collision Prediction
- Authors: Murat Arda Onsu, Poonam Lohan, Burak Kantarci, Aisha Syed, Matthew Andrews, Sean Kennedy
- Abstract summary: Intelligent Transportation Systems (ITS) demand real-time collision prediction to ensure road safety and reduce accident severity. Conventional approaches rely on transmitting raw video or high-dimensional sensory data from roadside units (RSUs) to vehicles. We propose a semantic V2X framework in which RSU-mounted video cameras generate semantic embeddings of future frames.
- Score: 5.862522659881676
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Intelligent Transportation Systems (ITS) demand real-time collision prediction to ensure road safety and reduce accident severity. Conventional approaches rely on transmitting raw video or high-dimensional sensory data from roadside units (RSUs) to vehicles, which is impractical under vehicular communication bandwidth and latency constraints. In this work, we propose a semantic V2X framework in which RSU-mounted cameras generate spatiotemporal semantic embeddings of future frames using the Video Joint Embedding Predictive Architecture (V-JEPA). To evaluate the system, we construct a digital twin of an urban traffic environment enabling the generation of diverse traffic scenarios with both safe and collision events. These embeddings of the future frame, extracted from V-JEPA, capture task-relevant traffic dynamics and are transmitted via V2X links to vehicles, where a lightweight attentive probe and classifier decode them to predict imminent collisions. By transmitting only semantic embeddings instead of raw frames, the proposed system significantly reduces communication overhead while maintaining predictive accuracy. Experimental results demonstrate that the framework with an appropriate processing method achieves a 10% F1-score improvement for collision prediction while reducing transmission requirements by four orders of magnitude compared to raw video. This validates the potential of semantic V2X communication to enable cooperative, real-time collision prediction in ITS.
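The receiver-side decoding step described in the abstract (a lightweight attentive probe pooling the transmitted embedding tokens, followed by a classifier) can be sketched as follows. This is a minimal illustration only: the token count, embedding dimension, and random weights are hypothetical placeholders, not the paper's actual V-JEPA configuration or trained probe.

```python
# Sketch of an attentive probe + classifier decoding a received
# spatiotemporal semantic embedding into a collision probability.
# All shapes and parameters are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_probe(tokens, query, w_out, b_out):
    """tokens: (N, D) embedding tokens of the predicted future frame."""
    scores = tokens @ query              # (N,) attention logits per token
    weights = softmax(scores)            # attention distribution over tokens
    pooled = weights @ tokens            # (D,) attention-weighted summary
    logit = pooled @ w_out + b_out       # scalar collision logit
    return 1.0 / (1.0 + np.exp(-logit))  # probability of imminent collision

# Toy example: 196 tokens of dimension 384 (ViT-like sizes, assumed).
tokens = rng.standard_normal((196, 384))
query = rng.standard_normal(384)
w_out = rng.standard_normal(384)
p = attentive_probe(tokens, query, w_out, 0.0)
print(f"collision probability: {p:.3f}")
```

In the deployed system the query and output weights would be learned on the vehicle side, so only the compact token array needs to cross the V2X link.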
Related papers
- Attention in Motion: Secure Platooning via Transformer-based Misbehavior Detection [0.6999740786886536]
Vehicular platooning promises transformative improvements in transportation efficiency and safety through the coordination of multi-vehicle formations. Traditional misbehaviour detection approaches, which rely on plausibility checks and statistical methods, suffer from high False Positive (FP) rates. We present Attention In Motion (AIMformer), a transformer-based framework specifically tailored for real-time misbehaviour detection in vehicular platoons.
arXiv Detail & Related papers (2025-12-17T14:45:33Z)
- SemAgent: Semantic-Driven Agentic AI Empowered Trajectory Prediction in Vehicular Networks [26.85167428129155]
This paper presents a trajectory prediction framework that integrates semantic communication with Agentic AI to enhance predictive performance in vehicular environments. In vehicle-to-infrastructure (V2I) communication, a feature-extraction agent at the Roadside Unit (RSU) derives compact representations from historical vehicle trajectories, followed by semantic reasoning performed by a semantic-analysis agent. The RSU then transmits both feature representations and semantic insights to the target vehicle via semantic communication, enabling the vehicle to predict future trajectories by combining received semantics with its own historical data.
arXiv Detail & Related papers (2025-11-30T11:06:58Z)
- Edge-Assisted ML-Aided Uncertainty-Aware Vehicle Collision Avoidance at Urban Intersections [12.812518632907771]
We present a novel framework that preemptively detects collisions at urban crossroads.
We exploit the Multi-access Edge Computing platform of 5G networks.
arXiv Detail & Related papers (2024-04-22T18:45:40Z)
- AccidentBlip: Agent of Accident Warning based on MA-former [24.81148840857782]
AccidentBlip is a vision-only framework that employs our self-designed Motion Accident Transformer (MA-former) to process each frame of video. AccidentBlip achieves strong performance in both accident detection and prediction tasks on the DeepAccident dataset. It also outperforms current SOTA methods in V2V and V2X scenarios, demonstrating a superior capability to understand complex real-world environments.
arXiv Detail & Related papers (2024-04-18T12:54:25Z)
- DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving [76.29141888408265]
We propose a large-scale dataset containing diverse accident scenarios that frequently occur in real-world driving.
The proposed DeepAccident dataset includes 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset.
arXiv Detail & Related papers (2023-04-03T17:37:00Z)
- Context-Aware Target Classification with Hybrid Gaussian Process prediction for Cooperative Vehicle Safety systems [2.862606936691229]
Vehicle-to-Everything (V2X) communication has been proposed as a potential solution to improve the robustness and safety of autonomous vehicles.
Cooperative Vehicle Safety (CVS) applications are critically dependent on the reliability of the underlying data system.
We propose a Context-Aware Target Classification (CA-TC) module and a hybrid learning-based predictive modeling technique for CVS systems.
arXiv Detail & Related papers (2022-12-24T22:03:08Z)
- Cognitive Accident Prediction in Driving Scenes: A Multimodality Benchmark [77.54411007883962]
We propose a Cognitive Accident Prediction (CAP) method that explicitly leverages human-inspired cognition of text description on the visual observation and the driver attention to facilitate model training.
CAP is formulated by an attentive text-to-vision shift fusion module, an attentive scene context transfer module, and the driver attention guided accident prediction module.
We construct a new large-scale benchmark consisting of 11,727 in-the-wild accident videos with over 2.19 million frames.
arXiv Detail & Related papers (2022-12-19T11:43:02Z)
- COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles [54.61668577827041]
We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving.
Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate.
arXiv Detail & Related papers (2022-05-04T17:55:12Z)
- Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting [91.69900691029908]
We advocate for predicting both the individual motions as well as the scene occupancy map.
We propose a Scene-Actor Graph Neural Network (SA-GNN) which preserves the relative spatial information of pedestrians.
On two large-scale real-world datasets, we showcase that our scene-occupancy predictions are more accurate and better calibrated than those from state-of-the-art motion forecasting methods.
arXiv Detail & Related papers (2021-01-07T06:08:21Z)
- Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)
- TPNet: Trajectory Proposal Network for Motion Prediction [81.28716372763128]
Trajectory Proposal Network (TPNet) is a novel two-stage motion prediction framework.
TPNet first generates a candidate set of future trajectories as hypothesis proposals, then makes the final predictions by classifying and refining the proposals.
Experiments on four large-scale trajectory prediction datasets, show that TPNet achieves the state-of-the-art results both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-04-26T00:01:49Z)
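The two-stage proposal-then-refine pattern summarized in the TPNet entry above can be illustrated with a minimal sketch: stage one fans out candidate future trajectories as hypotheses, stage two scores them and refines the best one. The proposal generator, scorer, and refinement rule here are hypothetical stand-ins, not TPNet's actual networks.

```python
# Toy two-stage motion predictor: propose trajectory hypotheses,
# then score and refine the best candidate. Illustrative only.
import numpy as np

def propose_trajectories(pos, vel, n_props=8, horizon=10, dt=0.1):
    """Stage 1: candidate futures from constant-speed heading hypotheses."""
    speed = np.linalg.norm(vel)
    base = np.arctan2(vel[1], vel[0])
    angles = base + np.linspace(-0.5, 0.5, n_props)   # fan of headings
    steps = np.arange(1, horizon + 1) * dt            # rollout timestamps
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=-1)
    # (n_props, horizon, 2): straight-line rollout per heading hypothesis
    return pos + speed * steps[None, :, None] * dirs[:, None, :]

def score_and_refine(proposals, target):
    """Stage 2: score proposals (a learned classifier in TPNet; here a
    simple endpoint distance stand-in), then refine the winner."""
    endpoints = proposals[:, -1]
    scores = -np.linalg.norm(endpoints - target, axis=-1)
    best = proposals[np.argmax(scores)]
    # Toy refinement: blend the winning rollout toward the target.
    blend = np.linspace(0.0, 1.0, best.shape[0])[:, None]
    return best + (target - best[-1]) * blend

pos, vel = np.array([0.0, 0.0]), np.array([10.0, 0.0])
target = np.array([9.5, 1.0])
props = propose_trajectories(pos, vel)
traj = score_and_refine(props, target)
print("refined endpoint:", traj[-1])
```

The split mirrors the entry's description: classification selects among discrete hypotheses, and regression-style refinement handles the residual.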
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences.