RiskCueBench: Benchmarking Anticipatory Reasoning from Early Risk Cues in Video-Language Models
- URL: http://arxiv.org/abs/2601.03369v1
- Date: Tue, 06 Jan 2026 19:14:49 GMT
- Title: RiskCueBench: Benchmarking Anticipatory Reasoning from Early Risk Cues in Video-Language Models
- Authors: Sha Luo, Yogesh Prabhu, Tim Ossowski, Kaiping Chen, Junjie Hu
- Abstract summary: We introduce a new video understanding benchmark RiskCueBench in which videos are carefully annotated to identify a risk signal clip. Experimental results reveal a significant gap in current systems' ability to interpret evolving situations and anticipate future risky events from early visual signals.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the rapid growth of video-centered social media, the ability to anticipate risky events from visual data is a promising direction for ensuring public safety and preventing real-world accidents. Prior work has extensively studied supervised video risk assessment across domains such as driving, protests, and natural disasters. However, many existing datasets provide models with access to the full video sequence, including the accident itself, which substantially reduces the difficulty of the task. To better reflect real-world conditions, we introduce a new video understanding benchmark RiskCueBench in which videos are carefully annotated to identify a risk signal clip, defined as the earliest moment that indicates a potential safety concern. Experimental results reveal a significant gap in current systems' ability to interpret evolving situations and anticipate future risky events from early visual signals, highlighting important challenges for deploying video risk prediction models in practice.
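The evaluation protocol described in the abstract can be sketched as follows. This is a hypothetical illustration, not the authors' actual code: the `RiskVideo` fields, the `predict` interface, and the frame identifiers are all assumptions. The key idea is that the model only ever sees frames up to the end of the annotated risk signal clip, never the outcome itself.

```python
# Hypothetical sketch of an anticipatory evaluation loop: truncate each video
# at the annotated risk signal clip and score whether the predictor
# anticipates the (unseen) risky outcome. All names here are illustrative.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class RiskVideo:
    frames: List[str]        # frame identifiers (placeholder for pixel data)
    cue_end_frame: int       # index of the last frame of the risk signal clip
    risky_outcome: bool      # ground truth: does a risky event follow?

def anticipation_accuracy(videos: List[RiskVideo],
                          predict: Callable[[List[str]], bool]) -> float:
    """Score a predictor that sees only the pre-outcome prefix of each video."""
    correct = 0
    for v in videos:
        prefix = v.frames[: v.cue_end_frame + 1]  # the outcome stays hidden
        correct += int(predict(prefix) == v.risky_outcome)
    return correct / len(videos)

# Toy usage with a trivial rule-based predictor.
toy = [
    RiskVideo(["f0", "f1", "cue", "crash"], cue_end_frame=2, risky_outcome=True),
    RiskVideo(["f0", "f1", "f2", "f3"], cue_end_frame=3, risky_outcome=False),
]
naive = lambda frames: "cue" in frames
print(anticipation_accuracy(toy, naive))  # 1.0 on this toy pair
```

Truncating at the cue boundary is what makes the task anticipatory: a model that merely recognizes accidents once they appear scores no better than chance under this protocol.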
Related papers
- The Missing Half: Unveiling Training-time Implicit Safety Risks Beyond Deployment [148.80266237240713]
implicit training-time safety risks are driven by a model's internal incentives and contextual background information. We present the first systematic study of this problem, introducing a taxonomy with five risk levels, ten fine-grained risk categories, and three incentive types. Our results identify an overlooked yet urgent safety challenge in training.
arXiv Detail & Related papers (2026-02-04T04:23:58Z) - From Pretrain to Pain: Adversarial Vulnerability of Video Foundation Models Without Task Knowledge [57.379583179331426]
This paper investigates a novel and practical adversarial threat scenario: attacking downstream models or MLLMs fine-tuned from open-source VFMs. We propose the Transferable Video Attack (TVA), a temporal-aware adversarial attack method that leverages the temporal representation dynamics of VFMs to craft effective perturbations. TVA avoids the need to train expensive surrogate models or access domain-specific data, thereby offering a more practical and efficient attack strategy.
arXiv Detail & Related papers (2025-11-10T12:42:32Z) - AccidentBench: Benchmarking Multimodal Understanding and Reasoning in Vehicle Accidents and Beyond [101.20320617562321]
AccidentBench is a large-scale benchmark that combines vehicle accident scenarios with safety-critical domains beyond driving. The benchmark contains approximately 2,000 videos and over 19,000 human-annotated question-answer pairs.
arXiv Detail & Related papers (2025-09-30T17:59:13Z) - Understanding and Benchmarking the Trustworthiness in Multimodal LLMs for Video Understanding [59.50808215134678]
This study introduces Trust-videoLLMs, the first comprehensive benchmark evaluating 23 state-of-the-art videoLLMs. Results reveal significant limitations in dynamic scene comprehension, cross-modal resilience, and real-world risk mitigation.
arXiv Detail & Related papers (2025-06-14T04:04:54Z) - Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs [45.265397990158846]
Video-SafetyBench is the first benchmark designed to evaluate the safety of LVLMs under video-text attacks. It comprises 2,264 video-text pairs spanning 48 fine-grained unsafe categories. To generate semantically accurate videos for safety evaluation, we design a controllable pipeline that decomposes video semantics into subject images and motion text.
arXiv Detail & Related papers (2025-05-17T05:06:38Z) - RiskNet: Interaction-Aware Risk Forecasting for Autonomous Driving in Long-Tail Scenarios [6.024186631622774]
RiskNet is an interaction-aware risk forecasting framework for autonomous vehicles. It integrates deterministic risk modeling with probabilistic behavior prediction for comprehensive risk assessment. It supports real-time, scenario-adaptive risk forecasting and demonstrates strong generalization across uncertain driving environments.
arXiv Detail & Related papers (2025-04-22T02:36:54Z) - OpenAI o1 System Card [274.83891368890977]
The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought. This report outlines the safety work carried out for the OpenAI o1 and OpenAI o1-mini models, including safety evaluations, external red teaming, and Preparedness Framework evaluations.
arXiv Detail & Related papers (2024-12-21T18:04:31Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models serving as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - Risk-Aware Vehicle Trajectory Prediction Under Safety-Critical Scenarios [25.16311876790003]
This paper proposes a risk-aware trajectory prediction framework tailored to safety-critical scenarios.
We introduce a safety-critical trajectory prediction dataset and tailored evaluation metrics.
Results demonstrate the superior performance of our model, with a significant improvement in most metrics.
arXiv Detail & Related papers (2024-07-18T13:00:01Z) - T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models [39.15695612766001]
We introduce T2VSafetyBench, a new benchmark for safety-critical assessments of text-to-video models.
We define 12 critical aspects of video generation safety and construct a malicious prompt dataset.
No single model excels in all aspects, with different models showing various strengths.
There is a trade-off between the usability and safety of text-to-video generative models.
arXiv Detail & Related papers (2024-07-08T14:04:58Z) - Uncertainty-Aware Probabilistic Graph Neural Networks for Road-Level Traffic Accident Prediction [6.570852598591727]
We introduce the Spatiotemporal Zero-Inflated Tweedie Graph Neural Network (STZITD-GNN), the first uncertainty-aware graph deep learning model for multi-step road-level traffic accident prediction.
Our study demonstrates that STZITD-GNN can effectively inform targeted road monitoring, thereby improving urban road safety strategies.
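The zero-inflated distribution named in this abstract addresses the fact that most road segments record zero accidents in any period. A minimal sketch of the idea, using a zero-inflated Poisson for simplicity (the paper itself uses a Tweedie distribution, which this does not reproduce):

```python
# Sketch of a zero-inflated count likelihood: mix a point mass at zero
# (probability pi) with an ordinary count distribution. Poisson is used here
# only as a simple stand-in for the Tweedie distribution in the paper.
import math

def zip_logpmf(y: int, pi: float, lam: float) -> float:
    """Zero-inflated Poisson log-probability of observing count y."""
    if y == 0:
        # Zero can come from the point mass or from the Poisson itself.
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    return math.log(1.0 - pi) - lam + y * math.log(lam) - math.lgamma(y + 1)

# Zero inflation raises the probability of observing zero accidents.
p_zero_plain = math.exp(zip_logpmf(0, 0.0, 2.0))  # plain Poisson: e^{-2}
p_zero_infl = math.exp(zip_logpmf(0, 0.5, 2.0))
print(p_zero_plain < p_zero_infl)  # True
```

Modeling the excess zeros explicitly keeps the count component from being dragged toward zero, which is why such likelihoods suit sparse accident data.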
arXiv Detail & Related papers (2023-09-10T16:35:47Z) - Safety Margins for Reinforcement Learning [53.10194953873209]
We show how to leverage proxy criticality metrics to generate safety margins.
We evaluate our approach on learned policies from APE-X and A3C within an Atari environment.
arXiv Detail & Related papers (2023-07-25T16:49:54Z) - Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning [30.59728753059457]
Traffic accident anticipation aims to predict accidents from dashcam videos as early as possible.
Current deterministic deep neural networks could be overconfident in false predictions.
We propose an uncertainty-based accident anticipation model with relational-temporal learning.
arXiv Detail & Related papers (2020-08-01T20:21:48Z)
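The overconfidence problem noted in the last abstract motivates uncertainty-aware prediction: draw several stochastic forward passes and use their spread as a confidence signal. The sketch below simulates this with noisy samplers; the sampling functions are stand-ins, not the paper's actual network.

```python
# Minimal sketch of Monte Carlo uncertainty estimation for a risk score:
# repeated stochastic predictions whose variance flags unreliable outputs.
# Both samplers below are synthetic stand-ins for a real stochastic model.
import random

def mc_risk(sample_score, n: int = 50):
    """Mean and variance of n stochastic risk-score samples."""
    scores = [sample_score() for _ in range(n)]
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / n
    return mean, var

random.seed(0)
clamp = lambda x: min(max(x, 0.0), 1.0)
confident = lambda: clamp(random.gauss(0.9, 0.02))   # tight, high-risk score
uncertain = lambda: clamp(random.gauss(0.5, 0.30))   # noisy, ambiguous score

m1, v1 = mc_risk(confident)
m2, v2 = mc_risk(uncertain)
print(v1 < v2)  # the ambiguous case shows far higher variance
```

A deployed system could suppress or defer alarms whose sample variance exceeds a threshold, which is one way to address the overconfident false predictions the abstract describes.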
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.