Related papers: Semantic4Safety: Causal Insights from Zero-shot Street View Imagery Segmentation for Urban Road Safety

Semantic4Safety: Causal Insights from Zero-shot Street View Imagery Segmentation for Urban Road Safety

URL: http://arxiv.org/abs/2510.15434v1
Date: Fri, 17 Oct 2025 08:45:28 GMT
Title: Semantic4Safety: Causal Insights from Zero-shot Street View Imagery Segmentation for Urban Road Safety
Authors: Huan Chen, Ting Han, Siyu Chen, Zhihao Guo, Yiping Chen, Meiliu Wu,
Abstract summary: We propose a framework that applies zero-shot semantic segmentation to street-view imagery to derive 11 interpretable streetscape indicators.<n>We analyze approximately 30,000 accident records in Austin to uncover accident-type-specific causal patterns.<n>By bridging predictive modeling with causal inference, Semantic4Safety supports targeted interventions and high-risk corridor diagnosis.
Score: 14.300819977717708
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Street-view imagery (SVI) offers a fine-grained lens on traffic risk, yet two fundamental challenges persist: (1) how to construct street-level indicators that capture accident-related features, and (2) how to quantify their causal impacts across different accident types. To address these challenges, we propose Semantic4Safety, a framework that applies zero-shot semantic segmentation to SVIs to derive 11 interpretable streetscape indicators, and integrates road type as contextual information to analyze approximately 30,000 accident records in Austin. Specifically, we train an eXtreme Gradient Boosting (XGBoost) multi-class classifier and use Shapley Additive Explanations (SHAP) to interpret both global and local feature contributions, and then apply Generalized Propensity Score (GPS) weighting and Average Treatment Effect (ATE) estimation to control confounding and quantify causal effects. Results uncover heterogeneous, accident-type-specific causal patterns: features capturing scene complexity, exposure, and roadway geometry dominate predictive power; larger drivable area and emergency space reduce risk, whereas excessive visual openness can increase it. By bridging predictive modeling with causal inference, Semantic4Safety supports targeted interventions and high-risk corridor diagnosis, offering a scalable, data-informed tool for urban road safety planning.

Related papers

Hazard-Aware Traffic Scene Graph Generation [9.549979680242195]
We introduce a novel task, Traffic Scene Graph Generation, which captures traffic-specific relations between prominent hazards and the ego vehicle.<n>We propose a novel framework that explicitly uses traffic accident data and depth cues to supplement visual features and semantic information for reasoning.<n>The output traffic scene graphs provide intuitive guidelines that stress prominent hazards by color-coding their severity and notating their effect mechanism and relative location to the ego vehicle.
arXiv Detail & Related papers (2026-03-03T23:29:11Z)
An Integrated Causal Inference Framework for Traffic Safety Modeling with Semantic Street-View Visual Features [9.356306102405549]
We applied semantic segmentation on Google Street View imageries to extract visual environmental features.<n>We used SHAP values to characterize the nonlinear influence mechanisms of confounding variables in the models and applied causal forests to estimate conditional average treatment effects.<n>Our findings provide causal evidence for greening as a potential safety intervention, prioritizing hazardous visual environments while highlighting the need for design optimizations to protect vulnerable road users (VRUs)
arXiv Detail & Related papers (2026-02-12T13:47:56Z)
Building a Foundational Guardrail for General Agentic Systems via Synthetic Data [76.18834864749606]
LLM agents can plan multi-step tasks, intervening at the planning stage-before any action is executed-is often the safest way to prevent harm.<n>Existing guardrails mostly operate post-execution, which is difficult to scale and leaves little room for controllable supervision at the plan level.<n>We introduce AuraGen, a controllable engine that synthesizes benign trajectories, injects category-labeled risks with difficulty, and filters outputs via an automated reward model.
arXiv Detail & Related papers (2025-10-10T18:42:32Z)
Towards Reliable and Interpretable Traffic Crash Pattern Prediction and Safety Interventions Using Customized Large Language Models [14.53510262691888]
TrafficSafe is a framework that adapts to reframe crash prediction and feature attribution as text-level reasoning.<n>Alcohol-impaired driving is the leading factor in severe crashes.<n>TrafficSafe highlights pivotal features during model training guiding strategic crash data collection improvements.
arXiv Detail & Related papers (2025-05-18T21:02:30Z)
The Impact of Building-Induced Visibility Restrictions on Intersection Accidents [32.698248334786854]
We develop a novel approach to estimate accident risk at intersections.<n>Method factors in the area visible to drivers, accounting for views blocked by buildings.<n>Findings reveal a notable correlation between the road visible percentage and accident frequency.
arXiv Detail & Related papers (2025-02-13T17:45:51Z)
StreetSurfGS: Scalable Urban Street Surface Reconstruction with Planar-based Gaussian Splatting [85.67616000086232]
StreetSurfGS is first method to employ Gaussian Splatting specifically tailored for scalable urban street scene surface reconstruction. StreetSurfGS utilizes a planar-based octree representation and segmented training to reduce memory costs, accommodate unique camera characteristics, and ensure scalability. To address sparse views and multi-scale challenges, we use a dual-step matching strategy that leverages adjacent and long-term information.
arXiv Detail & Related papers (2024-10-06T04:21:59Z)
Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses [76.59021017301127]
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports. We further formulate the crash event feature learning as a novel text reasoning problem and further fine-tune various large language models (LLMs) to predict detailed accident outcomes. Our experiments results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z)
Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view.<n>To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping [84.65114565766596]
We present OpenLane-V2, the first dataset on topology reasoning for traffic scene structure. OpenLane-V2 consists of 2,000 annotated road scenes that describe traffic elements and their correlation to the lanes. We evaluate various state-of-the-art methods, and present their quantitative and qualitative results on OpenLane-V2 to indicate future avenues for investigating topology reasoning in traffic scenes.
arXiv Detail & Related papers (2023-04-20T16:31:22Z)
Intersection Warning System for Occlusion Risks using Relational Local Dynamic Maps [0.0]
This work addresses the task of risk evaluation in traffic scenarios with limited observability due to restricted sensorial coverage. To identify the area of sight, we employ ray casting on a local dynamic map providing geometrical information and road infrastructure. Resulting risk indicators are utilized to evaluate the driver's current behavior, to warn the driver in critical situations, to give suggestions on how to act safely or to plan safe trajectories.
arXiv Detail & Related papers (2023-03-13T16:01:55Z)
Augmenting Ego-Vehicle for Traffic Near-Miss and Accident Classification Dataset using Manipulating Conditional Style Translation [0.3441021278275805]
There is no difference between accident and near-miss at the time before the accident happened. Our contribution is to redefine the accident definition and re-annotate the accident inconsistency on DADA-2000 dataset together with near-miss. The proposed method integrates two different components: conditional style translation (CST) and separable 3-dimensional convolutional neural network (S3D)
arXiv Detail & Related papers (2023-01-06T22:04:47Z)
Cognitive Accident Prediction in Driving Scenes: A Multimodality Benchmark [77.54411007883962]
We propose a Cognitive Accident Prediction (CAP) method that explicitly leverages human-inspired cognition of text description on the visual observation and the driver attention to facilitate model training. CAP is formulated by an attentive text-to-vision shift fusion module, an attentive scene context transfer module, and the driver attention guided accident prediction module. We construct a new large-scale benchmark consisting of 11,727 in-the-wild accident videos with over 2.19 million frames.
arXiv Detail & Related papers (2022-12-19T11:43:02Z)
Dynamic loss balancing and sequential enhancement for road-safety assessment and traffic scene classification [0.0]
Road-safety inspection is an indispensable instrument for reducing road-accident fatalities contributed to road infrastructure. Recent work formalizes road-safety assessment in terms of carefully selected risk factors that are also known as road-safety attributes. We propose to reduce dependency on tedious human labor by automating recognition with a two-stage neural architecture.
arXiv Detail & Related papers (2022-11-08T11:10:07Z)
Vision based Pedestrian Potential Risk Analysis based on Automated Behavior Feature Extraction for Smart and Safe City [5.759189800028578]
We propose a comprehensive analytical model for pedestrian potential risk using video footage gathered by road security cameras deployed at such crossings. The proposed system automatically detects vehicles and pedestrians, calculates trajectories by frames, and extracts behavioral features affecting the likelihood of potentially dangerous scenes between these objects. We validated feasibility and applicability by applying it in multiple crosswalks in Osan city, Korea.
arXiv Detail & Related papers (2021-05-06T11:03:10Z)
Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting [91.69900691029908]
We advocate for predicting both the individual motions as well as the scene occupancy map. We propose a Scene-Actor Graph Neural Network (SA-GNN) which preserves the relative spatial information of pedestrians. On two large-scale real-world datasets, we showcase that our scene-occupancy predictions are more accurate and better calibrated than those from state-of-the-art motion forecasting methods.
arXiv Detail & Related papers (2021-01-07T06:08:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.