Scene Graph-Guided Generative AI Framework for Synthesizing and Evaluating Industrial Hazard Scenarios
- URL: http://arxiv.org/abs/2511.13970v1
- Date: Mon, 17 Nov 2025 22:58:27 GMT
- Title: Scene Graph-Guided Generative AI Framework for Synthesizing and Evaluating Industrial Hazard Scenarios
- Authors: Sanjay Acharjee, Abir Khan Ratul, Diego Patino, Md Nazmus Sakib,
- Abstract summary: Training vision models to detect workplace hazards accurately requires realistic images of unsafe conditions that could lead to accidents. This study presents a novel scene graph-guided generative AI framework that synthesizes photorealistic images of hazardous scenarios grounded in historical Occupational Safety and Health Administration (OSHA) accident reports.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training vision models to detect workplace hazards accurately requires realistic images of unsafe conditions that could lead to accidents. However, acquiring such datasets is difficult because capturing accident-triggering scenarios as they occur is nearly impossible. To overcome this limitation, this study presents a novel scene graph-guided generative AI framework that synthesizes photorealistic images of hazardous scenarios grounded in historical Occupational Safety and Health Administration (OSHA) accident reports. OSHA narratives are analyzed using GPT-4o to extract structured hazard reasoning, which is converted into object-level scene graphs capturing the spatial and contextual relationships essential for understanding risk. These graphs guide a text-to-image diffusion model to generate compositionally accurate hazard scenes. To evaluate the realism and semantic fidelity of the generated data, a visual question answering (VQA) framework is introduced. Across four state-of-the-art generative models, the proposed VQA Graph Score outperforms CLIP- and BLIP-based metrics under entropy-based validation, confirming its higher discriminative sensitivity.
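As a concrete illustration of the pipeline, the sketch below turns extracted hazard relations into a scene graph, serializes the graph into a diffusion prompt, and derives yes/no verification questions whose answer rate acts as a graph-consistency score. The triple schema, prompt template, and question wording are illustrative assumptions, not the paper's exact formats.

```python
# Minimal sketch: turn extracted hazard triples into a scene graph,
# a diffusion prompt, and VQA-style verification questions.
# The triple schema and templates are illustrative, not the paper's.
import networkx as nx

# (subject, relation, object) triples, e.g. parsed by GPT-4o from a report
triples = [
    ("worker", "standing_on", "unguarded scaffold"),
    ("scaffold", "adjacent_to", "open edge"),
    ("worker", "not_wearing", "fall harness"),
]

def build_graph(triples):
    g = nx.DiGraph()
    for s, r, o in triples:
        g.add_edge(s, o, relation=r)
    return g

def build_prompt(g):
    # Serialize edges into a compositional text-to-image prompt.
    clauses = [f"a {s} {d['relation'].replace('_', ' ')} a {o}"
               for s, o, d in g.edges(data=True)]
    return "photorealistic industrial scene, " + ", ".join(clauses)

def build_questions(g):
    # One yes/no question per relation; a VQA model answering "yes"
    # counts toward a graph-consistency score.
    return [f"Is the {s} {d['relation'].replace('_', ' ')} the {o}?"
            for s, o, d in g.edges(data=True)]

def vqa_graph_score(answers):
    # Fraction of graph relations the VQA model confirms in the image.
    return sum(a == "yes" for a in answers) / max(len(answers), 1)

g = build_graph(triples)
print(build_prompt(g))
print(build_questions(g))
print(vqa_graph_score(["yes", "yes", "no"]))  # 0.67
```

In the full framework, a text-to-image diffusion model consumes the prompt and a VQA model answers the questions against the generated image; both are omitted here so the graph-handling logic stays self-contained.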
Related papers
- Learning to Explore: Policy-Guided Outlier Synthesis for Graph Out-of-Distribution Detection [51.93878677594561]
In unsupervised graph-level OOD detection, models are typically trained using only in-distribution (ID) data. We propose a Policy-Guided Outlier Synthesis framework that replaces static outlier synthesis with a learned exploration strategy.
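The blurb leaves the exploration strategy abstract; as a loose sketch of the idea, the epsilon-greedy bandit below learns which graph perturbation operator yields the most useful synthetic outliers. The operators and the reward signal are placeholders, not the paper's method.

```python
# Toy sketch: an epsilon-greedy policy picks which perturbation operator
# to apply to an in-distribution graph when synthesizing outliers; the
# reward stands in for how useful the outlier was for the OOD detector.
import random

OPS = ["drop_edge", "add_edge", "rewire_edge"]

def perturb(edges, op):
    edges = set(edges)
    if op == "drop_edge" and edges:
        edges.discard(random.choice(sorted(edges)))
    elif op == "add_edge":
        edges.add((random.randrange(10), random.randrange(10)))
    elif op == "rewire_edge" and edges:
        u, _ = random.choice(sorted(edges))
        edges.add((u, random.randrange(10)))
    return edges

q = {op: 0.0 for op in OPS}   # running value estimate per operator
n = {op: 0 for op in OPS}
graph = {(0, 1), (1, 2), (2, 3)}

for step in range(100):
    op = (random.choice(OPS) if random.random() < 0.2
          else max(q, key=q.get))            # epsilon-greedy choice
    outlier = perturb(graph, op)
    reward = random.random()                  # placeholder: detector gain
    n[op] += 1
    q[op] += (reward - q[op]) / n[op]         # incremental mean update
print(q)
```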
arXiv Detail & Related papers (2026-02-28T11:40:18Z) - Toward Autonomous Laboratory Safety Monitoring with Vision Language Models: Learning to See Hazards Through Scene Structure [26.434430112145137]
In laboratories, minor unsafe actions can lead to severe injuries, yet continuous safety monitoring is limited by human availability. Vision language models (VLMs) offer promise for autonomous laboratory safety monitoring.
arXiv Detail & Related papers (2026-01-31T00:08:41Z) - Associative Poisoning to Generative Machine Learning [5.094623170336122]
We introduce a novel data poisoning technique called associative poisoning. It compromises fine-grained features of the generated data without requiring control of the training process. The attack perturbs only the training data to manipulate statistical associations between specific feature pairs in the generated outputs.
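A rough numpy sketch of the stated mechanism, coupling one feature to another in a small fraction of training rows so that a generative model trained on the data inherits the spurious association; the feature indices, poison rate, and coupling rule are arbitrary illustrations:

```python
# Rough sketch: couple feature j to feature i in a small fraction of
# training rows so a generative model learns the spurious association.
# Indices, rate, and coupling rule are arbitrary illustrations.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))          # clean training data
i, j, rate = 2, 5, 0.05                   # feature pair, poison fraction

idx = rng.choice(len(X), size=int(rate * len(X)), replace=False)
X_poisoned = X.copy()
X_poisoned[idx, j] = X_poisoned[idx, i]   # force co-occurrence on poisoned rows

corr = lambda A: np.corrcoef(A[:, i], A[:, j])[0, 1]
print(f"corr clean:    {corr(X):+.3f}")
print(f"corr poisoned: {corr(X_poisoned):+.3f}")
```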
arXiv Detail & Related papers (2025-11-07T11:47:33Z) - Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety [0.0]
We propose a multimodal approach that integrates vision-language reasoning with zero-shot object detection. We refine object detection by incorporating OpenAI's CLIP model to match predicted hazards with bounding box annotations. Our findings highlight the strengths and limitations of current vision-language-based approaches.
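The CLIP-matching step can be sketched as scoring candidate hazard labels against image crops; the checkpoint name below is the public openai/clip-vit-base-patch32 model, while the frame path, boxes, and labels are made-up inputs:

```python
# Minimal sketch: score candidate hazard labels against image crops
# with CLIP; the frame path, boxes, and labels are made-up inputs.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("frame.jpg")
boxes = [(40, 60, 200, 220), (300, 100, 460, 300)]   # x1, y1, x2, y2
labels = ["a fallen tree on the road", "a loose tire", "debris"]

crops = [image.crop(b) for b in boxes]
inputs = processor(text=labels, images=crops, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)

for box, p in zip(boxes, probs):
    print(box, labels[p.argmax().item()], f"{p.max().item():.2f}")
```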
arXiv Detail & Related papers (2025-04-18T01:25:02Z) - Ensuring Medical AI Safety: Interpretability-Driven Detection and Mitigation of Spurious Model Behavior and Associated Data [14.991686165405959]
We show the applicability of the framework using four medical datasets across two modalities. We successfully identify and unlearn spurious biases in VGG16, ResNet50, and contemporary Vision Transformer models.
arXiv Detail & Related papers (2025-01-23T16:39:09Z) - Epistemic Uncertainty for Generated Image Detection [107.62647907393377]
We introduce a novel framework for AI-generated image detection through epistemic uncertainty, aiming to address critical security concerns in the era of generative models. Our key insight stems from the observation that distributional discrepancies between training and testing data manifest distinctively in the epistemic uncertainty space of machine learning models.
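A common proxy for epistemic uncertainty is disagreement across stochastic forward passes (Monte Carlo dropout); the sketch below uses that proxy as a stand-in for the paper's estimator, with an untrained placeholder classifier:

```python
# Sketch: estimate epistemic uncertainty as prediction variance across
# Monte Carlo dropout passes; high variance flags out-of-distribution
# (e.g., generated) images. MC dropout is a stand-in for the paper's
# estimator; the classifier here is an untrained placeholder.
import torch
import torch.nn as nn

model = nn.Sequential(                 # placeholder classifier
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 256), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(256, 2),
)

def epistemic_uncertainty(x, passes=32):
    model.train()                      # keep dropout active at inference
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(-1) for _ in range(passes)])
    return probs.var(dim=0).sum(dim=-1)   # per-sample variance

x = torch.randn(4, 3, 32, 32)             # fake image batch
scores = epistemic_uncertainty(x)
print(scores)  # threshold high scores as likely generated / OOD
```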
arXiv Detail & Related papers (2024-12-08T11:32:25Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models serving as the "brain" of EAI agents for high-level task planning have shown promising results. However, deploying these agents in physical environments raises significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - BadGD: A unified data-centric framework to identify gradient descent vulnerabilities [10.996626204702189]
BadGD provides a unified, data-centric framework for understanding and mitigating adversarial manipulations of gradient descent.
This research underscores the severe threats posed by such data-centric attacks and highlights the urgent need for robust defenses in machine learning.
arXiv Detail & Related papers (2024-05-24T23:39:45Z) - Data-Agnostic Model Poisoning against Federated Learning: A Graph Autoencoder Approach [65.2993866461477]
This paper proposes a data-agnostic model poisoning attack on Federated Learning (FL).
The attack requires no knowledge of FL training data and achieves both effectiveness and undetectability.
Experiments show that the FL accuracy drops gradually under the proposed attack and existing defense mechanisms fail to detect it.
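The graph-autoencoder attack itself is involved; the bare-bones FedAvg loop below, with one malicious client whose update is norm-matched to benign ones for stealth, only illustrates the aggregation step such attacks target. The dimensions, step size, and attack direction are all illustrative.

```python
# Bare-bones FedAvg with one malicious client. The attacker submits an
# update steered toward a chosen direction but scaled to match benign
# update norms, so simple norm-based defenses miss it. Illustrative only.
import numpy as np

rng = np.random.default_rng(1)
w = np.zeros(10)                               # global model weights

def benign_update(w):
    return -0.1 * (w - rng.normal(size=w.shape))   # noisy gradient step

def malicious_update(w, benign_norm):
    d = np.ones_like(w)                        # attacker's target direction
    return benign_norm * d / np.linalg.norm(d) # norm-matched for stealth

for rnd in range(50):
    updates = [benign_update(w) for _ in range(9)]
    ref = np.mean([np.linalg.norm(u) for u in updates])
    updates.append(malicious_update(w, ref))
    w = w + np.mean(updates, axis=0)           # FedAvg aggregation
print(w.round(2))  # drifts toward the attacker's direction over rounds
```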
arXiv Detail & Related papers (2023-11-30T12:19:10Z) - Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z) - Data-driven Design of Context-aware Monitors for Hazard Prediction in Artificial Pancreas Systems [2.126171264016785]
Medical Cyber-physical Systems (MCPS) are vulnerable to accidental or malicious faults that can target their controllers and cause safety hazards and harm to patients.
This paper proposes a combined model and data-driven approach for designing context-aware monitors that can detect early signs of hazards and mitigate them.
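A toy version of the monitor pattern described: forecast the physiological signal a few steps ahead and alarm when the predicted trajectory crosses a hazard threshold. The linear extrapolation, threshold, and horizon below are placeholders for the paper's learned, context-aware components.

```python
# Toy monitor: forecast glucose a few steps ahead with a simple linear
# extrapolation (stand-in for a learned predictor) and raise an alarm
# when the forecast crosses a hypoglycemia threshold.
import numpy as np

HYPO_MG_DL = 70          # hazard threshold (mg/dL), illustrative
HORIZON = 6              # look-ahead steps (e.g., 30 min at 5-min samples)

def forecast(history, horizon):
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, 1)   # linear trend fit
    future_t = np.arange(len(history), len(history) + horizon)
    return slope * future_t + intercept

def monitor(history):
    pred = forecast(history, HORIZON)
    below = pred < HYPO_MG_DL
    return int(np.argmax(below)) if below.any() else None

glucose = [120, 114, 107, 99, 92, 86]   # falling trend (mg/dL)
print(monitor(glucose))  # steps until predicted hazard, e.g. 2, else None
```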
arXiv Detail & Related papers (2021-04-06T14:36:33Z) - Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
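PPI's causally informed salience module is its own contribution; plain input-gradient saliency, sketched below with an untrained placeholder network, shows the generic mechanism such modules build on:

```python
# Baseline input-gradient saliency (the generic mechanism; PPI's
# causally informed module goes beyond this). Model is a placeholder.
import torch
from torchvision.models import resnet18

model = resnet18(weights=None).eval()    # untrained placeholder network
x = torch.randn(1, 3, 224, 224, requires_grad=True)

logits = model(x)
logits[0, logits.argmax()].backward()    # gradient of top-class logit

saliency = x.grad.abs().max(dim=1)[0]    # max over channels -> (1, H, W)
print(saliency.shape, saliency.max().item())
```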
arXiv Detail & Related papers (2020-12-06T20:30:26Z)