GSO-YOLO: Global Stability Optimization YOLO for Construction Site Detection
- URL: http://arxiv.org/abs/2407.00906v1
- Date: Mon, 1 Jul 2024 02:15:27 GMT
- Title: GSO-YOLO: Global Stability Optimization YOLO for Construction Site Detection
- Authors: Yuming Zhang, Dongzhi Guan, Shouxin Zhang, Junhao Su, Yunzhi Han, Jiabin Liu
- Abstract summary: This study presents the Global Stability Optimization YOLO (GSO-YOLO) model to address challenges in complex construction sites.
The model integrates the Global Optimization Module (GOM) and Steady Capture Module (SCM) to enhance global contextual information capture and detection stability.
Experiments on datasets like SODA, MOCS, and CIS show that GSO-YOLO outperforms existing methods, achieving SOTA performance.
- Score: 4.2114456503277315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safety issues at construction sites have long plagued the industry, posing risks to worker safety and causing economic damage due to potential hazards. With the advancement of artificial intelligence, particularly in the field of computer vision, the automation of safety monitoring on construction sites has emerged as a solution to this longstanding issue. Despite achieving impressive performance, advanced object detection methods like YOLOv8 still face challenges in handling the complex conditions found at construction sites. To solve these problems, this study presents the Global Stability Optimization YOLO (GSO-YOLO) model. The model integrates the Global Optimization Module (GOM) and Steady Capture Module (SCM) to enhance global contextual information capture and detection stability. The innovative AIoU loss function, which combines CIoU and EIoU, improves detection accuracy and efficiency. Experiments on datasets such as SODA, MOCS, and CIS show that GSO-YOLO outperforms existing methods, achieving SOTA performance.
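The abstract says only that the AIoU loss combines CIoU and EIoU; it does not give the formula. As a rough illustration, the sketch below implements the commonly used CIoU and EIoU losses for axis-aligned boxes and blends them with a hypothetical weight `mix`. The blend, and the names `_box_terms`, `ciou_loss`, `eiou_loss`, and `blended_iou_loss`, are assumptions made for illustration and are not the paper's AIoU definition.

```python
import math


def _box_terms(box, gt):
    """Shared geometry for two boxes given as (x1, y1, x2, y2).

    Assumes both boxes have strictly positive width and height.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1
    # Intersection over union
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w * h + gw * gh - inter)
    # Width and height of the smallest box enclosing both inputs
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    # Squared distance between the two box centers
    rho2 = ((x1 + x2) - (gx1 + gx2)) ** 2 / 4 + ((y1 + y2) - (gy1 + gy2)) ** 2 / 4
    return iou, w, h, gw, gh, cw, ch, rho2


def ciou_loss(box, gt):
    """CIoU loss: IoU term + center distance term + aspect-ratio term."""
    iou, w, h, gw, gh, cw, ch, rho2 = _box_terms(box, gt)
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    denom = (1 - iou) + v
    alpha = v / denom if denom > 0 else 0.0
    return 1 - iou + rho2 / (cw ** 2 + ch ** 2) + alpha * v


def eiou_loss(box, gt):
    """EIoU loss: IoU term + center distance term + width and height terms."""
    iou, w, h, gw, gh, cw, ch, rho2 = _box_terms(box, gt)
    return (
        1 - iou
        + rho2 / (cw ** 2 + ch ** 2)
        + (w - gw) ** 2 / cw ** 2
        + (h - gh) ** 2 / ch ** 2
    )


def blended_iou_loss(box, gt, mix=0.5):
    """Hypothetical convex blend of CIoU and EIoU (NOT the paper's AIoU)."""
    return mix * ciou_loss(box, gt) + (1 - mix) * eiou_loss(box, gt)


if __name__ == "__main__":
    pred = (0.0, 0.0, 4.0, 2.0)
    target = (0.0, 0.0, 2.0, 2.0)
    print(ciou_loss(pred, target), eiou_loss(pred, target),
          blended_iou_loss(pred, target))
```

With `mix=1.0` the blend reduces to CIoU and with `mix=0.0` to EIoU, which makes it easy to compare the two penalties on the same pair of boxes.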
Related papers
- CIB-SE-YOLOv8: Optimized YOLOv8 for Real-Time Safety Equipment Detection on Construction Sites [4.028949797830281]
This study presents a computer vision-based solution using YOLO for real-time helmet detection.
Our proposed CIB-SE-YOLOv8 model incorporates SE attention mechanisms and modified C2f blocks, enhancing detection accuracy and efficiency.
arXiv Detail & Related papers (2024-10-28T03:07:03Z)
- An Adaptive End-to-End IoT Security Framework Using Explainable AI and LLMs [1.9662978733004601]
This paper presents an innovative framework for real-time IoT attack detection and response that leverages Machine Learning (ML), Explainable AI (XAI), and Large Language Models (LLMs).
Our end-to-end framework not only facilitates a seamless transition from model development to deployment but also represents a real-world application capability that is often lacking in existing research.
arXiv Detail & Related papers (2024-09-20T03:09:23Z)
- DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios [2.615648035076649]
We propose a novel real-time object detector, DS MYOLO, inspired by Mamba's outstanding performance.
This detector captures global feature information through a simplified selective scanning fusion block (SimVSS Block) and effectively integrates the network's deep features.
Experiments on the CCTSDB 2021 and VLD-45 driving scenarios demonstrate that DS MYOLO exhibits significant potential and competitive advantage.
arXiv Detail & Related papers (2024-09-02T09:22:33Z)
- EAIRiskBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [47.69642609574771]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EAIRiskBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
- Enhanced Model Robustness to Input Corruptions by Per-corruption Adaptation of Normalization Statistics [22.876222327262596]
We introduce Per-corruption Adaptation of Normalization statistics (PAN) to enhance the model robustness of vision systems.
Our approach entails three key components: (i) a corruption type identification module, (ii) dynamic adjustment of normalization layer statistics based on identified corruption type, and (iii) real-time update of these statistics according to input data.
arXiv Detail & Related papers (2024-07-08T23:20:18Z)
- Mamba YOLO: SSMs-Based YOLO For Object Detection [9.879086222226617]
Mamba-YOLO is a novel object detection model based on State Space Models.
We show that Mamba-YOLO surpasses the existing YOLO series models in both performance and competitiveness.
arXiv Detail & Related papers (2024-06-09T15:56:19Z)
- Generative AI Agents with Large Language Model for Satellite Networks via a Mixture of Experts Transmission [74.10928850232717]
This paper develops generative artificial intelligence (AI) agents for model formulation and then applies a mixture of experts (MoE) to design transmission strategies.
Specifically, we leverage large language models (LLMs) to build an interactive modeling paradigm.
We propose an MoE-proximal policy optimization (PPO) approach to solve the formulated problem.
arXiv Detail & Related papers (2024-04-14T03:44:54Z)
- Highlighting the Safety Concerns of Deploying LLMs/VLMs in Robotics [54.57914943017522]
We highlight the critical issues of robustness and safety associated with integrating large language models (LLMs) and vision-language models (VLMs) into robotics applications.
arXiv Detail & Related papers (2024-02-15T22:01:45Z)
- Filling the Missing: Exploring Generative AI for Enhanced Federated Learning over Heterogeneous Mobile Edge Devices [72.61177465035031]
We propose a generative AI-empowered federated learning to address these challenges by leveraging the idea of FIlling the MIssing (FIMI) portion of local data.
Experiment results demonstrate that FIMI can save up to 50% of the device-side energy to achieve the target global test accuracy.
arXiv Detail & Related papers (2023-10-21T12:07:04Z)
- Robust Single Image Dehazing Based on Consistent and Contrast-Assisted Reconstruction [95.5735805072852]
We propose a novel density-variational learning framework to improve the robustness of the image dehazing model.
Specifically, the dehazing network is optimized under the consistency-regularized framework.
Our method significantly surpasses the state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-29T08:11:04Z)
- Interpretable Hyperspectral AI: When Non-Convex Modeling meets Hyperspectral Remote Sensing [57.52865154829273]
Hyperspectral imaging, also known as image spectrometry, is a landmark technique in geoscience remote sensing (RS).
In the past decade, efforts have been made to process and analyze these hyperspectral (HS) products, mainly by means of seasoned experts.
For this reason, it is urgent to develop more intelligent and automatic approaches for various HS RS applications.
arXiv Detail & Related papers (2021-03-02T03:32:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.