Related papers: Monocular 2D Camera-based Proximity Monitoring for Human-Machine Collision Warning on Construction Sites

URL: http://arxiv.org/abs/2305.17931v2
Date: Thu, 19 Oct 2023 16:48:41 GMT
Title: Monocular 2D Camera-based Proximity Monitoring for Human-Machine Collision Warning on Construction Sites
Authors: Yuexiong Ding, Xiaowei Luo
Abstract summary: Accident of struck-by machines is one of the leading causes of casualties on construction sites. Monitoring workers' proximities to avoid human-machine collisions has aroused great concern in construction safety management. This study proposes a novel framework for proximity monitoring using only an ordinary 2D camera to realize real-time human-machine collision warning.
Score: 1.7223564681760168
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accident of struck-by machines is one of the leading causes of casualties on construction sites. Monitoring workers' proximities to avoid human-machine collisions has aroused great concern in construction safety management. Existing methods are either too laborious and costly to apply extensively, or lacking spatial perception for accurate monitoring. Therefore, this study proposes a novel framework for proximity monitoring using only an ordinary 2D camera to realize real-time human-machine collision warning, which is designed to integrate a monocular 3D object detection model to perceive spatial information from 2D images and a post-processing classification module to identify the proximity as four predefined categories: Dangerous, Potentially Dangerous, Concerned, and Safe. A virtual dataset containing 22000 images with 3D annotations is constructed and publicly released to facilitate the system development and evaluation. Experimental results show that the trained 3D object detection model achieves 75% loose AP within 20 meters. Besides, the implemented system is real-time and camera carrier-independent, achieving an F1 of roughly 0.8 within 50 meters under specified settings for machines of different sizes. This study preliminarily reveals the potential and feasibility of proximity monitoring using only a 2D camera, providing a new promising and economical way for early warning of human-machine collisions.
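
The post-processing step described in the abstract, which maps the perceived spatial relationship between a worker and a machine to one of four categories, can be illustrated with a minimal sketch. The abstract does not give the classification rule, so everything below is an assumption for illustration only: the `Detection3D` type, the distance thresholds, and the use of straight-line distance between 3D box centers are hypothetical and are not the authors' method.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection3D:
    """Hypothetical output of a monocular 3D detector: class label and box center in meters."""
    label: str  # "worker" or "machine"
    x: float
    y: float
    z: float


# Hypothetical distance limits in meters; the source does not state any thresholds.
THRESHOLDS_M = [
    (5.0, "Dangerous"),
    (10.0, "Potentially Dangerous"),
    (20.0, "Concerned"),
]


def classify_proximity(distance_m: float) -> str:
    """Return the first category whose limit exceeds the distance, else "Safe"."""
    for limit, category in THRESHOLDS_M:
        if distance_m < limit:
            return category
    return "Safe"


def center_distance(a: Detection3D, b: Detection3D) -> float:
    """Straight-line distance between two box centers (an assumed proximity measure)."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


def proximity_warnings(detections: list[Detection3D]) -> list[tuple[Detection3D, Detection3D, str]]:
    """Classify every worker-machine pair found in one frame."""
    workers = [d for d in detections if d.label == "worker"]
    machines = [d for d in detections if d.label == "machine"]
    return [
        (w, m, classify_proximity(center_distance(w, m)))
        for w in workers
        for m in machines
    ]
```

A real system would likely account for machine size and pose rather than center points alone, since the abstract reports results "for machines of different sizes".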

Related papers

2.5D Object Detection for Intelligent Roadside Infrastructure [37.07785188366053]
We introduce a 2.5D object detection framework for infrastructure roadside-mounted cameras. We employ a prediction approach to detect ground planes of vehicles as parallelograms in the image frame. Our results show high detection accuracy, strong cross-viewpoint generalization, and robustness to diverse lighting and weather conditions.
arXiv Detail & Related papers (2025-07-04T13:16:59Z)
The Meeseeks Mesh: Spatially Consistent 3D Adversarial Objects for BEV Detector [37.74333887056029]
3D object detection is a critical component in autonomous driving systems. In this paper, we investigate the vulnerability of 3D object detection models to 3D adversarial attacks. We generate non-invasive 3D adversarial objects tailored for real-world attack scenarios.
arXiv Detail & Related papers (2025-05-28T15:49:54Z)
Sixth-Sense: Self-Supervised Learning of Spatial Awareness of Humans from a Planar Lidar [47.992786505913955]
Most commercially available service robots are equipped with cameras with a narrow field of view, making them blind when a user is approaching from other directions. We propose a self-supervised approach to detect humans and estimate their 2D pose from 1D LiDAR data, using detections from an RGB-D camera as a supervision source. Our model is capable of detecting humans omnidirectionally from 1D LiDAR data in a novel environment, with 71% precision and 80% recall, while retaining an average absolute error of 13 cm in distance and 44° in orientation.
arXiv Detail & Related papers (2025-02-28T13:22:12Z)
Online Collision Risk Estimation via Monocular Depth-Aware Object Detectors and Fuzzy Inference [6.856508678236828]
The framework takes two sets of predictions produced by different algorithms and associates their inconsistencies with the collision risk via fuzzy inference. We experimentally validate that, based on Intersection-over-Union (IoU) and a depth discrepancy measure, the inconsistencies between the two sets of predictions strongly correlate to the safety-related error of the 3D object detector.
arXiv Detail & Related papers (2024-11-09T20:20:36Z)
Uncertainty Estimation for 3D Object Detection via Evidential Learning [63.61283174146648]
We introduce a framework for quantifying uncertainty in 3D object detection by leveraging an evidential learning loss on Bird's Eye View representations in the 3D detector. We demonstrate both the efficacy and importance of these uncertainty estimates on identifying out-of-distribution scenes, poorly localized objects, and missing (false negative) detections.
arXiv Detail & Related papers (2024-10-31T13:13:32Z)
Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving [44.753797839280516]
Existing 3D detectors lack robustness to real-world corruptions caused by adverse weathers, sensor noises, etc. We benchmark 27 types of common corruptions for both LiDAR and camera inputs considering real-world driving scenarios. We conduct large-scale experiments on 24 diverse 3D object detection models to evaluate their robustness.
arXiv Detail & Related papers (2023-03-20T11:45:54Z)
Aerial Monocular 3D Object Detection [67.20369963664314]
DVDET is proposed to achieve aerial monocular 3D object detection in both the 2D image space and the 3D physical space. To address the severe view deformation issue, we propose a novel trainable geo-deformable transformation module. To encourage more researchers to investigate this area, we will release the dataset and related code.
arXiv Detail & Related papers (2022-08-08T08:32:56Z)
Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed as Homography Loss, is proposed to achieve the goal, which exploits both 2D and 3D information. Our method yields the best performance compared with the other state-of-the-arts by a large margin on KITTI 3D datasets.
arXiv Detail & Related papers (2022-04-02T03:48:03Z)
Exploring Adversarial Robustness of Multi-Sensor Perception Systems in Self Driving [87.3492357041748]
In this paper, we showcase practical susceptibilities of multi-sensor detection by placing an adversarial object on top of a host vehicle. Our experiments demonstrate that successful attacks are primarily caused by easily corrupted image features. Towards more robust multi-modal perception systems, we show that adversarial training with feature denoising can boost robustness to such attacks significantly.
arXiv Detail & Related papers (2021-01-17T21:15:34Z)
Detecting Invisible People [58.49425715635312]
We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects. We demonstrate that current detection and tracking systems perform dramatically worse on this task. Second, we build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
arXiv Detail & Related papers (2020-12-15T16:54:45Z)
Perceiving Humans: from Monocular 3D Localization to Social Distancing [93.03056743850141]
We present a new cost-effective vision-based method that perceives humans' locations in 3D and their body orientation from a single image. We show that it is possible to rethink the concept of "social distancing" as a form of social interaction in contrast to a simple location-based rule.
arXiv Detail & Related papers (2020-09-01T10:12:30Z)
Inter-Homines: Distance-Based Risk Estimation for Human Safety [44.266630835933434]
Our system evaluates in real-time the contagion risk in a monitored area by analyzing video streams. It is able to locate people in 3D space, calculate distances and predict risk levels. Inter-Homines works both indoor and outdoor, in public and private crowded areas.
arXiv Detail & Related papers (2020-07-20T16:32:27Z)
Training-free Monocular 3D Event Detection System for Traffic Surveillance [93.65240041833319]
Existing event detection systems are mostly learning-based and have achieved convincing performance when a large amount of training data is available. In real-world scenarios, collecting sufficient labeled training data is expensive and sometimes impossible. We propose a training-free monocular 3D event detection system for traffic surveillance.
arXiv Detail & Related papers (2020-02-01T04:42:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.