Psych-Occlusion: Using Visual Psychophysics for Aerial Detection of Occluded Persons during Search and Rescue
- URL: http://arxiv.org/abs/2412.05553v1
- Date: Sat, 07 Dec 2024 06:22:42 GMT
- Title: Psych-Occlusion: Using Visual Psychophysics for Aerial Detection of Occluded Persons during Search and Rescue
- Authors: Arturo Miguel Russell Bernal, Jane Cleland-Huang, Walter Scheirer
- Abstract summary: With the increasing use of small Unmanned Aerial Systems (sUAS) as "eyes in the sky" during Emergency Response (ER) scenarios, efficient detection of persons from aerial views plays a crucial role in achieving a successful mission outcome.
However, the performance of Computer Vision (CV) models onboard sUAS degrades substantially under the rigorous real-life conditions of a typical ER scenario.
We exemplify the use of our behavioral dataset, Psych-ER, by using its human accuracy data to adapt the loss function of a detection model.
- Score: 41.03292974500013
- License:
- Abstract: The success of Emergency Response (ER) scenarios, such as search and rescue, is often dependent upon the prompt location of a lost or injured person. With the increasing use of small Unmanned Aerial Systems (sUAS) as "eyes in the sky" during ER scenarios, efficient detection of persons from aerial views plays a crucial role in achieving a successful mission outcome. Fatigue of human operators during prolonged ER missions, coupled with limited human resources, highlights the need for sUAS equipped with Computer Vision (CV) capabilities to aid in finding the person from aerial views. However, the performance of CV models onboard sUAS substantially degrades under real-life rigorous conditions of a typical ER scenario, where person search is hampered by occlusion and low target resolution. To address these challenges, we extracted images from the NOMAD dataset and performed a crowdsource experiment to collect behavioural measurements when humans were asked to "find the person in the picture". We exemplify the use of our behavioral dataset, Psych-ER, by using its human accuracy data to adapt the loss function of a detection model. We tested our loss adaptation on a RetinaNet model evaluated on NOMAD against increasing distance and occlusion, with our psychophysical loss adaptation showing improvements over the baseline at higher distances across different levels of occlusion, without degrading performance at closer distances. To the best of our knowledge, our work is the first human-guided approach to address the location task of a detection model, while addressing real-world challenges of aerial search and rescue. All datasets and code can be found at: https://github.com/ArtRuss/NOMAD.
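The abstract describes adapting a detection model's loss function with human accuracy data but does not give the formula. The following is only a minimal sketch of the general idea, under stated assumptions: each training image carries a human accuracy value in [0, 1], and the box-regression (smooth L1) loss of each sample is scaled by `1 + scale * human_accuracy`, so that errors on images humans find easy cost more. The weighting rule, the function names, and the `scale` parameter are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence


def smooth_l1(pred: Sequence[float], target: Sequence[float], beta: float = 1.0) -> float:
    """Mean smooth-L1 loss over the coordinates of one bounding box."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        # Quadratic near zero, linear for larger errors.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)


def psych_weighted_box_loss(
    pred_boxes: Sequence[Sequence[float]],
    target_boxes: Sequence[Sequence[float]],
    human_accuracy: Sequence[float],
    scale: float = 1.0,
) -> float:
    """Average box loss with a per-sample weight of 1 + scale * human_accuracy.

    The weighting rule is an illustrative assumption, not the paper's formulation.
    """
    if not (len(pred_boxes) == len(target_boxes) == len(human_accuracy)):
        raise ValueError("inputs must have the same length")
    total = 0.0
    for pred, target, acc in zip(pred_boxes, target_boxes, human_accuracy):
        if not 0.0 <= acc <= 1.0:
            raise ValueError("human accuracy must lie in [0, 1]")
        total += (1.0 + scale * acc) * smooth_l1(pred, target)
    return total / len(pred_boxes)


if __name__ == "__main__":
    preds = [[0.0, 0.0, 10.0, 10.0]]
    targets = [[0.0, 0.0, 10.0, 12.0]]
    # Same prediction error, different (hypothetical) human accuracy values.
    print(psych_weighted_box_loss(preds, targets, [0.0]))
    print(psych_weighted_box_loss(preds, targets, [1.0]))
```

With `scale=0.0` the function reduces to the unweighted mean smooth-L1 loss, which makes it easy to compare against a baseline. Other choices, such as weighting by `1 - human_accuracy` or using reaction times, would be equally plausible under this sketch.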
Related papers
- Human Body Restoration with One-Step Diffusion Model and A New Benchmark [74.66514054623669]
We propose a high-quality dataset automated cropping and filtering (HQ-ACF) pipeline.
This pipeline leverages existing object detection datasets and other unlabeled images to automatically crop and filter high-quality human images.
We also propose OSDHuman, a novel one-step diffusion model for human body restoration.
arXiv Detail & Related papers (2025-02-03T14:48:40Z)
- UAV-Enhanced Combination to Application: Comprehensive Analysis and Benchmarking of a Human Detection Dataset for Disaster Scenarios [3.759682200711633]
Unmanned aerial vehicles (UAVs) have revolutionized search and rescue (SAR) operations.
The lack of specialized human detection datasets for training machine learning models poses a significant challenge.
This paper introduces the Combination to Application (C2A) dataset, synthesized by overlaying human poses onto UAV-captured disaster scenes.
arXiv Detail & Related papers (2024-08-09T08:07:19Z)
- AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation [55.179287851188036]
We introduce a novel all-in-one-stage framework, AiOS, for expressive human pose and shape recovery without an additional human detection step.
We first employ a human token to probe a human location in the image and encode global features for each instance.
Then, we introduce a joint-related token to probe the human joint in the image and encode a fine-grained local feature.
arXiv Detail & Related papers (2024-03-26T17:59:23Z)
- NOMAD: A Natural, Occluded, Multi-scale Aerial Dataset, for Emergency Response Scenarios [41.03292974500013]
Natural, Occluded, Multi-scale Aerial dataset (NOMAD) is a benchmark dataset for human detection under occluded aerial views.
NOMAD is composed of 100 different Actors, all performing sequences of walking, lying and hiding.
It includes 42,825 frames, extracted from 5.4k resolution videos, and manually annotated with a bounding box and a label describing 10 different visibility levels.
arXiv Detail & Related papers (2023-09-18T06:57:00Z)
- Using Features at Multiple Temporal and Spatial Resolutions to Predict Human Behavior in Real Time [2.955419572714387]
We present an approach for integrating high and low-resolution spatial and temporal information to predict human behavior in real time.
Our model composes neural networks for high and low-resolution feature extraction with a neural network for behavior prediction, with all three networks trained simultaneously.
arXiv Detail & Related papers (2022-11-12T18:41:33Z)
- An Outlier Exposure Approach to Improve Visual Anomaly Detection Performance for Mobile Robots [76.36017224414523]
We consider the problem of building visual anomaly detection systems for mobile robots.
Standard anomaly detection models are trained using large datasets composed only of non-anomalous data.
We tackle the problem of exploiting these data to improve the performance of a Real-NVP anomaly detection model.
arXiv Detail & Related papers (2022-09-20T15:18:13Z)
- Aerial View Goal Localization with Reinforcement Learning [6.165163123577484]
We present a framework that emulates a search-and-rescue (SAR)-like setup without requiring access to actual UAVs.
In this framework, an agent operates on top of an aerial image (proxy for a search area) and is tasked with localizing a goal that is described in terms of visual cues.
We propose AiRLoc, a reinforcement learning (RL)-based model that decouples exploration (searching for distant goals) and exploitation (localizing nearby goals).
arXiv Detail & Related papers (2022-09-08T10:27:53Z)
- Rethinking Drone-Based Search and Rescue with Aerial Person Detection [79.76669658740902]
The visual inspection of aerial drone footage is an integral part of land search and rescue (SAR) operations today.
We propose a novel deep learning algorithm to automate this aerial person detection (APD) task.
We present the novel Aerial Inspection RetinaNet (AIR) algorithm as the combination of these contributions.
arXiv Detail & Related papers (2021-11-17T21:48:31Z)
- Batch Exploration with Examples for Scalable Robotic Reinforcement Learning [63.552788688544254]
Batch Exploration with Examples (BEE) explores relevant regions of the state-space guided by a modest number of human provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
arXiv Detail & Related papers (2020-10-22T17:49:25Z)
- DeepSOCIAL: Social Distancing Monitoring and Infection Risk Assessment in COVID-19 Pandemic [1.027974860479791]
Social distancing is a recommended solution by the World Health Organisation (WHO) to minimise the spread of COVID-19 in public places.
We develop a hybrid Computer Vision and YOLOv4-based Deep Neural Network model for automated people detection in the crowd using common CCTV cameras.
The developed model is a generic and accurate people detection and tracking solution that can be applied in many other fields.
arXiv Detail & Related papers (2020-08-26T16:56:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.