EGOFALLS: A visual-audio dataset and benchmark for fall detection using
egocentric cameras
- URL: http://arxiv.org/abs/2309.04579v3
- Date: Thu, 2 Nov 2023 22:41:32 GMT
- Title: EGOFALLS: A visual-audio dataset and benchmark for fall detection using
egocentric cameras
- Authors: Xueyi Wang
- Abstract summary: Falls are significant and often fatal for vulnerable populations such as the elderly.
Previous works have addressed fall detection by relying on data captured by a single sensor, such as images or accelerometers.
In this work, we rely on multimodal descriptors extracted from videos captured by egocentric cameras.
- Score: 0.16317061277456998
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Falls are significant and often fatal for vulnerable populations such as the
elderly. Previous works have addressed fall detection by relying on data
captured by a single sensor, such as images or accelerometers. In this work, we
rely on multimodal descriptors extracted from videos captured by egocentric
cameras. Our proposed method includes a late decision fusion layer that builds
on top of the extracted descriptors. Furthermore, we collect a new dataset on
which we assess our proposed approach. We believe this is the first public
dataset of its kind. The dataset comprises 10,948 video samples collected from 14 subjects.
We conducted ablation experiments to assess the performance of individual
feature extractors, fusion of visual information, and fusion of both visual and
audio information. Moreover, we experimented with internal and external
cross-validation. Our results demonstrate that the fusion of audio and visual
information through late decision fusion improves detection performance, making
it a promising tool for fall prevention and mitigation.
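As a rough illustration of the late decision fusion described above, here is a minimal sketch assuming one classifier per modality that outputs class probabilities and a simple weighted-averaging rule; the function names, the averaging rule, and the example numbers are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def late_decision_fusion(modality_probs, weights=None):
    """Fuse per-modality fall/no-fall probabilities by weighted averaging.

    modality_probs: list of arrays, each of shape (n_classes,), one per
        modality (e.g. a visual-descriptor classifier and an audio classifier).
    weights: optional per-modality weights; uniform if omitted.
    """
    probs = np.stack(modality_probs)                  # (n_modalities, n_classes)
    if weights is None:
        weights = np.full(len(modality_probs), 1.0 / len(modality_probs))
    weights = np.asarray(weights, dtype=float) / np.sum(weights)
    fused = weights @ probs                           # weighted average per class
    return int(np.argmax(fused)), fused

# Hypothetical per-modality outputs for one clip: [P(no fall), P(fall)]
visual_probs = np.array([0.35, 0.65])   # classifier on visual descriptors
audio_probs  = np.array([0.20, 0.80])   # classifier on audio descriptors
label, fused = late_decision_fusion([visual_probs, audio_probs])
print(label, fused)                      # 1, [0.275 0.725] -> predicted "fall"
```

Because fusion happens at the decision level, each modality's classifier can be trained and tuned independently, and a missing modality can simply be dropped from the average.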
Related papers
- Practical Video Object Detection via Feature Selection and Aggregation [18.15061460125668]
Video object detection (VOD) must contend with high across-frame variation in object appearance and diverse degradation in some frames.
Most contemporary aggregation methods are tailored to two-stage detectors and suffer from high computational costs.
This study introduces a simple yet potent feature selection and aggregation strategy, gaining significant accuracy at marginal computational expense.
arXiv Detail & Related papers (2024-07-29T02:12:11Z) - Bayesian Detector Combination for Object Detection with Crowdsourced Annotations [49.43709660948812]
Acquiring fine-grained object detection annotations in unconstrained images is time-consuming, expensive, and prone to noise.
We propose a novel Bayesian Detector Combination (BDC) framework to more effectively train object detectors with noisy crowdsourced annotations.
BDC is model-agnostic, requires no prior knowledge of the annotators' skill level, and seamlessly integrates with existing object detection models.
arXiv Detail & Related papers (2024-07-10T18:00:54Z) - Towards Viewpoint Robustness in Bird's Eye View Segmentation [85.99907496019972]
We study how AV perception models are affected by changes in camera viewpoint.
Small changes to pitch, yaw, depth, or height of the camera at inference time lead to large drops in performance.
We introduce a technique for novel view synthesis and use it to transform collected data to the viewpoint of target rigs.
arXiv Detail & Related papers (2023-09-11T02:10:07Z) - An Outlier Exposure Approach to Improve Visual Anomaly Detection
Performance for Mobile Robots [76.36017224414523]
We consider the problem of building visual anomaly detection systems for mobile robots.
Standard anomaly detection models are trained using large datasets composed only of non-anomalous data.
We tackle the problem of exploiting these data to improve the performance of a Real-NVP anomaly detection model.
arXiv Detail & Related papers (2022-09-20T15:18:13Z) - Fall detection using multimodal data [1.8149327897427234]
This paper studies the fall detection problem based on a large public dataset, namely the UP-Fall Detection dataset.
We propose several techniques to obtain valuable features from these sensors and cameras and then construct suitable models for the main problem.
arXiv Detail & Related papers (2022-05-12T07:13:34Z) - Egocentric Human-Object Interaction Detection Exploiting Synthetic Data [19.220651860718892]
We consider the problem of detecting Egocentric Human-Object Interactions (EHOIs) in industrial contexts.
We propose a pipeline and a tool to generate photo-realistic synthetic First Person Vision (FPV) images automatically labeled for EHOI detection.
arXiv Detail & Related papers (2022-04-14T15:59:15Z) - Target-aware Dual Adversarial Learning and a Multi-scenario
Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse them in the common space, either by iterative optimization or by deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls it into a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z) - Multi-Perspective Anomaly Detection [3.3511723893430476]
We build upon the deep support vector data description algorithm and address multi-perspective anomaly detection.
We employ different augmentation techniques with a denoising process to deal with scarce one-class data.
We evaluate our approach on the new dices dataset using images from two different perspectives and also benchmark on the standard MNIST dataset.
arXiv Detail & Related papers (2021-05-20T17:07:36Z) - Data Augmentation for Object Detection via Differentiable Neural
Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to tackle this problem include semi-supervised learning that interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z) - Ensembling object detectors for image and video data analysis [98.26061123111647]
We propose a method for ensembling the outputs of multiple object detectors for improving detection performance and precision of bounding boxes on image data.
We extend it to video data by proposing a two-stage tracking-based scheme for detection refinement.
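(A toy sketch of this kind of output-level box ensembling appears after this list.)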
arXiv Detail & Related papers (2021-02-09T12:38:16Z) - Multi-Modal Fingerprint Presentation Attack Detection: Evaluation On A
New Dataset [9.783887684870654]
Fingerprint presentation attack detection is becoming an increasingly challenging problem.
We study the usefulness of multiple recently introduced sensing modalities.
We conducted a comprehensive analysis using a fully convolutional deep neural network framework.
arXiv Detail & Related papers (2020-06-12T22:38:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.