Performance of object recognition in wearable videos
- URL: http://arxiv.org/abs/2009.04932v1
- Date: Thu, 10 Sep 2020 15:20:17 GMT
- Title: Performance of object recognition in wearable videos
- Authors: Alberto Sabater, Luis Montesano, Ana C. Murillo
- Abstract summary: This work studies the problem of object detection and localization on videos captured by wearable cameras.
We present a study of the well-known YOLO architecture, which offers an excellent trade-off between accuracy and speed.
- Score: 9.669942356088377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wearable technologies are enabling plenty of new applications of computer
vision, from life logging to health assistance. Many of them are required to
recognize the elements of interest in the scene captured by the camera. This
work studies the problem of object detection and localization on videos
captured by this type of camera. Wearable videos are a much more challenging
scenario for object detection than standard images or even other types of
video, due to lower image quality (e.g., poor focus) and the high clutter and
occlusion common in wearable recordings. Existing work typically focuses on
detecting the objects of focus or those being manipulated by the user wearing
the camera. We perform a more general evaluation of the task of object
detection in this type of video, because numerous applications, such as
marketing studies, also need to detect objects that are not in the user's
focus. This work presents a thorough study of the well-known YOLO
architecture, which offers an excellent trade-off between accuracy and speed,
for the particular case of object detection in wearable video. We focus our
study on
the public ADL Dataset, but we also use additional public data for
complementary evaluations. We run an exhaustive set of experiments with
different variations of the original architecture and its training strategy.
Our experiments lead to several conclusions about the most promising
directions for our goal and point to further research steps for improving
detection in wearable videos.
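
For context on the kind of per-frame detection pipeline studied here, the following is a minimal sketch of running a pretrained YOLO detector over a video with the OpenCV DNN backend. The model files (yolov3.cfg, yolov3.weights), the input clip name, and the 0.5/0.4 confidence and NMS thresholds are illustrative assumptions, not the configuration evaluated in the paper.

```python
# Minimal per-frame YOLO detection sketch (OpenCV DNN backend).
# Model files, video name, and thresholds are placeholder assumptions,
# not the exact setup used in the paper.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
output_layers = net.getUnconnectedOutLayersNames()

cap = cv2.VideoCapture("wearable_video.mp4")  # hypothetical input clip
while True:
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(output_layers)

    boxes, scores, class_ids = [], [], []
    for output in outputs:
        for det in output:
            class_scores = det[5:]
            class_id = int(np.argmax(class_scores))
            score = float(class_scores[class_id])
            if score < 0.5:  # assumed confidence threshold
                continue
            cx, cy, bw, bh = det[0:4] * np.array([w, h, w, h])
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            scores.append(score)
            class_ids.append(class_id)

    if not boxes:
        continue
    # Per-frame non-maximum suppression to drop duplicate boxes.
    keep = cv2.dnn.NMSBoxes(boxes, scores, 0.5, 0.4)
    for i in np.array(keep).flatten():
        x, y, bw, bh = boxes[i]
        cv2.rectangle(frame, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
cap.release()
```

Running a still-image detector independently on each frame, as above, is the baseline setting; the experiments in the paper vary the architecture and its training strategy on top of it.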
Related papers
- FADE: A Dataset for Detecting Falling Objects around Buildings in Video [75.48118923174712]
Falling objects from buildings can cause severe injuries to pedestrians due to the great impact force they exert.
FADE contains 1,881 videos from 18 scenes, featuring 8 falling object categories, 4 weather conditions, and 4 video resolutions.
We develop a new object detection method called FADE-Net, which effectively leverages motion information.
arXiv Detail & Related papers (2024-08-11T11:43:56Z) - Empowering Visually Impaired Individuals: A Novel Use of Apple Live Photos and Android Motion Photos [3.66237529322911]
We advocate for the use of Apple Live Photos and Android Motion Photos technologies.
Our findings reveal that both Live Photos and Motion Photos outperform single-frame images in common visual assisting tasks.
arXiv Detail & Related papers (2023-09-14T20:46:35Z) - Ensemble Learning techniques for object detection in high-resolution satellite images [0.0]
Ensembling is a method that aims to maximize the detection performance by fusing individual detectors.
Ensembling methods have been widely used to achieve high scores in recent data science competitions, such as Kaggle.
arXiv Detail & Related papers (2022-02-16T10:19:21Z) - Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z) - Deep Learning Approach Protecting Privacy in Camera-Based Critical Applications [57.93313928219855]
We propose a deep learning approach towards protecting privacy in camera-based systems.
Our technique distinguishes between salient (visually prominent) and non-salient objects based on the intuition that the latter is unlikely to be needed by the application.
arXiv Detail & Related papers (2021-10-04T19:16:27Z) - ASOD60K: Audio-Induced Salient Object Detection in Panoramic Videos [79.05486554647918]
We propose PV-SOD, a new task that aims to segment salient objects from panoramic videos.
In contrast to existing fixation-level or object-level saliency detection tasks, we focus on multi-modal salient object detection (SOD).
We collect the first large-scale dataset, named ASOD60K, which contains 4K-resolution video frames annotated with a six-level hierarchy.
arXiv Detail & Related papers (2021-07-24T15:14:20Z) - Learning to Track Object Position through Occlusion [32.458623495840904]
Occlusion is one of the most significant challenges encountered by object detectors and trackers.
We propose a tracking-by-detection approach that builds upon the success of region based video object detectors.
Our approach achieves superior results on a dataset of furniture assembly videos collected from the internet.
arXiv Detail & Related papers (2021-06-20T22:29:46Z) - Few-Shot Learning for Video Object Detection in a Transfer-Learning Scheme [70.45901040613015]
We study the new problem of few-shot learning for video object detection.
We employ a transfer-learning framework to effectively train the video object detector on a large number of base-class objects and a few video clips of novel-class objects.
arXiv Detail & Related papers (2021-03-26T20:37:55Z) - A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection [56.82077636126353]
We take advantage of object-centric images to improve object detection in scene-centric images.
We present a simple yet surprisingly effective framework to do so.
Our approach can improve the object detection (and instance segmentation) accuracy of rare objects by 50% (and 33%) relatively.
arXiv Detail & Related papers (2021-02-17T17:27:21Z) - Robust and efficient post-processing for video object detection [9.669942356088377]
This work introduces a novel post-processing pipeline that overcomes some of the limitations of previous post-processing methods.
Our method improves the results of state-of-the-art video-specific detectors, especially regarding fast-moving objects.
Applied to efficient still-image detectors, such as YOLO, it provides results comparable to those of much more computationally intensive detectors (a toy sketch of this kind of temporal post-processing appears after this list).
arXiv Detail & Related papers (2020-09-23T10:47:24Z)
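
As a rough illustration of the temporal post-processing idea in the last entry above, here is a toy sketch that links per-frame detections of the same class by IoU and averages their confidence over a short window. All names and parameters (Detection, Track, iou_thr=0.5, window=5) are hypothetical; this is not the pipeline proposed in that paper.

```python
# Toy temporal post-processing sketch: link per-frame detections of the same
# class by IoU and smooth their confidence over a short window. Illustrative
# only; not the method from the cited post-processing paper.
from dataclasses import dataclass, field

@dataclass
class Detection:
    box: tuple        # (x1, y1, x2, y2)
    score: float
    class_id: int

@dataclass
class Track:
    detections: list = field(default_factory=list)

    def smoothed_score(self, window=5):
        """Average confidence over the last `window` linked detections."""
        recent = [d.score for d in self.detections[-window:]]
        return sum(recent) / len(recent)

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def link(tracks, frame_detections, iou_thr=0.5):
    """Greedily attach each new detection to the best-matching open track."""
    for det in frame_detections:
        best, best_iou = None, iou_thr
        for tr in tracks:
            last = tr.detections[-1]
            overlap = iou(last.box, det.box)
            if last.class_id == det.class_id and overlap > best_iou:
                best, best_iou = tr, overlap
        if best is None:
            tracks.append(Track(detections=[det]))
        else:
            best.detections.append(det)
    return tracks
```

A pipeline like this would run on top of a per-frame detector's output; linked detections with a low smoothed score can then be suppressed, while consistently strong ones are kept.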
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented here and is not responsible for any consequences of its use.