Benchmarking 2D Egocentric Hand Pose Datasets
- URL: http://arxiv.org/abs/2409.07337v1
- Date: Wed, 11 Sep 2024 15:18:11 GMT
- Title: Benchmarking 2D Egocentric Hand Pose Datasets
- Authors: Olga Taran, Damian M. Manzone, Jose Zariffa
- Abstract summary: Hand pose estimation from egocentric video has broad implications across various domains.
This work is devoted to the analysis of state-of-the-art egocentric datasets suitable for 2D hand pose estimation.
- Score: 1.611271868398988
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Hand pose estimation from egocentric video has broad implications across various domains, including human-computer interaction, assistive technologies, activity recognition, and robotics, making it a topic of significant research interest. The efficacy of modern machine learning models depends on the quality of data used for their training. Thus, this work is devoted to the analysis of state-of-the-art egocentric datasets suitable for 2D hand pose estimation. We propose a novel protocol for dataset evaluation, which encompasses not only the analysis of stated dataset characteristics and assessment of data quality, but also the identification of dataset shortcomings through the evaluation of state-of-the-art hand pose estimation models. Our study reveals that despite the availability of numerous egocentric databases intended for 2D hand pose estimation, the majority are tailored for specific use cases. There is no ideal benchmark dataset yet; however, the H2O and GANerated Hands datasets emerge as the most promising real and synthetic datasets, respectively.
Related papers
- Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks.
It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection.
Our work categorizes data augmentation methods into two main types: data generation and data perturbation.
arXiv Detail & Related papers (2024-03-13T16:05:18Z)
- Assessing Dataset Quality Through Decision Tree Characteristics in Autoencoder-Processed Spaces [0.30458514384586394]
We show the profound impact of dataset quality on model training and performance.
Our findings underscore the importance of appropriate feature selection, adequate data volume, and data quality.
This research offers valuable insights into data assessment practices, contributing to the development of more accurate and robust machine learning models.
arXiv Detail & Related papers (2023-06-27T11:33:31Z)
- Exploring the Effectiveness of Dataset Synthesis: An application of Apple Detection in Orchards [68.95806641664713]
We explore the usability of Stable Diffusion 2.1-base for generating synthetic datasets of apple trees for object detection.
We train a YOLOv5m object detection model to predict apples in a real-world apple detection dataset.
Results demonstrate that the model trained on generated data slightly underperforms a baseline model trained on real-world images.
arXiv Detail & Related papers (2023-06-20T09:46:01Z)
- Video-based Pose-Estimation Data as Source for Transfer Learning in Human Activity Recognition [71.91734471596433]
Human Activity Recognition (HAR) using on-body devices identifies specific human actions in unconstrained environments.
Previous works demonstrated that transfer learning is a good strategy for addressing scenarios with scarce data.
This paper proposes using datasets intended for human-pose estimation as a source for transfer learning.
arXiv Detail & Related papers (2022-12-02T18:19:36Z)
- Data-SUITE: Data-centric identification of in-distribution incongruous examples [81.21462458089142]
Data-SUITE is a data-centric framework to identify incongruous regions of in-distribution (ID) data.
We empirically validate Data-SUITE's performance and coverage guarantees.
arXiv Detail & Related papers (2022-02-17T18:58:31Z)
- Homogenization of Existing Inertial-Based Datasets to Support Human Activity Recognition [8.076841611508486]
Several techniques have been proposed to address the problem of recognizing activities of daily living from signals.
Deep learning techniques applied to inertial signals have proven to be effective, achieving significant classification accuracy.
Research in human activity recognition models has been almost totally model-centric.
arXiv Detail & Related papers (2022-01-17T14:29:48Z)
- Exploring the Efficacy of Automatically Generated Counterfactuals for Sentiment Analysis [17.811597734603144]
We propose an approach to automatically generating counterfactual data for data augmentation and explanation.
A comprehensive evaluation on several different datasets, using a variety of state-of-the-art benchmarks, demonstrates that our approach can achieve significant improvements in model performance.
arXiv Detail & Related papers (2021-06-29T10:27:01Z)
- Deep Learning-Based Human Pose Estimation: A Survey [66.01917727294163]
Human pose estimation has drawn increasing attention during the past decade.
It has been utilized in a wide range of applications including human-computer interaction, motion analysis, augmented reality, and virtual reality.
Recent deep learning-based solutions have achieved high performance in human pose estimation.
arXiv Detail & Related papers (2020-12-24T18:49:06Z)
- Ego2Hands: A Dataset for Egocentric Two-hand Segmentation and Detection [1.0742675209112622]
We present Ego2Hands, a large-scale RGB-based egocentric hand segmentation/detection dataset that is semi-automatically annotated.
For quantitative analysis, we manually annotated an evaluation set that significantly exceeds existing benchmarks in quantity, diversity and annotation accuracy.
arXiv Detail & Related papers (2020-11-14T10:12:35Z)
- DRG: Dual Relation Graph for Human-Object Interaction Detection [65.50707710054141]
We tackle the challenging problem of human-object interaction (HOI) detection.
Existing methods either recognize the interaction of each human-object pair in isolation or perform joint inference based on complex appearance-based features.
In this paper, we leverage an abstract spatial-semantic representation to describe each human-object pair and aggregate the contextual information of the scene via a dual relation graph.
arXiv Detail & Related papers (2020-08-26T17:59:40Z)
- On the Composition and Limitations of Publicly Available COVID-19 X-Ray Imaging Datasets [0.0]
Data scarcity, mismatch between training and target population, group imbalance, and lack of documentation are important sources of bias.
This paper presents an overview of the currently publicly available COVID-19 chest X-ray datasets.
arXiv Detail & Related papers (2020-08-26T14:16:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.