Exploring the Impact of Hand Pose and Shadow on Hand-washing Action Recognition
- URL: http://arxiv.org/abs/2407.09520v1
- Date: Wed, 19 Jun 2024 21:49:12 GMT
- Title: Exploring the Impact of Hand Pose and Shadow on Hand-washing Action Recognition
- Authors: Shengtai Ju, Amy R. Reibman
- Abstract summary: In this paper, we investigate how pose and shadow impact a classifier's performance.
We show that a classifier's breakdown points are heavily impacted by pose and shadow conditions.
Intriguingly, model accuracy drops to almost zero with larger changes in pose.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the real world, camera-based application systems can face many challenges, including environmental factors and distribution shift. In this paper, we investigate how pose and shadow impact a classifier's performance, using the specific application of handwashing action recognition. To accomplish this, we generate synthetic data with desired variations to introduce controlled distribution shift. Using our synthetic dataset, we define a classifier's breakdown points to be where the system's performance starts to degrade sharply, and we show these are heavily impacted by pose and shadow conditions. In particular, heavier and larger shadows create earlier breakdown points. Intriguingly, model accuracy drops to almost zero with larger changes in pose. Moreover, we propose a simple mitigation strategy for pose-induced breakdown points by utilizing additional training data from non-canonical poses. Results show that the optimal choices of additional training poses are those with moderate deviations of 50-60 degrees of rotation from the canonical poses.
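The paper does not publish code, and its exact breakdown-point criterion is not given in the abstract; the following is a minimal sketch of how such a point might be located, assuming accuracy is measured over a sweep of a single shift variable (e.g. rotation angle) and that "degrades sharply" is approximated by a fixed drop threshold relative to the canonical-pose accuracy. The function name and threshold are illustrative assumptions.

```python
# Hypothetical sketch: locating a classifier's breakdown point along a
# controlled distribution shift (e.g. hand rotation angle). The paper's
# exact criterion is not published; here we flag the first shift value
# where accuracy falls below a fixed fraction of canonical accuracy.

def breakdown_point(shift_values, accuracies, drop_fraction=0.5):
    """Return the first shift value whose accuracy falls below
    drop_fraction * accuracy at the canonical (first) condition,
    or None if performance never degrades that far."""
    baseline = accuracies[0]
    for shift, acc in zip(shift_values, accuracies):
        if acc < drop_fraction * baseline:
            return shift
    return None

# Example: accuracy over increasing rotation (degrees); the numbers
# are illustrative only, not results from the paper.
angles = [0, 10, 20, 30, 40, 50, 60, 70]
accs   = [0.95, 0.94, 0.90, 0.82, 0.60, 0.40, 0.10, 0.02]
print(breakdown_point(angles, accs))  # -> 50 (0.40 < 0.5 * 0.95)
```

A real evaluation would estimate each accuracy over many synthetic samples at that shift value before applying the threshold.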
Related papers
- Data Augmentation via Latent Diffusion for Saliency Prediction [67.88936624546076]
Saliency prediction models are constrained by the limited diversity and quantity of labeled data.
We propose a novel data augmentation method for deep saliency prediction that edits natural images while preserving the complexity and variability of real-world scenes.
arXiv Detail & Related papers (2024-09-11T14:36:24Z)
- Practical Exposure Correction: Great Truths Are Always Simple [65.82019845544869]
We establish a Practical Exposure Corrector (PEC) that assembles the characteristics of efficiency and performance.
We introduce an exposure adversarial function as the key engine to fully extract valuable information from the observation.
Our experiments fully reveal the superiority of our proposed PEC.
arXiv Detail & Related papers (2022-12-29T09:52:13Z)
- PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation [40.50255017107963]
We propose Pose Transformation (PoseTrans) to create new training samples that have diverse poses.
We also propose Pose Clustering Module (PCM) to measure the pose rarity and select the "rarest" poses to help balance the long-tailed distribution.
Our method is efficient and simple to implement, which can be easily integrated into the training pipeline of existing pose estimation models.
arXiv Detail & Related papers (2022-08-16T14:03:01Z)
- Category-Level Pose Retrieval with Contrastive Features Learnt with Occlusion Augmentation [31.73423009695285]
We propose an approach to category-level pose estimation using a contrastive loss with a dynamic margin and a continuous pose-label space.
Our approach achieves state-of-the-art performance on PASCAL3D and OccludedPASCAL3D, as well as high-quality results on KITTI3D.
arXiv Detail & Related papers (2022-08-12T10:04:08Z)
- Adversarial Motion Modelling helps Semi-supervised Hand Pose Estimation [116.07661813869196]
We propose to combine ideas from adversarial training and motion modelling to tap into unlabeled videos.
We show that an adversarial objective leads to better properties of the hand pose estimator via semi-supervised training on unlabeled video sequences.
The main advantage of our approach is that we can make use of unpaired videos and joint sequence data, both of which are much easier to obtain than paired training data.
arXiv Detail & Related papers (2021-06-10T17:50:19Z)
- Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z)
- CharacterGAN: Few-Shot Keypoint Character Animation and Reposing [64.19520387536741]
We introduce CharacterGAN, a generative model that can be trained on only a few samples of a given character.
Our model generates novel poses based on keypoint locations, which can be modified in real time while providing interactive feedback.
We show that our approach outperforms recent baselines and creates realistic animations for diverse characters.
arXiv Detail & Related papers (2021-02-05T12:38:15Z)
- Selective Spatio-Temporal Aggregation Based Pose Refinement System: Towards Understanding Human Activities in Real-World Videos [8.571131862820833]
State-of-the-art pose estimators struggle to obtain high-quality 2D or 3D pose data due to truncation and low resolution in real-world, un-annotated videos.
We propose a Selective Spatio-Temporal Aggregation mechanism, named SST-A, that refines and smooths the keypoint locations extracted by multiple expert pose estimators.
We demonstrate that the skeleton data refined by our Pose-Refinement system (SSTA-PRS) is effective at boosting various existing action recognition models.
arXiv Detail & Related papers (2020-11-10T19:19:51Z)
- Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
arXiv Detail & Related papers (2020-04-28T12:03:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.