Simultaneously-Collected Multimodal Lying Pose Dataset: Towards In-Bed
Human Pose Monitoring under Adverse Vision Conditions
- URL: http://arxiv.org/abs/2008.08735v1
- Date: Thu, 20 Aug 2020 02:20:35 GMT
- Authors: Shuangjun Liu, Xiaofei Huang, Nihang Fu, Cheng Li, Zhongnan Su, and
Sarah Ostadabbas
- Abstract summary: In-bed human pose estimation has significant values in many healthcare applications.
In this paper, we introduce our Simultaneously-collected multimodal Lying Pose dataset.
We show that state-of-the-art 2D pose estimation models can be trained effectively with SLP data with promising performance as high as 95% at PCKh@0.5 on a single modality.
- Score: 15.12849597272402
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computer vision (CV) has achieved great success in interpreting semantic
meanings from images, yet CV algorithms can be brittle for tasks with adverse
vision conditions and for tasks suffering from limited data/label pairs. One
of these tasks is in-bed human pose estimation, which has significant value in
many healthcare applications. In-bed pose monitoring in natural settings can
involve complete darkness or full occlusion. Furthermore, the lack of publicly
available in-bed pose datasets hinders the use of many successful pose
estimation algorithms for this task. In this paper, we introduce our
Simultaneously-collected multimodal Lying Pose (SLP) dataset, which includes
in-bed pose images from 109 participants captured using multiple imaging
modalities: RGB, long-wave infrared, depth, and pressure map. We also
present a physical hyperparameter tuning strategy for ground-truth pose label
generation under extreme conditions such as lights off and full coverage by a
sheet/blanket. The SLP design is compatible with mainstream human pose
datasets; therefore, state-of-the-art 2D pose estimation models can be
trained effectively with SLP data, with promising performance as high as 95% at
PCKh@0.5 on a single modality. The pose estimation performance can be further
improved by including additional modalities through collaboration.
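For reference, PCKh@0.5 counts a predicted joint as correct when it lies within 0.5 times the head-segment length of its ground-truth position, averaged over joints and samples. A minimal sketch of this metric (the array shapes and function name here are illustrative, not from the paper's code):

```python
import numpy as np

def pckh(pred, gt, head_sizes, thresh=0.5):
    """PCKh: fraction of predicted joints within thresh * head size of ground truth.

    pred, gt: (N, J, 2) arrays of 2D joint coordinates.
    head_sizes: (N,) per-sample head-segment lengths in pixels.
    """
    dists = np.linalg.norm(pred - gt, axis=-1)         # (N, J) joint errors
    correct = dists <= thresh * head_sizes[:, None]    # per-joint hit/miss
    return correct.mean()

# toy example: two samples, three joints, every joint off by 1 px
gt = np.zeros((2, 3, 2))
pred = gt + np.array([1.0, 0.0])
head = np.array([4.0, 1.0])    # thresholds become 2.0 px and 0.5 px
print(pckh(pred, gt, head))    # 3 of 6 joints within threshold -> 0.5
```

A reported "95% at PCKh@0.5" thus means 95% of predicted joints fell within half a head length of their annotations.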
Related papers
- LWIRPOSE: A novel LWIR Thermal Image Dataset and Benchmark [9.679771580702258]
This dataset comprises over 2,400 high-quality LWIR (thermal) images.
Each image is meticulously annotated with 2D human poses, offering a valuable resource for researchers and practitioners.
We benchmark state-of-the-art pose estimation methods on the dataset to showcase its potential.
arXiv Detail & Related papers (2024-04-16T01:49:35Z)
- LiCamPose: Combining Multi-View LiDAR and RGB Cameras for Robust Single-frame 3D Human Pose Estimation [31.651300414497822]
LiCamPose is a pipeline that integrates multi-view RGB and sparse point cloud information to estimate robust 3D human poses from a single frame.
LiCamPose is evaluated on four datasets, including two public datasets, one synthetic dataset, and one challenging self-collected dataset.
arXiv Detail & Related papers (2023-12-11T14:30:11Z)
- UniHPE: Towards Unified Human Pose Estimation via Contrastive Learning [29.037799937729687]
2D and 3D Human Pose Estimation (HPE) are two critical perceptual tasks in computer vision.
We propose UniHPE, a unified Human Pose Estimation pipeline, which aligns features from all three modalities.
Our proposed method holds immense potential to advance the field of computer vision and contribute to various applications.
arXiv Detail & Related papers (2023-11-24T21:55:34Z)
- Under the Cover Infant Pose Estimation using Multimodal Data [0.0]
We present a novel dataset, Simultaneously-collected multimodal Mannequin Lying pose (SMaL) dataset, for under the cover infant pose estimation.
We successfully infer full body pose under the cover by training state-of-art pose estimation methods.
Our best-performing model detected joints under the cover to within 25mm in 86% of cases, with an overall mean error of 16.9mm.
arXiv Detail & Related papers (2022-10-03T00:34:45Z)
- Towards Accurate Cross-Domain In-Bed Human Pose Estimation [3.685548851716087]
Long-wavelength infrared (LWIR) modality-based pose estimation algorithms overcome these challenges.
We propose a novel learning strategy comprising two-fold data augmentation to reduce the cross-domain discrepancy.
Our experiments and analysis show the effectiveness of our approach over multiple standard human pose estimation baselines.
arXiv Detail & Related papers (2021-10-07T15:54:46Z)
- Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning pose dynamics that is independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z) - Deep Bingham Networks: Dealing with Uncertainty and Ambiguity in Pose
Estimation [74.76155168705975]
Deep Bingham Networks (DBN) can handle pose-related uncertainties and ambiguities arising in almost all real life applications concerning 3D data.
DBN extends state-of-the-art direct pose regression networks with a multi-hypothesis prediction head that can yield different distribution modes.
We propose new training strategies so as to avoid mode or posterior collapse during training and to improve numerical stability.
arXiv Detail & Related papers (2020-12-20T19:20:26Z) - AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in
the Wild [77.43884383743872]
We present AdaFuse, an adaptive multiview fusion method to enhance the features in occluded views.
We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic.
We also create a large-scale synthetic dataset, Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints.
arXiv Detail & Related papers (2020-10-26T03:19:46Z) - Kinematic-Structure-Preserved Representation for Unsupervised 3D Human
Pose Estimation [58.72192168935338]
Generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable.
We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework that does not rely on any paired or unpaired weak supervision.
Our proposed model employs three consecutive differentiable transformations: forward-kinematics, camera-projection, and spatial-map transformation.
arXiv Detail & Related papers (2020-06-24T23:56:33Z) - Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image
Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z) - Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the
Wild [101.70320427145388]
We propose a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data.
We evaluate our proposed approach on two large-scale datasets.
arXiv Detail & Related papers (2020-03-17T08:47:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.