Multi-view Human Pose and Shape Estimation Using Learnable Volumetric
Aggregation
- URL: http://arxiv.org/abs/2011.13427v1
- Date: Thu, 26 Nov 2020 18:33:35 GMT
- Authors: Soyong Shin, Eni Halilaj
- Abstract summary: We propose a learnable aggregation approach to reconstruct 3D human body pose and shape from calibrated multi-view images.
Compared to previous approaches, our framework shows higher accuracy and greater promise for real-time prediction, given its cost efficiency.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human pose and shape estimation from RGB images is a highly sought-after
alternative to marker-based motion capture, which is laborious, requires
expensive equipment, and constrains capture to laboratory environments.
Monocular vision-based algorithms, however, still suffer from rotational
ambiguities and are not ready for translation in healthcare applications, where
high accuracy is paramount. While fusion of data from multiple viewpoints could
overcome these challenges, current algorithms require further improvement to
obtain clinically acceptable accuracies. In this paper, we propose a learnable
volumetric aggregation approach to reconstruct 3D human body pose and shape
from calibrated multi-view images. We use a parametric representation of the
human body, which makes our approach directly applicable to medical
applications. Compared to previous approaches, our framework shows higher
accuracy and greater promise for real-time prediction, given its cost
efficiency.
Related papers
- Enhancing Quantitative Image Synthesis through Pretraining and Resolution Scaling for Bone Mineral Density Estimation from a Plain X-ray Image [7.832005676209272]
This research aims to improve quantitative image synthesis (QIS) by exploring pretraining and image resolution scaling.
We propose a benchmark for evaluating pretraining performance using the task of QIS-based bone mineral density estimation from plain X-ray images.
arXiv Detail & Related papers (2024-07-30T01:39:30Z)
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- ShaRPy: Shape Reconstruction and Hand Pose Estimation from RGB-D with Uncertainty [6.559796851992517]
We propose ShaRPy, the first RGB-D Shape Reconstruction and hand Pose tracking system.
ShaRPy approximates a personalized hand shape, promoting a more realistic and intuitive understanding of its digital twin.
We evaluate ShaRPy on a keypoint detection benchmark and show qualitative results of hand function assessments for activity monitoring of musculoskeletal diseases.
arXiv Detail & Related papers (2023-03-17T15:12:25Z)
- Localizing Scan Targets from Human Pose for Autonomous Lung Ultrasound Imaging [61.60067283680348]
With the advent of the COVID-19 global pandemic, there is a need to fully automate ultrasound imaging.
We propose a vision-based, data-driven method that incorporates learning-based computer vision techniques.
Our method attains an accuracy level of 15.52 (9.47) mm for probe positioning and 4.32 (3.69) deg for probe orientation, with a success rate above 80% under an error threshold of 25 mm for all scan targets.
arXiv Detail & Related papers (2022-12-15T14:34:12Z)
- Direct Dense Pose Estimation [138.56533828316833]
Dense human pose estimation is the problem of learning dense correspondences between RGB images and the surfaces of human bodies.
Prior dense pose estimation methods are all based on the Mask R-CNN framework and operate in a top-down manner, first attempting to identify a bounding box for each person.
We propose a novel alternative method for solving the dense pose estimation problem, called Direct Dense Pose (DDP).
arXiv Detail & Related papers (2022-04-04T06:14:38Z)
- LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [78.17425779503047]
We propose a novel neural implicit representation for the human body.
It is fully differentiable and optimizable with disentangled shape and pose latent spaces.
Our model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
arXiv Detail & Related papers (2021-11-30T04:10:57Z)
- Learning stochastic object models from medical imaging measurements by use of advanced AmbientGANs [7.987904193401004]
Deep generative neural networks, such as generative adversarial networks (GANs), hold potential for such tasks.
In this work, a modified AmbientGAN training strategy is proposed that is suitable for modern progressive or multi-resolution training approaches.
arXiv Detail & Related papers (2021-06-27T21:46:23Z)
- 3D Human Body Reshaping with Anthropometric Modeling [59.51820187982793]
Reshaping accurate and realistic 3D human bodies from anthropometric parameters poses a fundamental challenge for person identification, online shopping and virtual reality.
Existing approaches for creating such 3D shapes often suffer from complex measurement by range cameras or high-end scanners.
This paper proposes a novel feature-selection-based local mapping technique, which enables automatic anthropometric parameter modeling for each body facet.
arXiv Detail & Related papers (2021-04-05T04:09:39Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
- Learning stochastic object models from medical imaging measurements using Progressively-Growing AmbientGANs [14.501812971529137]
An important source of variability that can significantly limit observer performance is variation in the objects to-be-imaged.
It is desirable to establish SOMs from experimental imaging measurements acquired by use of a well-characterized imaging system.
Deep generative neural networks, such as generative adversarial networks (GANs), hold great potential for this task.
arXiv Detail & Related papers (2020-05-29T18:45:37Z)