Extending 3D body pose estimation for robotic-assistive therapies of
autistic children
- URL: http://arxiv.org/abs/2402.08006v1
- Date: Mon, 12 Feb 2024 19:11:03 GMT
- Title: Extending 3D body pose estimation for robotic-assistive therapies of
autistic children
- Authors: Laura Santos, Bernardo Carvalho, Catarina Barata, José Santos-Victor
- Abstract summary: We develop a 3D pose estimator for children with Autism.
Our method has an error below $0.3m$, which is considered acceptable for this kind of application.
In real-world settings, the proposed model performs similarly to a Kinect depth camera.
- Score: 4.751886527142779
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robotic-assistive therapy has demonstrated very encouraging results for
children with Autism. Accurate estimation of the child's pose is essential both
for human-robot interaction and for therapy assessment purposes. Non-intrusive
methods are the sole viable option since these children are sensitive to touch.
While depth cameras have been used extensively, existing methods face two
major limitations: (i) they are usually trained with adult-only data and do not
correctly estimate a child's pose, and (ii) they fail in scenarios with a high
number of occlusions. Therefore, our goal was to develop a 3D pose estimator
for children, by adapting an existing state-of-the-art 3D body modelling method
and incorporating a linear regression model to fine-tune one of its inputs,
thereby correcting the pose of children's 3D meshes.
In controlled settings, our method has an error below $0.3m$, which is
considered acceptable for this kind of application and lower than current
state-of-the-art methods. In real-world settings, the proposed model performs
similarly to a Kinect depth camera and manages to successfully estimate the 3D
body poses in a much higher number of frames.
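The correction described in the abstract can be sketched as a simple linear regression that maps the adult-trained estimator's input parameters to child-corrected values. The data, the parameter vector, and its dimensionality below are all illustrative assumptions, not the paper's actual setup:

```python
import numpy as np

# Hypothetical setup: the pose pipeline produces a per-frame parameter vector
# (e.g. a body-shape/scale input) tuned for adults; we fit a linear map that
# corrects it toward child ground truth. All data here are synthetic.
rng = np.random.default_rng(0)

n_frames, n_params = 200, 10
adult_params = rng.normal(size=(n_frames, n_params))   # estimator outputs
true_A = 0.8 * np.eye(n_params)                        # unknown correction
true_b = 0.1 * np.ones(n_params)
child_params = adult_params @ true_A + true_b          # child ground truth

# Fit the linear regression by least squares, with a bias column appended.
X = np.hstack([adult_params, np.ones((n_frames, 1))])
W, *_ = np.linalg.lstsq(X, child_params, rcond=None)

def correct(params):
    """Apply the learned linear correction to new estimator outputs."""
    return np.hstack([params, np.ones((len(params), 1))]) @ W

# On held-out synthetic frames, the corrected output matches the target.
test_params = rng.normal(size=(5, n_params))
err = np.abs(correct(test_params) - (test_params @ true_A + true_b)).max()
print(f"max correction error: {err:.2e}")
```

In practice the regression would be fitted on frames with known child ground truth (e.g. from a depth camera) and then applied to correct the body-model input at inference time.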
Related papers
- Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data Anonymization [30.818455306299455]
AionHMR is a comprehensive framework designed to bridge the 3D shape and pose estimation domain gap.
We propose an optimization-based method that extends a top-performing model by incorporating the SMPL-A body model.
We then developed and trained a specialized transformer-based deep learning model capable of real-time 3D age-inclusive human reconstruction.
arXiv Detail & Related papers (2025-12-04T21:23:04Z)
- Efficient Domain Adaptation via Generative Prior for 3D Infant Pose Estimation [29.037799937729687]
3D human pose estimation has advanced impressively in recent years, but only a few works focus on infants, who have different bone lengths and for whom only limited data are available.
Here, we show that our model attains state-of-the-art MPJPE performance of 43.6 mm on the SyRIP dataset and 21.2 mm on the MINI-RGBD dataset.
We also show that our method, ZeDO-i, achieves efficient domain adaptation even when only a small amount of data is available.
arXiv Detail & Related papers (2023-11-17T20:49:37Z)
- PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision [102.48681650013698]
Existing self-supervised 3D human pose estimation schemes have largely relied on weak supervisions to guide the learning.
We propose a novel self-supervised approach that allows us to explicitly generate 2D-3D pose pairs for augmenting supervision.
This is made possible via introducing a reinforcement-learning-based imitator, which is learned jointly with a pose estimator alongside a pose hallucinator.
arXiv Detail & Related papers (2022-03-29T14:45:53Z)
- PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only.
PONet estimates the 3D orientation of the body limbs by taking advantage of local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z)
- AGORA: Avatars in Geography Optimized for Regression Analysis [35.22486186509372]
AGORA is a synthetic dataset with high realism and highly accurate ground truth.
We create reference 3D poses and body shapes by fitting the SMPL-X body model (with face and hands) to the 3D scans.
We evaluate existing state-of-the-art methods for 3D human pose estimation on this dataset and find that most methods perform poorly on images of children.
arXiv Detail & Related papers (2021-04-29T20:33:25Z)
- Residual Pose: A Decoupled Approach for Depth-based 3D Human Pose Estimation [18.103595280706593]
We leverage recent advances in reliable CNN-based 2D pose estimation to estimate the 3D pose of people from depth images.
Our approach achieves very competitive results both in accuracy and speed on two public datasets.
arXiv Detail & Related papers (2020-11-10T10:08:13Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
- Invariant Representation Learning for Infant Pose Estimation with Small Data [14.91506452479778]
We release a hybrid synthetic and real infant pose dataset with small yet diverse real images as well as generated synthetic infant poses.
In our ablation study, with identical network structure, models trained on the SyRIP dataset show noticeable improvement over those trained on the only other public infant pose dataset.
Our best infant pose estimator, built on the state-of-the-art DarkPose model, achieves a mean average precision (mAP) of 93.6.
arXiv Detail & Related papers (2020-10-13T01:10:14Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people from an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)
- Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild [101.70320427145388]
We propose a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data.
We evaluate our proposed approach on two large-scale datasets.
arXiv Detail & Related papers (2020-03-17T08:47:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.