Generative Modeling of Shape-Dependent Self-Contact Human Poses
- URL: http://arxiv.org/abs/2509.23393v1
- Date: Sat, 27 Sep 2025 16:26:38 GMT
- Title: Generative Modeling of Shape-Dependent Self-Contact Human Poses
- Authors: Takehiko Ohkawa, Jihyun Lee, Shunsuke Saito, Jason Saragih, Fabian Prado, Yichen Xu, Shoou-I Yu, Ryosuke Furuta, Yoichi Sato, Takaaki Shiratori
- Abstract summary: Despite its relevance, existing self-contact datasets lack variety of self-contact poses and precise body shapes. We introduce the first extensive self-contact dataset with precise body shape registration, Goliath-SC, consisting of 383K self-contact poses across 130 subjects. We propose generative modeling of self-contact prior conditioned by body shape parameters, based on a body-part-wise latent diffusion with self-attention.
- Score: 48.30189394803952
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: One can hardly model self-contact of human poses without considering underlying body shapes. For example, the pose of rubbing a belly for a person with a low BMI leads to penetration of the hand into the belly for a person with a high BMI. Despite its relevance, existing self-contact datasets lack the variety of self-contact poses and precise body shapes, limiting conclusive analysis between self-contact poses and shapes. To address this, we begin by introducing the first extensive self-contact dataset with precise body shape registration, Goliath-SC, consisting of 383K self-contact poses across 130 subjects. Using this dataset, we propose generative modeling of self-contact prior conditioned by body shape parameters, based on a body-part-wise latent diffusion with self-attention. We further incorporate this prior into single-view human pose estimation while refining estimated poses to be in contact. Our experiments suggest that shape conditioning is vital to the successful modeling of self-contact pose distribution, hence improving single-view pose estimation in self-contact.
Related papers
- ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling [43.66748605071065]
We present ATLAS, a high-fidelity body model learned from 600k high-resolution scans captured using 240 synchronized cameras. We explicitly decouple the shape and skeleton bases by grounding our mesh representation in the human skeleton. ATLAS outperforms existing methods by fitting unseen subjects in diverse poses more accurately, and quantitative evaluations show that our non-linear pose correctives more effectively capture complex poses compared to linear models.
arXiv Detail & Related papers (2025-08-21T17:58:56Z)
- DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior [82.9526308672547]
We present DPoser-X, a diffusion-based prior model for 3D whole-body human poses. Our approach unifies various pose-centric tasks as inverse problems, solving them through variational diffusion sampling. Our model consistently outperforms state-of-the-art alternatives, establishing a new benchmark for whole-body human pose prior modeling.
arXiv Detail & Related papers (2025-08-01T12:56:39Z)
- Whole-Body Conditioned Egocentric Video Prediction [98.94980209293776]
We train models to Predict Ego-centric Video from human Actions (PEVA). By conditioning on kinematic pose trajectories, structured by the joint hierarchy of the body, our model learns to simulate how physical human actions shape the environment from a first-person point of view. Our work represents an initial attempt to tackle the challenges of modeling complex real-world environments and embodied agent behaviors with video prediction from the perspective of a human.
arXiv Detail & Related papers (2025-06-26T17:59:59Z)
- Leveraging Anthropometric Measurements to Improve Human Mesh Estimation and Ensure Consistent Body Shapes [12.932412290302258]
We find that SOTA 3D human pose estimation (HPE) models outperform HME models regarding the precision of the estimated 3D keypoint positions. We create a model called A2B that converts given anthropometric measurements to basic body shape parameters of human mesh models.
arXiv Detail & Related papers (2024-09-26T09:30:37Z)
- Pose Priors from Language Models [74.61186408764559]
Language is often used to describe physical interaction, yet most 3D human pose estimation methods overlook this rich source of information. We bridge this gap by leveraging large multimodal models (LMMs) as priors for reconstructing contact poses.
arXiv Detail & Related papers (2024-05-06T17:59:36Z)
- Neural-ABC: Neural Parametric Models for Articulated Body with Clothes [29.04941764336255]
We introduce Neural-ABC, a novel model that can represent clothed human bodies with disentangled latent spaces for identity, clothing, shape, and pose.
Our model excels at disentangling clothing and identity across different shapes and poses while preserving the style of the clothing.
Compared to other state-of-the-art parametric models, Neural-ABC demonstrates powerful advantages in the reconstruction of clothed human bodies.
arXiv Detail & Related papers (2024-04-06T16:29:10Z)
- ALiSNet: Accurate and Lightweight Human Segmentation Network for Fashion E-Commerce [57.876602177247534]
Smartphones provide a convenient way for users to capture images of their body.
We create a new segmentation model by simplifying Semantic FPN with PointRend.
We finetune this model on a high-quality dataset of humans in a restricted set of poses relevant for our application.
arXiv Detail & Related papers (2023-04-15T11:06:32Z)
- Single-view 3D Body and Cloth Reconstruction under Complex Poses [37.86174829271747]
We extend existing implicit function-based models to deal with images of humans with arbitrary poses and self-occluded limbs.
We learn an implicit function that maps the input image to a 3D body shape with a low level of detail.
We then learn a displacement map, conditioned on the smoothed surface, which encodes the high-frequency details of the clothes and body.
arXiv Detail & Related papers (2022-05-09T07:34:06Z)
- Imposing Temporal Consistency on Deep Monocular Body Shape and Pose Estimation [67.23327074124855]
This paper presents an elegant solution for the integration of temporal constraints in the fitting process.
We derive parameters of a sequence of body models, representing shape and motion of a person, including jaw poses, facial expressions, and finger poses.
Our approach enables the derivation of realistic 3D body models from image sequences, including facial expression and articulated hands.
arXiv Detail & Related papers (2022-02-07T11:11:55Z)
- Estimating Egocentric 3D Human Pose in the Wild with External Weak Supervision [72.36132924512299]
We present a new egocentric pose estimation method, which can be trained on a large-scale in-the-wild egocentric dataset.
We propose a novel learning strategy to supervise the egocentric features with the high-quality features extracted by a pretrained external-view pose estimation model.
Experiments show that our method predicts accurate 3D poses from a single in-the-wild egocentric image and outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2022-01-20T00:45:13Z)
- Learning Complex 3D Human Self-Contact [33.83748199524761]
Existing 3D reconstruction methods do not focus on body regions in self-contact.
We develop a model for Self-Contact Prediction that estimates the body surface signature of self-contact.
We show how more expressive 3D reconstructions can be recovered under self-contact signature constraints.
arXiv Detail & Related papers (2020-12-18T17:09:34Z)