Monocular Expressive Body Regression through Body-Driven Attention
- URL: http://arxiv.org/abs/2008.09062v1
- Date: Thu, 20 Aug 2020 16:33:47 GMT
- Title: Monocular Expressive Body Regression through Body-Driven Attention
- Authors: Vasileios Choutas, Georgios Pavlakos, Timo Bolkart, Dimitrios Tzionas,
Michael J. Black
- Abstract summary: We introduce ExPose, which regresses the body, face, and hands, in SMPL-X format, from an RGB image.
Hands and faces are much smaller than the body, occupying very few image pixels.
We observe that body estimation localizes the face and hands reasonably well.
- Score: 68.63766976089842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To understand how people look, interact, or perform tasks, we need to quickly
and accurately capture their 3D body, face, and hands together from an RGB
image. Most existing methods focus only on parts of the body. A few recent
approaches reconstruct full expressive 3D humans from images using 3D body
models that include the face and hands. These methods are optimization-based
and thus slow, prone to local optima, and require 2D keypoints as input. We
address these limitations by introducing ExPose (EXpressive POse and Shape
rEgression), which directly regresses the body, face, and hands, in SMPL-X
format, from an RGB image. This is a hard problem due to the high
dimensionality of the body and the lack of expressive training data.
Additionally, hands and faces are much smaller than the body, occupying very
few image pixels. This makes hand and face estimation hard when body images are
downscaled for neural networks. We make three main contributions. First, we
account for the lack of training data by curating a dataset of SMPL-X fits on
in-the-wild images. Second, we observe that body estimation localizes the face
and hands reasonably well. We introduce body-driven attention for face and hand
regions in the original image to extract higher-resolution crops that are fed
to dedicated refinement modules. Third, these modules exploit part-specific
knowledge from existing face- and hand-only datasets. ExPose estimates
expressive 3D humans more accurately than existing optimization methods at a
small fraction of the computational cost. Our data, model and code are
available for research at https://expose.is.tue.mpg.de .
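The body-driven attention idea in the abstract (use the coarse body estimate to locate a hand or face, then crop that region from the original, full-resolution image) can be sketched as below. This is a minimal illustration and not ExPose's actual implementation: the function names, the padding factor, and the assumption that the network input is a uniformly downscaled copy of the original image are all hypothetical choices made for the example.

```python
import numpy as np


def part_box_in_original(keypoints_lowres, scale, image_hw, pad=1.5):
    """Map part keypoints from the downscaled image to a padded square box
    in original-image pixel coordinates.

    keypoints_lowres: (N, 2) array of (x, y) positions in the downscaled image.
    scale: original size / downscaled size (assumed uniform, hypothetical).
    image_hw: (height, width) of the original image.
    pad: enlargement factor applied to the tight keypoint extent (assumed).
    """
    pts = np.asarray(keypoints_lowres, dtype=float) * scale
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    center = (lo + hi) / 2.0
    half = pad * float((hi - lo).max()) / 2.0
    height, width = image_hw
    # Clamp the box to the image bounds.
    x0 = int(max(0, np.floor(center[0] - half)))
    y0 = int(max(0, np.floor(center[1] - half)))
    x1 = int(min(width, np.ceil(center[0] + half)))
    y1 = int(min(height, np.ceil(center[1] + half)))
    return x0, y0, x1, y1


def crop_part(image, box):
    """Cut the box out of the original (high-resolution) image."""
    x0, y0, x1, y1 = box
    return image[y0:y1, x0:x1]


# A 1024x1024 image downscaled to 256x256 for the body network (scale = 4).
original = np.zeros((1024, 1024, 3), dtype=np.uint8)
hand_keypoints_lowres = [[200, 120], [210, 130], [205, 125]]  # spans 10 px

box = part_box_in_original(hand_keypoints_lowres, scale=4.0,
                           image_hw=original.shape[:2])
hand_crop = crop_part(original, box)
print(box, hand_crop.shape)
```

In this toy case the hand spans only about 10 pixels in the downscaled input, while the crop taken from the original image is 60 pixels wide, which is the kind of resolution gain a dedicated hand or face refinement module could exploit.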
Related papers
- Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot [22.848563931757962]
We present Multi-HMR, a strong single-shot model for multi-person 3D human mesh recovery from a single RGB image.
Predictions encompass the whole body, including hands and facial expressions, using the SMPL-X parametric model.
We show that incorporating it into the training data further enhances predictions, particularly for hands.
arXiv Detail & Related papers (2024-02-22T16:05:13Z)
- ECON: Explicit Clothed humans Optimized via Normal integration [54.51948104460489]
We present ECON, a method for creating 3D humans in loose clothes.
It infers detailed 2D maps for the front and back side of a clothed person.
It "inpaints" the missing geometry between d-BiNI surfaces.
arXiv Detail & Related papers (2022-12-14T18:59:19Z)
- Learning Visibility for Robust Dense Human Body Estimation [78.37389398573882]
Estimating 3D human pose and shape from 2D images is a crucial yet challenging task.
We learn dense human body estimation that is robust to partial observations.
We obtain pseudo ground-truths of visibility labels from dense UV correspondences and train a neural network to predict visibility along with 3D coordinates.
arXiv Detail & Related papers (2022-08-23T00:01:05Z)
- Accurate 3D Body Shape Regression using Metric and Semantic Attributes [55.58629009876271]
This is the first demonstration that 3D body shape regression from images can be trained from easy-to-obtain anthropometric measurements and linguistic shape attributes.
arXiv Detail & Related papers (2022-06-14T17:54:49Z)
- Single-view 3D Body and Cloth Reconstruction under Complex Poses [37.86174829271747]
We extend existing implicit function-based models to deal with images of humans with arbitrary poses and self-occluded limbs.
We learn an implicit function that maps the input image to a 3D body shape with a low level of detail.
We then learn a displacement map, conditioned on the smoothed surface, which encodes the high-frequency details of the clothes and body.
arXiv Detail & Related papers (2022-05-09T07:34:06Z)
- Collaborative Regression of Expressive Bodies using Moderation [54.730550151409474]
Methods that estimate 3D bodies, faces, or hands have progressed significantly, yet separately.
We introduce PIXIE, which produces animatable, whole-body 3D avatars from a single image.
We label training images as male, female, or non-binary, and train PIXIE to infer "gendered" 3D body shapes with a novel shape loss.
arXiv Detail & Related papers (2021-05-11T18:55:59Z)
- Real-time RGBD-based Extended Body Pose Estimation [57.61868412206493]
We present a system for real-time RGBD-based estimation of 3D human pose.
We use a parametric 3D deformable human mesh model (SMPL-X) as the representation.
We train estimators of body pose and facial expression parameters.
arXiv Detail & Related papers (2021-03-05T13:37:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.