MonoClothCap: Towards Temporally Coherent Clothing Capture from
Monocular RGB Video
- URL: http://arxiv.org/abs/2009.10711v2
- Date: Mon, 23 Nov 2020 16:23:04 GMT
- Title: MonoClothCap: Towards Temporally Coherent Clothing Capture from
Monocular RGB Video
- Authors: Donglai Xiang, Fabian Prada, Chenglei Wu, Jessica Hodgins
- Abstract summary: We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video input.
We build statistical deformation models for three types of clothing: T-shirt, short pants and long pants.
Our method produces temporally coherent reconstruction of body and clothing from monocular video.
- Score: 10.679773937444445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a method to capture temporally coherent dynamic clothing
deformation from a monocular RGB video input. In contrast to the existing
literature, our method does not require a pre-scanned personalized mesh
template, and thus can be applied to in-the-wild videos. To constrain the
output to a valid deformation space, we build statistical deformation models
for three types of clothing: T-shirt, short pants and long pants. A
differentiable renderer is utilized to align our captured shapes to the input
frames by minimizing the differences in silhouette, segmentation, and
texture. We develop a UV texture growing method which expands the visible
texture region of the clothing sequentially in order to minimize drift in
deformation tracking. We also extract fine-grained wrinkle detail from the
input videos by fitting the clothed surface to the normal maps estimated by a
convolutional neural network. Our method produces temporally coherent
reconstruction of body and clothing from monocular video. We demonstrate
successful clothing capture results from a variety of challenging videos.
Extensive quantitative experiments demonstrate the effectiveness of our method
on metrics including body pose error and surface reconstruction error of the
clothing.
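As a rough illustration of the image-space objective described above (not the authors' code), the sketch below sums per-frame silhouette, segmentation, texture, and normal-map terms as one might set them up with the output of a differentiable renderer in PyTorch. The dictionary keys, loss choices, and weights are assumptions for illustration only.
```python
import torch.nn.functional as F

def alignment_loss(rendered, target, weights=(1.0, 1.0, 1.0, 0.1)):
    """Hypothetical per-frame fitting objective.

    rendered / target: dicts of tensors produced by a differentiable
    renderer and extracted from the input frame, respectively.
    Assumed keys: 'silhouette' (B,1,H,W), 'segmentation' (B,C,H,W soft
    masks), 'texture' (B,3,H,W), 'normals' (B,3,H,W unit vectors).
    """
    w_sil, w_seg, w_tex, w_nrm = weights
    loss = (
        # Silhouette and segmentation alignment against the 2D observations.
        w_sil * F.mse_loss(rendered["silhouette"], target["silhouette"])
        + w_seg * F.l1_loss(rendered["segmentation"], target["segmentation"])
        # Texture term to reduce drift in deformation tracking.
        + w_tex * F.l1_loss(rendered["texture"], target["texture"])
        # Wrinkle detail: fit rendered normals to CNN-predicted normal maps.
        + w_nrm * (1.0 - F.cosine_similarity(rendered["normals"],
                                             target["normals"], dim=1).mean())
    )
    return loss
```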
Related papers
- DressRecon: Freeform 4D Human Reconstruction from Monocular Video [64.61230035671885]
We present a method to reconstruct time-consistent human body models from monocular videos.
We focus on extremely loose clothing or handheld object interactions.
DressRecon yields higher-fidelity 3D reconstructions than prior art.
arXiv Detail & Related papers (2024-09-30T17:59:15Z)
- ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild [33.7726643918619]
ReLoo reconstructs high-quality 3D models of humans dressed in loose garments from monocular in-the-wild videos.
We first establish a layered neural human representation that decomposes clothed humans into a neural inner body and outer clothing.
A global optimization jointly refines the shape, appearance, and deformations of the human body and clothing via multi-layer differentiable volume rendering.
arXiv Detail & Related papers (2024-09-23T17:58:39Z)
- WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models [132.77237314239025]
Video virtual try-on aims to generate realistic sequences that maintain garment identity and adapt to a person's pose and body shape in source videos.
Traditional image-based methods, relying on warping and blending, struggle with complex human movements and occlusions.
We reconceptualize video try-on as a process of generating videos conditioned on garment descriptions and human motion.
Our solution, WildVidFit, employs image-based controlled diffusion models for a streamlined, one-stage approach.
arXiv Detail & Related papers (2024-07-15T11:21:03Z)
- VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation [79.99551055245071]
We propose VividPose, an end-to-end pipeline that ensures superior temporal stability.
An identity-aware appearance controller integrates additional facial information without compromising other appearance details.
A geometry-aware pose controller utilizes both dense rendering maps from SMPL-X and sparse skeleton maps.
VividPose exhibits superior generalization capabilities on our proposed in-the-wild dataset.
arXiv Detail & Related papers (2024-05-28T13:18:32Z)
- AniDress: Animatable Loose-Dressed Avatar from Sparse Views Using Garment Rigging Model [58.035758145894846]
We introduce AniDress, a novel method for generating animatable human avatars in loose clothes using very sparse multi-view videos.
A pose-driven deformable neural radiance field conditioned on both body and garment motions is introduced, providing explicit control of both parts.
Our method renders natural garment dynamics that deviate strongly from the body and generalizes well to both unseen views and poses.
arXiv Detail & Related papers (2024-01-27T08:48:18Z)
- High-Quality Animatable Dynamic Garment Reconstruction from Monocular Videos [51.8323369577494]
We propose the first method to recover high-quality animatable dynamic garments from monocular videos without depending on scanned data.
To generate reasonable deformations for various unseen poses, we propose a learnable garment deformation network.
We show that our method can reconstruct high-quality dynamic garments with coherent surface details, which can be easily animated under unseen poses.
arXiv Detail & Related papers (2023-11-02T13:16:27Z)
- REC-MV: REconstructing 3D Dynamic Cloth from Monocular Videos [23.25620556096607]
Reconstructing dynamic 3D garment surfaces with open boundaries from monocular videos is an important problem.
We introduce a novel approach, called REC-MV, to jointly optimize the explicit feature curves and the implicit signed distance field.
Our approach outperforms existing methods and can produce high-quality dynamic garment surfaces.
arXiv Detail & Related papers (2023-05-23T16:53:10Z)
- PERGAMO: Personalized 3D Garments from Monocular Video [6.8338761008826445]
PERGAMO is a data-driven approach to learn a deformable model for 3D garments from monocular images.
We first introduce a novel method to reconstruct the 3D geometry of garments from a single image, and use it to build a dataset of clothing from monocular videos.
We show that our method is capable of producing garment animations that match real-world behaviour and generalizes to unseen body motions extracted from a motion capture dataset.
arXiv Detail & Related papers (2022-10-26T21:15:54Z)
- MulayCap: Multi-layer Human Performance Capture Using A Monocular Video Camera [68.51530260071914]
We introduce MulayCap, a novel human performance capture method using a monocular video camera without the need for pre-scanning.
The method uses separate "multi-layer" representations for geometry reconstruction and for texture rendering.
MulayCap can be applied to various important editing applications, such as cloth editing, re-targeting, relighting, and AR applications.
arXiv Detail & Related papers (2020-04-13T08:13:37Z)