Related papers: MonoClothCap: Towards Temporally Coherent Clothing Capture from Monocular RGB Video

URL: http://arxiv.org/abs/2009.10711v2
Date: Mon, 23 Nov 2020 16:23:04 GMT
Title: MonoClothCap: Towards Temporally Coherent Clothing Capture from Monocular RGB Video
Authors: Donglai Xiang, Fabian Prada, Chenglei Wu, Jessica Hodgins
Abstract summary: We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video input. We build statistical deformation models for three types of clothing: T-shirt, short pants and long pants. Our method produces temporally coherent reconstruction of body and clothing from monocular video.
Score: 10.679773937444445
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video input. In contrast to the existing literature, our method does not require a pre-scanned personalized mesh template, and thus can be applied to in-the-wild videos. To constrain the output to a valid deformation space, we build statistical deformation models for three types of clothing: T-shirt, short pants and long pants. A differentiable renderer is utilized to align our captured shapes to the input frames by minimizing the difference in silhouette, segmentation, and texture. We develop a UV texture growing method which expands the visible texture region of the clothing sequentially in order to minimize drift in deformation tracking. We also extract fine-grained wrinkle detail from the input videos by fitting the clothed surface to the normal maps estimated by a convolutional neural network. Our method produces temporally coherent reconstruction of body and clothing from monocular video. We demonstrate successful clothing capture results from a variety of challenging videos. Extensive quantitative experiments demonstrate the effectiveness of our method on metrics including body pose error and surface reconstruction error of the clothing.
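
The UV texture growing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only states that the visible texture region is expanded sequentially, so the function name `grow_texture`, the array layout, and the rule that already-filled texels are never overwritten are all assumptions made for illustration.

```python
import numpy as np

def grow_texture(texture, filled, frame_texture, frame_visible):
    """Merge one frame's newly visible texels into an accumulated UV texture.

    texture:       (H, W, 3) float array, accumulated UV texture so far
    filled:        (H, W) bool array, texels that already hold a colour
    frame_texture: (H, W, 3) float array, colours unprojected from this frame
    frame_visible: (H, W) bool array, texels visible in this frame
    """
    # Assumption: only texels seen for the first time are written, so the
    # reference texture used for tracking stays fixed once it is set.
    new = frame_visible & ~filled
    texture = texture.copy()
    texture[new] = frame_texture[new]
    return texture, filled | new

# Toy 2x2 UV map processed over two frames (hypothetical data).
texture = np.zeros((2, 2, 3))
filled = np.zeros((2, 2), dtype=bool)

frame1 = np.full((2, 2, 3), 0.5)
visible1 = np.array([[True, False], [False, False]])
frame2 = np.full((2, 2, 3), 0.9)
visible2 = np.array([[True, True], [False, False]])

texture, filled = grow_texture(texture, filled, frame1, visible1)
texture, filled = grow_texture(texture, filled, frame2, visible2)
```

In this toy run the texel at (0, 0) keeps the colour from the first frame, while (0, 1) is filled from the second frame; keeping earlier texels fixed is one plausible way to limit drift, but the paper may handle this differently.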

Related papers

3DV-TON: Textured 3D-Guided Consistent Video Try-on via Diffusion Models [12.949009540192389]
3DV-TON is a novel framework for generating high-fidelity and temporally consistent video try-on results. Our approach employs generated animatable textured 3D meshes as explicit frame-level guidance. To advance video try-on research, we introduce HR-VVT, a high-resolution benchmark dataset containing 130 videos with diverse clothing types and scenarios.
arXiv Detail & Related papers (2025-04-24T10:12:40Z)
Shape-Guided Clothing Warping for Virtual Try-On [6.750870148213539]
Image-based virtual try-on aims to seamlessly fit in-shop clothing to a person image. We propose a novel shape-guided clothing warping method for virtual try-on, dubbed SCW-VTON.
arXiv Detail & Related papers (2025-04-21T17:08:36Z)
DressRecon: Freeform 4D Human Reconstruction from Monocular Video [64.61230035671885]
We present a method to reconstruct time-consistent human body models from monocular videos. We focus on extremely loose clothing or handheld object interactions. DressRecon yields higher-fidelity 3D reconstructions than prior art.
arXiv Detail & Related papers (2024-09-30T17:59:15Z)
ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild [33.7726643918619]
ReLoo reconstructs high-quality 3D models of humans dressed in loose garments from monocular in-the-wild videos. We first establish a layered neural human representation that decomposes clothed humans into a neural inner body and outer clothing. A global optimization jointly optimizes the shape, appearance, and deformations of the human body and clothing via multi-layer differentiable volume rendering.
arXiv Detail & Related papers (2024-09-23T17:58:39Z)
WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models [132.77237314239025]
Video virtual try-on aims to generate realistic sequences that maintain garment identity and adapt to a person's pose and body shape in source videos. Traditional image-based methods, relying on warping and blending, struggle with complex human movements and occlusions. We reconceptualize video try-on as a process of generating videos conditioned on garment descriptions and human motion. Our solution, WildVidFit, employs image-based controlled diffusion models for a streamlined, one-stage approach.
arXiv Detail & Related papers (2024-07-15T11:21:03Z)
VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation [79.99551055245071]
We propose VividPose, an end-to-end pipeline that ensures superior temporal stability. An identity-aware appearance controller integrates additional facial information without compromising other appearance details. A geometry-aware pose controller utilizes both dense rendering maps from SMPL-X and sparse skeleton maps. VividPose exhibits superior generalization capabilities on our proposed in-the-wild dataset.
arXiv Detail & Related papers (2024-05-28T13:18:32Z)
AniDress: Animatable Loose-Dressed Avatar from Sparse Views Using Garment Rigging Model [58.035758145894846]
We introduce AniDress, a novel method for generating animatable human avatars in loose clothes using very sparse multi-view videos. A pose-driven deformable neural radiance field conditioned on both body and garment motions is introduced, providing explicit control of both parts. Our method is able to render natural garment dynamics that deviate highly from the body and to generalize well to both unseen views and poses.
arXiv Detail & Related papers (2024-01-27T08:48:18Z)
High-Quality Animatable Dynamic Garment Reconstruction from Monocular Videos [51.8323369577494]
We propose the first method to recover high-quality animatable dynamic garments from monocular videos without depending on scanned data. To generate reasonable deformations for various unseen poses, we propose a learnable garment deformation network. We show that our method can reconstruct high-quality dynamic garments with coherent surface details, which can be easily animated under unseen poses.
arXiv Detail & Related papers (2023-11-02T13:16:27Z)
REC-MV: REconstructing 3D Dynamic Cloth from Monocular Videos [23.25620556096607]
Reconstructing dynamic 3D garment surfaces with open boundaries from monocular videos is an important problem. We introduce a novel approach, called REC-MV, to jointly optimize the explicit feature curves and the implicit signed distance field. Our approach outperforms existing methods and can produce high-quality dynamic garment surfaces.
arXiv Detail & Related papers (2023-05-23T16:53:10Z)
PERGAMO: Personalized 3D Garments from Monocular Video [6.8338761008826445]
PERGAMO is a data-driven approach to learn a deformable model for 3D garments from monocular images. We first introduce a novel method to reconstruct the 3D geometry of garments from a single image, and use it to build a dataset of clothing from monocular videos. We show that our method is capable of producing garment animations that match the real-world behaviour, and generalizes to unseen body motions extracted from a motion capture dataset.
arXiv Detail & Related papers (2022-10-26T21:15:54Z)
MulayCap: Multi-layer Human Performance Capture Using A Monocular Video Camera [68.51530260071914]
We introduce MulayCap, a novel human performance capture method using a monocular video camera without the need for pre-scanning. The method uses "multi-layer" representations for geometry reconstruction and texture rendering, respectively. MulayCap can be applied to various important editing applications, such as cloth editing, re-targeting, relighting, and AR applications.
arXiv Detail & Related papers (2020-04-13T08:13:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.