Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning
- URL: http://arxiv.org/abs/2211.14052v1
- Date: Fri, 25 Nov 2022 12:16:21 GMT
- Title: Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning
- Authors: Zaiyu Huang, Hanhui Li, Zhenyu Xie, Michael Kampffmeyer, Qingling Cai, Xiaodan Liang
- Abstract summary: 3D-aware global correspondences are reliable flows that jointly encode global semantic correlations, local deformations, and geometric priors of 3D human bodies.
An adversarial generator takes the garment warped by the 3D-aware flow, and the image of the target person as inputs, to synthesize the photo-realistic try-on result.
- Score: 70.75369367311897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we target image-based person-to-person virtual try-on in the
presence of diverse poses and large viewpoint variations. Existing methods are
restricted in this setting as they estimate garment warping flows mainly based
on 2D poses and appearance, which omits the geometric prior of the 3D human
body shape. Moreover, current garment warping methods are confined to localized
regions, which makes them ineffective in capturing long-range dependencies and
results in inferior flows with artifacts. To tackle these issues, we present
3D-aware global correspondences, which are reliable flows that jointly encode
global semantic correlations, local deformations, and geometric priors of 3D
human bodies. Particularly, given an image pair depicting the source and target
person, (a) we first obtain their pose-aware and high-level representations via
two encoders, and introduce a coarse-to-fine decoder with multiple refinement
modules to predict the pixel-wise global correspondence. (b) 3D parametric
human models inferred from images are incorporated as priors to regularize the
correspondence refinement process so that our flows can be 3D-aware and better
handle variations of pose and viewpoint. (c) Finally, an adversarial generator
takes the garment warped by the 3D-aware flow, and the image of the target
person as inputs, to synthesize the photo-realistic try-on result. Extensive
experiments on public benchmarks and our HardPose test set demonstrate the
superiority of our method against the SOTA try-on approaches.
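To make steps (a)-(c) concrete, below is a minimal PyTorch sketch of the pipeline's data flow. All module names, layer shapes, and the flow convention are illustrative assumptions rather than the authors' released code, and step (b)'s regularization by a 3D parametric body prior is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp_by_flow(garment: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp a garment image with a dense correspondence (flow) field.

    garment: (B, 3, H, W); flow: (B, 2, H, W) per-pixel offsets in pixels.
    """
    B, _, H, W = garment.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H, device=garment.device),
        torch.linspace(-1, 1, W, device=garment.device),
        indexing="ij",
    )
    base = torch.stack((xs, ys), dim=-1).expand(B, H, W, 2)
    # Convert pixel offsets to normalized offsets and sample.
    offset = torch.stack(
        (flow[:, 0] * 2.0 / max(W - 1, 1), flow[:, 1] * 2.0 / max(H - 1, 1)),
        dim=-1,
    )
    return F.grid_sample(garment, base + offset, align_corners=True)

class TryOnSkeleton(nn.Module):
    """Steps (a)-(c) as single-layer stubs; the real encoders, coarse-to-fine
    decoder, and generator are much deeper, and the SMPL-based prior that
    regularizes flow refinement (step b) is not modeled here."""

    def __init__(self, feat: int = 64):
        super().__init__()
        self.src_enc = nn.Conv2d(3, feat, 3, padding=1)        # source-person encoder (stub)
        self.tgt_enc = nn.Conv2d(3, feat, 3, padding=1)        # target-person encoder (stub)
        self.flow_head = nn.Conv2d(2 * feat, 2, 3, padding=1)  # coarse-to-fine decoder (stub)
        self.generator = nn.Conv2d(6, 3, 3, padding=1)         # adversarial generator (stub)

    def forward(self, source: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([self.src_enc(source), self.tgt_enc(target)], dim=1)
        flow = self.flow_head(feats)         # (a) predict pixel-wise global correspondence
        warped = warp_by_flow(source, flow)  # warp the source garment by the flow
        return self.generator(torch.cat([warped, target], dim=1))  # (c) synthesize try-on

out = TryOnSkeleton()(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 3, 64, 64])
```

The key operation is `F.grid_sample`, which applies the predicted dense correspondence as a differentiable warp, so the flow decoder can in principle be trained end-to-end with the generator.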
Related papers
- FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis [51.193297565630886]
The challenge of accurately inferring texture remains, particularly in obscured areas such as the back of a person in frontal-view images.
This limitation in texture prediction largely stems from the scarcity of large-scale and diverse 3D datasets.
We propose leveraging extensive 2D fashion datasets to enhance both texture and shape prediction in 3D human digitization.
arXiv Detail & Related papers (2024-10-13T01:25:05Z)
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- Personalized 3D Human Pose and Shape Refinement [19.082329060985455]
Regression-based methods have dominated the field of 3D human pose and shape estimation.
We propose to construct dense correspondences between initial human model estimates and the corresponding images.
We show that our approach not only consistently leads to better image-model alignment, but also to improved 3D accuracy.
arXiv Detail & Related papers (2024-03-18T10:13:53Z)
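As a rough illustration of the refinement idea in the entry above, this hedged Python sketch optimizes the parameters of a stand-in linear "body model" so that its projected vertices match dense 2D correspondences; `project_vertices`, `BASIS`, and all constants are hypothetical placeholders for a real parametric model such as SMPL plus a camera.

```python
import torch

torch.manual_seed(0)
N_VERTS, N_PARAMS = 64, 10
BASIS = torch.randn(N_VERTS, 2, N_PARAMS)  # stand-in linear "body model + camera"

def project_vertices(params: torch.Tensor) -> torch.Tensor:
    """Hypothetical differentiable model + projection: params -> (N_VERTS, 2)."""
    return BASIS @ params

# Dense correspondences: an observed 2D image location for each model vertex.
observed_2d = project_vertices(torch.randn(N_PARAMS))  # synthetic "observations"

params = torch.zeros(N_PARAMS, requires_grad=True)     # initial (coarse) estimate
opt = torch.optim.Adam([params], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = (project_vertices(params) - observed_2d).pow(2).mean()  # reprojection error
    loss.backward()
    opt.step()
print(loss.item())  # should be near zero once the model aligns with the image
```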
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only.
PONet estimates the 3D orientations of human limbs by taking advantage of the local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z)
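As a hedged illustration of the orientation-only idea in the PONet entry above: once each limb's 3D direction is predicted, joint positions follow from bone lengths by walking the kinematic tree. The toy skeleton topology and bone lengths below are assumptions, not PONet's.

```python
import numpy as np

# PARENT[j] is the parent of joint j; joint 0 is the root (pelvis).
PARENT = [-1, 0, 1, 2, 0, 4, 5]                        # toy 7-joint skeleton
BONE_LEN = [0.0, 0.30, 0.30, 0.20, 0.40, 0.40, 0.20]   # metres, illustrative

def pose_from_orientations(directions: np.ndarray) -> np.ndarray:
    """directions: (7, 3) predicted bone direction vectors (entry 0 unused)."""
    joints = np.zeros((len(PARENT), 3))
    for j in range(1, len(PARENT)):
        d = directions[j] / np.linalg.norm(directions[j])  # unit direction
        joints[j] = joints[PARENT[j]] + BONE_LEN[j] * d    # walk the kinematic tree
    return joints

dirs = np.tile([0.0, -1.0, 0.0], (7, 1))  # e.g. every bone pointing straight down
print(pose_from_orientations(dirs))
```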
- SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation [46.85865451812981]
We propose a novel system that first regresses a set of 2.5D representations of body parts and then reconstructs the 3D absolute poses based on these 2.5D representations with a depth-aware part association algorithm.
Such a single-shot bottom-up scheme allows the system to better learn and reason about the inter-person depth relationship, improving both 3D and 2D pose estimation.
arXiv Detail & Related papers (2020-08-26T09:56:07Z)
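The 2.5D-to-3D step described in the SMAP entry above reduces, per joint, to pinhole back-projection once an absolute depth (root depth plus predicted relative depth) is available. The intrinsics and depth values in this sketch are illustrative assumptions.

```python
import numpy as np

def backproject(u: float, v: float, z: float,
                fx: float = 1000.0, fy: float = 1000.0,
                cx: float = 640.0, cy: float = 360.0) -> np.ndarray:
    """Pinhole back-projection of a pixel (u, v) at metric depth z."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

root_depth = 3.2   # estimated root (pelvis) depth in metres
rel_depth = -0.15  # predicted depth of a joint relative to the root
print(backproject(700.0, 400.0, root_depth + rel_depth))
```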
- Learning 3D Human Shape and Pose from Dense Body Parts [117.46290013548533]
We propose a Decompose-and-aggregate Network (DaNet) to learn 3D human shape and pose from dense correspondences of body parts.
Messages from local streams are aggregated to enhance the robust prediction of the rotation-based poses.
Our method is validated on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and 3DPW.
arXiv Detail & Related papers (2019-12-31T15:09:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.