ZFlow: Gated Appearance Flow-based Virtual Try-on with 3D Priors
- URL: http://arxiv.org/abs/2109.07001v1
- Date: Tue, 14 Sep 2021 22:41:14 GMT
- Title: ZFlow: Gated Appearance Flow-based Virtual Try-on with 3D Priors
- Authors: Ayush Chopra, Rishabh Jain, Mayur Hemani, Balaji Krishnamurthy
- Abstract summary: Image-based virtual try-on involves synthesizing convincing images of a model wearing a particular garment.
Recent methods involve a two-stage process: i) warping the garment to align with the model, and ii) fusing the texture of the warped garment with the target model to generate the try-on output.
The lack of geometric information about the model or the garment often results in improper rendering of granular details.
We propose ZFlow, an end-to-end framework, which seeks to alleviate these concerns.
- Score: 13.977100716044104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image-based virtual try-on involves synthesizing perceptually convincing
images of a model wearing a particular garment and has garnered significant
research interest due to its immense practical applicability. Recent methods
involve a two-stage process: i) warping of the garment to align with the model
ii) texture fusion of the warped garment and target model to generate the
try-on output. Issues arise due to the non-rigid nature of garments and the
lack of geometric information about the model or the garment. This often results
in improper rendering of granular details. We propose ZFlow, an end-to-end
framework, which seeks to alleviate these concerns regarding geometric and
textural integrity (such as pose, depth-ordering, skin and neckline
reproduction) through a combination of gated aggregation of hierarchical flow
estimates termed Gated Appearance Flow, and dense structural priors at various
stages of the network. ZFlow achieves state-of-the-art results as observed
qualitatively, and on quantitative benchmarks of image quality (PSNR, SSIM, and
FID). The paper presents extensive comparisons with other existing solutions
including a detailed user study and ablation studies to gauge the effect of
each of our contributions on multiple datasets.
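As a point of reference for the image-quality benchmarks named in the abstract, the following is a minimal sketch of PSNR, the simplest of the three metrics. It is not taken from the paper: the function name `psnr`, the nested-list image representation, and the 8-bit `max_value` default are assumptions made for illustration, and the paper's own evaluation protocol (image sizes, color handling, averaging) is not stated here.

```python
import math


def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized grayscale images.

    Images are nested lists (rows of pixel intensities). max_value is the
    largest possible intensity, assumed to be 255 for 8-bit images.
    """
    # Flatten both images into 1-D lists of pixel values.
    ref = [p for row in reference for p in row]
    cand = [p for row in candidate for p in row]
    if not ref or len(ref) != len(cand):
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((r - c) ** 2 for r, c in zip(ref, cand)) / len(ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical 2x2 ground-truth and synthesized patches.
    ground_truth = [[10, 20], [30, 40]]
    synthesized = [[12, 18], [33, 40]]
    print(f"PSNR: {psnr(ground_truth, synthesized):.2f} dB")
```

Higher PSNR means the synthesized image is closer to the ground truth pixel by pixel. SSIM and FID capture structural and distributional similarity respectively and need more machinery than this sketch covers.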
Related papers
- pAE: An Efficient Autoencoder Architecture for Modeling the Lateral Geniculate Nucleus by Integrating Feedforward and Feedback Streams in Human Visual System [0.716879432974126]
We introduce a deep convolutional model that closely approximates human visual information processing.
We aim to approximate the function for the lateral geniculate nucleus (LGN) area using a trained shallow convolutional model.
The pAE model achieves the final 99.26% prediction performance and demonstrates a notable improvement of around 28% over human results in the temporal mode.
(2024-09-20T16:33:01Z)
- Multi-View Unsupervised Image Generation with Cross Attention Guidance [23.07929124170851]
This paper introduces a novel pipeline for unsupervised training of a pose-conditioned diffusion model on single-category datasets.
We identify object poses by clustering the dataset through comparing visibility and locations of specific object parts.
Our model, MIRAGE, surpasses prior work in novel view synthesis on real images.
(2023-12-07T14:55:13Z)
- Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models [13.795706255966259]
This paper presents Progressive Conditional Diffusion Models (PCDMs) that incrementally bridge the gap between person images under the target and source poses through three stages.
Both qualitative and quantitative results demonstrate the consistency and photorealism of our proposed PCDMs under challenging scenarios.
(2023-10-10T05:13:17Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images [79.39247661907397]
We introduce an effective framework Generalizable Model-based Neural Radiance Fields to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
(2023-03-24T03:32:02Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
(2022-11-22T18:59:50Z)
- Unifying Flow, Stereo and Depth Estimation [121.54066319299261]
We present a unified formulation and model for three motion and 3D perception tasks.
We formulate all three tasks as a unified dense correspondence matching problem.
Our model naturally enables cross-task transfer since the model architecture and parameters are shared across tasks.
(2022-11-10T18:59:54Z)
- Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
(2022-07-19T10:01:31Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
(2021-02-04T14:26:42Z)
- Normalizing Flows with Multi-Scale Autoregressive Priors [131.895570212956]
We introduce channel-wise dependencies in the latent space of flow-based generative models through multi-scale autoregressive priors (mAR).
Our mAR prior for models with split coupling flow layers (mAR-SCF) can better capture dependencies in complex multimodal data.
We show that mAR-SCF allows for improved image generation quality, with gains in FID and Inception scores compared to state-of-the-art flow-based models.
(2020-04-08T09:07:11Z)