MV-TON: Memory-based Video Virtual Try-on network
- URL: http://arxiv.org/abs/2108.07502v1
- Date: Tue, 17 Aug 2021 08:35:23 GMT
- Authors: Xiaojing Zhong, Zhonghua Wu, Taizhe Tan, Guosheng Lin, Qingyao Wu
- Abstract summary: We propose a Memory-based Video virtual Try-On Network (MV-TON), which seamlessly transfers desired clothes to a target person without using any clothing templates and generates high-resolution, realistic videos.
Experimental results show the effectiveness of our method in the video virtual try-on task and its superiority over existing methods.
- Score: 49.496817042974456
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the development of Generative Adversarial Networks, image-based virtual
try-on methods have made great progress. However, limited work has explored the
task of video-based virtual try-on, even though it is important in real-world
applications. Most existing video-based virtual try-on methods require clothing
templates and can only generate blurred, low-resolution results. To address these
challenges, we propose a Memory-based Video virtual Try-On Network (MV-TON), which
seamlessly transfers desired clothes to a target person without using any clothing
templates and generates high-resolution realistic videos. Specifically, MV-TON
consists of two modules: 1) a try-on module that transfers the desired clothes from
model images to frame images by pose alignment and region-wise replacing of pixels;
2) a memory refinement module that learns to embed the existing generated frames
into the latent space as external memory for the following frame generation.
Experimental results show the effectiveness of our method in the video virtual
try-on task and its superiority over existing methods.
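To make the two-module design concrete, here is a minimal PyTorch-style sketch of both ideas: region-wise pixel replacement for the try-on step, and an external latent memory that the next frame's features attend over. All module names, shapes, and the attention formulation are illustrative assumptions based only on the abstract, not the authors' implementation.

```python
# Minimal sketch of the two MV-TON ideas described above (illustrative only,
# not the authors' code): region-wise pixel replacement and an external
# latent memory read via attention. Shapes and module names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def region_wise_replace(frame, warped_clothes, clothes_mask):
    """Composite pose-aligned (warped) clothes into the frame's clothing region.

    frame, warped_clothes: (B, 3, H, W); clothes_mask: (B, 1, H, W) in [0, 1].
    """
    return clothes_mask * warped_clothes + (1.0 - clothes_mask) * frame


class MemoryRefinement(nn.Module):
    """Keeps latents of previously generated frames as an external memory
    and refines the current frame's features by attending over them."""

    def __init__(self, channels=256, mem_size=8):
        super().__init__()
        self.mem_size = mem_size                       # sliding window of past frames
        self.encoder = nn.Conv2d(3, channels, 3, padding=1)  # placeholder frame encoder
        self.to_q = nn.Conv2d(channels, channels, 1)
        self.to_k = nn.Conv2d(channels, channels, 1)
        self.to_v = nn.Conv2d(channels, channels, 1)
        self.memory = []                               # list of (1, C, H, W) latents

    @torch.no_grad()
    def write(self, generated_frame):
        """Embed a generated frame (1, 3, H, W) and append it to the memory."""
        self.memory.append(self.encoder(generated_frame))
        if len(self.memory) > self.mem_size:
            self.memory.pop(0)                         # drop the oldest latent

    def read(self, feat):
        """Refine current-frame features (B, C, H, W) with a memory read."""
        if not self.memory:
            return feat
        mem = torch.cat(self.memory, dim=0)            # (T, C, H, W)
        b, c, h, w = feat.shape
        q = self.to_q(feat).flatten(2).transpose(1, 2)                   # (B, HW, C)
        k = self.to_k(mem).flatten(2).transpose(1, 2).reshape(1, -1, c)  # (1, T*HW, C)
        v = self.to_v(mem).flatten(2).transpose(1, 2).reshape(1, -1, c)
        attn = F.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)       # (B, HW, T*HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return feat + out                              # residual refinement
```

In this sketch, `write` would be called after each frame is generated and `read` inside the generator when synthesizing the next frame; the actual MV-TON memory module is learned end-to-end and its exact formulation may differ.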
Related papers
- Fashion-VDM: Video Diffusion Model for Virtual Try-On [17.284966713669927]
We present Fashion-VDM, a video diffusion model (VDM) for generating virtual try-on videos.
Given an input garment image and person video, our method aims to generate a high-quality try-on video of the person wearing the given garment.
arXiv Detail & Related papers (2024-10-31T21:52:33Z) - WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models [132.77237314239025]
Video virtual try-on aims to generate realistic sequences that maintain garment identity and adapt to a person's pose and body shape in source videos.
Traditional image-based methods, relying on warping and blending, struggle with complex human movements and occlusions.
We reconceptualize video try-on as a process of generating videos conditioned on garment descriptions and human motion.
Our solution, WildVidFit, employs image-based controlled diffusion models for a streamlined, one-stage approach.
arXiv Detail & Related papers (2024-07-15T11:21:03Z) - ViViD: Video Virtual Try-on using Diffusion Models [46.710863047471264]
Video virtual try-on aims to transfer a clothing item onto the video of a target person.
Previous video-based try-on solutions can only generate blurry, low-visual-quality results.
We present ViViD, a novel framework employing powerful diffusion models to tackle the task of video virtual try-on.
arXiv Detail & Related papers (2024-05-20T05:28:22Z) - MV-VTON: Multi-View Virtual Try-On with Diffusion Models [91.71150387151042]
The goal of image-based virtual try-on is to generate an image of the target person naturally wearing the given clothing.
Existing methods focus solely on frontal try-on using frontal clothing.
We introduce Multi-View Virtual Try-ON (MV-VTON), which aims to reconstruct the dressing results from multiple views using the given clothes.
arXiv Detail & Related papers (2024-04-26T12:27:57Z) - MoVideo: Motion-Aware Video Generation with Diffusion Models [97.03352319694795]
We propose a novel motion-aware generation (MoVideo) framework that takes motion into consideration from two aspects: video depth and optical flow.
MoVideo achieves state-of-the-art results in both text-to-video and image-to-video generation, showing promising prompt consistency, frame consistency and visual quality.
arXiv Detail & Related papers (2023-11-19T13:36:03Z) - Multi-object Video Generation from Single Frame Layouts [84.55806837855846]
We propose a video generative framework capable of synthesizing global scenes with local objects.
Our framework is a non-trivial adaptation of image generation methods and is new to this field.
Our model has been evaluated on two widely-used video recognition benchmarks.
arXiv Detail & Related papers (2023-05-06T09:07:01Z) - ClothFormer:Taming Video Virtual Try-on in All Module [12.084652803378598]
Video virtual try-on aims to fit the target clothes to a person in the video with spatio-temporally consistent results.
The ClothFormer framework successfully synthesizes realistic, temporally consistent results in complicated environments.
arXiv Detail & Related papers (2022-04-26T08:40:28Z) - SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On [14.198545992098309]
SieveNet is a framework for robust image-based virtual try-on.
We introduce a multi-stage coarse-to-fine warping network to better model fine-grained intricacies.
We also introduce a try-on-cloth-conditioned segmentation mask prior to improve the texture transfer network (a rough warping sketch follows this list).
arXiv Detail & Related papers (2020-01-17T12:33:54Z)