RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on
- URL: http://arxiv.org/abs/2204.11258v1
- Date: Sun, 24 Apr 2022 12:30:13 GMT
- Title: RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on
- Authors: Chao Lin, Zhao Li, Sheng Zhou, Shichang Hu, Jialun Zhang, Linhao Luo,
Jiarun Zhang, Longtao Huang, Yuan He
- Abstract summary: VTON aims at fitting target clothes to reference person images, which is widely adopted in e-commerce.
Existing VTON approaches can be narrowly categorized into Parser-Based (PB) and Parser-Free (PF).
We propose a novel PF method named Regional Mask Guided Network (RMGN).
- Score: 23.198926150193472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Virtual try-on (VTON) aims at fitting target clothes to reference person
images, and is widely adopted in e-commerce. Existing VTON approaches can be
narrowly categorized into Parser-Based (PB) and Parser-Free (PF) by whether they
rely on parser information to mask the person's clothes and synthesize
try-on images. Although abandoning parser information has improved the
applicability of PF methods, the ability to synthesize details has been
sacrificed. As a result, distraction from the original clothes may persist in
synthesized images, especially in complicated postures and high-resolution
applications. To address this issue, we propose a novel PF method
named Regional Mask Guided Network (RMGN). More specifically, a regional mask is
proposed to explicitly fuse the features of target clothes and reference
persons so that the persisted distraction can be eliminated. A posture
awareness loss and a multi-level feature extractor are further proposed to
handle complicated postures and synthesize high-resolution images.
Extensive experiments demonstrate that our proposed RMGN outperforms both
state-of-the-art PB and PF methods. Ablation studies further verify the
effectiveness of the modules in RMGN.
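The abstract's core idea, blending target-clothes features with reference-person features under an explicit regional mask, can be sketched as follows. This is a minimal illustration of the fusion concept only, not the paper's actual implementation; the function name, tensor shapes, and soft-mask convention are all assumptions.

```python
import numpy as np

def regional_mask_fuse(clothes_feat, person_feat, mask):
    """Blend two feature maps with a per-pixel regional mask.

    clothes_feat, person_feat: (C, H, W) feature maps.
    mask: (1, H, W) with values in [0, 1]; 1 keeps the clothes
          feature at that location, 0 keeps the person feature.
    """
    return mask * clothes_feat + (1.0 - mask) * person_feat

# Toy example: 2-channel, 2x2 feature maps.
clothes = np.ones((2, 2, 2))      # features from the target garment
person = np.zeros((2, 2, 2))      # features from the reference person
mask = np.array([[[1.0, 0.0],
                  [0.5, 0.0]]])   # garment region with a soft boundary

fused = regional_mask_fuse(clothes, person, mask)
# Locations where mask == 1 carry clothes features, mask == 0 carry
# person features, and intermediate values produce a soft blend.
```

Because the mask selects regions explicitly, features of the original garment outside the masked region cannot leak into the output, which is how the abstract frames eliminating the "persisted distraction".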
Related papers
- ForgeryGPT: Multimodal Large Language Model For Explainable Image Forgery Detection and Localization [49.992614129625274]
ForgeryGPT is a novel framework that advances the Image Forgery Detection and localization task.
It captures high-order correlations of forged images from diverse linguistic feature spaces.
It enables explainable generation and interactive dialogue through a newly customized Large Language Model (LLM) architecture.
arXiv Detail & Related papers (2024-10-14T07:56:51Z)
- MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection [64.29452783056253]
The rapid development of photo-realistic face generation methods has raised significant concerns in society and academia.
Although existing approaches mainly capture face forgery patterns using image modality, other modalities like fine-grained noises and texts are not fully explored.
We propose a novel multi-modal fine-grained CLIP (MFCLIP) model, which mines comprehensive and fine-grained forgery traces across image-noise modalities.
arXiv Detail & Related papers (2024-09-15T13:08:59Z)
- HARIS: Human-Like Attention for Reference Image Segmentation [5.808325471170541]
We propose a referring image segmentation method called HARIS, which introduces the Human-Like Attention mechanism.
Our method achieves state-of-the-art performance and great zero-shot ability.
arXiv Detail & Related papers (2024-05-17T11:29:23Z)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS)
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z)
- PFDM: Parser-Free Virtual Try-on via Diffusion Model [28.202996582963184]
We propose a parser-free virtual try-on method based on the diffusion model (PFDM).
Given two images, PFDM can "wear" garments on the target person seamlessly by implicitly warping without any other information.
Experiments demonstrate that our proposed PFDM can successfully handle complex images, and outperforms both state-of-the-art parser-free and parser-based models.
arXiv Detail & Related papers (2024-02-05T14:32:57Z)
- Towards Effective Image Manipulation Detection with Proposal Contrastive Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection.
Our PCL consists of a two-stream architecture by extracting two types of global features from RGB and noise views respectively.
Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
arXiv Detail & Related papers (2022-10-16T13:30:13Z)
- Dual Spoof Disentanglement Generation for Face Anti-spoofing with Depth Uncertainty Learning [54.15303628138665]
Face anti-spoofing (FAS) plays a vital role in preventing face recognition systems from presentation attacks.
Existing face anti-spoofing datasets lack diversity due to the insufficient identity and insignificant variance.
We propose Dual Spoof Disentanglement Generation framework to tackle this challenge by "anti-spoofing via generation"
arXiv Detail & Related papers (2021-12-01T15:36:59Z)
- TransRPPG: Remote Photoplethysmography Transformer for 3D Mask Face Presentation Attack Detection [53.98866801690342]
3D mask face presentation attack detection (PAD) plays a vital role in securing face recognition systems from 3D mask attacks.
We propose a pure rPPG transformer (TransRPPG) framework for learning live intrinsicness representation efficiently.
Our TransRPPG is lightweight and efficient (with only 547K parameters and 763MOPs), which is promising for mobile-level applications.
arXiv Detail & Related papers (2021-04-15T12:33:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.