FDA-GAN: Flow-based Dual Attention GAN for Human Pose Transfer
- URL: http://arxiv.org/abs/2112.00281v1
- Date: Wed, 1 Dec 2021 05:10:37 GMT
- Title: FDA-GAN: Flow-based Dual Attention GAN for Human Pose Transfer
- Authors: Liyuan Ma, Kejie Huang, Dongxu Wei, Zhaoyan Ming, Haibin Shen
- Abstract summary: We propose a Flow-based Dual Attention GAN (FDA-GAN) to apply occlusion- and deformation-aware feature fusion for higher generation quality.
To maintain pose and global position consistency during transfer, we design a pose normalization network for learning adaptive normalization from the target pose to the source person.
Both qualitative and quantitative results show that our method outperforms state-of-the-art models on the public iPER and DeepFashion datasets.
- Score: 3.08426078422188
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human pose transfer aims at transferring the appearance of the source person
to the target pose. Existing methods utilizing flow-based warping for non-rigid
human image generation have achieved great success. However, they fail to
preserve the appearance details in synthesized images since the spatial
correlation between the source and target is not fully exploited. To this end,
we propose the Flow-based Dual Attention GAN (FDA-GAN) to apply occlusion- and
deformation-aware feature fusion for higher generation quality. Specifically,
deformable local attention and flow similarity attention, constituting the dual
attention mechanism, can derive the output features responsible for deformation-
and occlusion-aware fusion, respectively. In addition, to maintain pose and
global position consistency during transfer, we design a pose normalization
network for learning adaptive normalization from the target pose to the source
person. Both qualitative and quantitative results show that our method
outperforms state-of-the-art models on the public iPER and DeepFashion datasets.
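As an illustrative sketch only (the paper's code is not part of this listing), the occlusion-aware half of the dual attention idea can be pictured in a few lines of PyTorch: warp the source features with a predicted flow, score each location by the similarity between the warped source and the target features, and use that score as a soft visibility mask for fusion. All names, shapes, and the sigmoid gating below are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def warp(feat, flow):
    # feat: (B, C, H, W); flow: (B, 2, H, W) in pixel offsets (assumed convention)
    B, _, H, W = feat.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(feat.device)  # (2, H, W)
    coords = grid.unsqueeze(0) + flow                            # absolute sample positions
    # normalize to [-1, 1] as expected by grid_sample
    coords_x = 2.0 * coords[:, 0] / max(W - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(H - 1, 1) - 1.0
    grid_norm = torch.stack((coords_x, coords_y), dim=-1)        # (B, H, W, 2)
    return F.grid_sample(feat, grid_norm, align_corners=True)

def flow_similarity_fusion(src_feat, tgt_feat, flow):
    """Occlusion-aware fusion: where the warped source matches the target
    features poorly (likely occlusion), fall back to the target pathway."""
    warped = warp(src_feat, flow)
    sim = F.cosine_similarity(warped, tgt_feat, dim=1, eps=1e-6)  # (B, H, W)
    mask = torch.sigmoid(sim).unsqueeze(1)                        # soft visibility mask
    return mask * warped + (1.0 - mask) * tgt_feat

# toy usage
src = torch.randn(1, 64, 32, 32)
tgt = torch.randn(1, 64, 32, 32)
flow = torch.zeros(1, 2, 32, 32)  # identity flow, for illustration only
out = flow_similarity_fusion(src, tgt, flow)
print(out.shape)  # torch.Size([1, 64, 32, 32])
```

In the paper's framing, the deformation-aware half is handled by a separate deformable local attention branch; the sketch above only illustrates the occlusion-aware fusion path.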
Related papers
- Consistent Human Image and Video Generation with Spatially Conditioned Diffusion [82.4097906779699]
Consistent human-centric image and video synthesis aims to generate images with new poses while preserving appearance consistency with a given reference image.
We frame the task as a spatially conditioned inpainting problem, where the target image is inpainted to maintain appearance consistency with the reference.
This approach enables the reference features to guide the generation of pose-compliant targets within a unified denoising network.
arXiv Detail & Related papers (2024-12-19T05:02:30Z)
- Learning Flow Fields in Attention for Controllable Person Image Generation [59.10843756343987]
Controllable person image generation aims to generate a person image conditioned on reference images.
We propose learning flow fields in attention (Leffa), which explicitly guides the target query to attend to the correct reference key.
Leffa achieves state-of-the-art performance in controlling appearance (virtual try-on) and pose (pose transfer), significantly reducing fine-grained detail distortion.
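As a generic illustration of reading a flow field out of attention (an assumption-laden toy, not Leffa's actual formulation): softmax attention over reference positions is a distribution per target query, and its expected (x, y) coordinate acts as a flow field telling each query where it samples from.

```python
import torch

def attention_flow(q, k, H, W):
    """q, k: (B, HW, C) target queries and reference keys.
    Returns the attention map and the flow it implies, i.e. the
    attention-weighted average of reference (x, y) coordinates."""
    attn = torch.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)  # (B, HW, HW)
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    coords = torch.stack((xs, ys), dim=-1).float().reshape(1, H * W, 2)       # (1, HW, 2)
    flow = attn @ coords.to(q.device)  # expected source position per target query
    return attn, flow.reshape(-1, H, W, 2)

B, H, W, C = 1, 16, 16, 64
q = torch.randn(B, H * W, C)
k = torch.randn(B, H * W, C)
attn, flow = attention_flow(q, k, H, W)
print(flow.shape)  # torch.Size([1, 16, 16, 2])
```

Per the summary, Leffa's contribution is to regularize such attention so queries attend to the correct reference keys; the code above only shows how an attention map induces a flow field.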
arXiv Detail & Related papers (2024-12-11T15:51:14Z)
- Fusion Embedding for Pose-Guided Person Image Synthesis with Diffusion Model [2.7708222692419735]
Pose-Guided Person Image Synthesis (PGPIS) aims to synthesize high-quality person images corresponding to target poses.
Most approaches involve extracting representations of the target pose and source image.
We propose Fusion Embedding for PGPIS using a Diffusion Model (FPDM).
arXiv Detail & Related papers (2024-12-10T09:25:01Z)
- Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models [13.019535928387702]
This paper presents Progressive Conditional Diffusion Models (PCDMs) that incrementally bridge the gap between person images under the target and source poses through three stages.
Both qualitative and quantitative results demonstrate the consistency and photorealism of our proposed PCDMs under challenging scenarios.
arXiv Detail & Related papers (2023-10-10T05:13:17Z)
- Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning [70.75369367311897]
3D-aware global correspondences are reliable flows that jointly encode global semantic correlations, local deformations, and geometric priors of 3D human bodies.
An adversarial generator takes the garment warped by the 3D-aware flow and the image of the target person as inputs to synthesize the photo-realistic try-on result.
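A toy sketch of that two-input generator interface, with the flow warp reduced to an identity placeholder (all module names and shapes here are hypothetical, not the paper's architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TryOnGenerator(nn.Module):
    """Toy generator: consumes the flow-warped garment and the target
    person image, concatenated along channels, and predicts the try-on
    result (adversarial training is omitted)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, warped_garment, person):
        return self.net(torch.cat((warped_garment, person), dim=1))

garment = torch.randn(1, 3, 64, 64)
person = torch.randn(1, 3, 64, 64)
# identity warp as a stand-in for the learned 3D-aware flow
grid = F.affine_grid(torch.eye(2, 3).unsqueeze(0), garment.shape, align_corners=False)
warped = F.grid_sample(garment, grid, align_corners=False)
print(TryOnGenerator()(warped, person).shape)  # torch.Size([1, 3, 64, 64])
```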
arXiv Detail & Related papers (2022-11-25T12:16:21Z)
- Human Pose Transfer with Augmented Disentangled Feature Consistency [28.744108771350078]
We propose a pose transfer network with augmented Disentangled Feature Consistency (DFC-Net) to facilitate human pose transfer.
Given a pair of images containing the source and target person, DFC-Net extracts pose and static information from the source and target, respectively, then synthesizes an image of the target person with the desired pose from the source.
arXiv Detail & Related papers (2021-07-23T01:25:07Z)
- Transformer-Based Source-Free Domain Adaptation [134.67078085569017]
We study the task of source-free domain adaptation (SFDA), where the source data are not available during target adaptation.
We propose a generic and effective framework based on Transformer, named TransDA, for learning a generalized model for SFDA.
arXiv Detail & Related papers (2021-05-28T23:06:26Z)
- Progressive and Aligned Pose Attention Transfer for Person Image Generation [59.87492938953545]
This paper proposes a new generative adversarial network for pose transfer, i.e., transferring the pose of a given person to a target pose.
We use two types of blocks, namely the Pose-Attentional Transfer Block (PATB) and the Aligned Pose-Attentional Transfer Block (APATB).
We verify the efficacy of the model on the Market-1501 and DeepFashion datasets, using quantitative and qualitative measures.
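For intuition, a simplified block in the spirit of a pose-attentional transfer block (not the paper's exact design): pose features emit a sigmoid attention mask that gates a residual update of the image features.

```python
import torch
import torch.nn as nn

class PoseAttentionalBlock(nn.Module):
    """Simplified PATB-style block (illustrative only): the pose pathway
    emits an attention mask selecting where the image features update."""
    def __init__(self, c):
        super().__init__()
        self.img_conv = nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
                                      nn.Conv2d(c, c, 3, padding=1))
        self.pose_conv = nn.Conv2d(c, c, 3, padding=1)

    def forward(self, f_img, f_pose):
        mask = torch.sigmoid(self.pose_conv(f_pose))  # (B, C, H, W) in (0, 1)
        return f_img + mask * self.img_conv(f_img)    # gated residual update

block = PoseAttentionalBlock(32)
out = block(torch.randn(2, 32, 16, 16), torch.randn(2, 32, 16, 16))
print(out.shape)  # torch.Size([2, 32, 16, 16])
```

Cascading several such blocks lets the pose pathway progressively steer the image features toward the target pose, which is the progressive design the paper's title refers to.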
arXiv Detail & Related papers (2021-03-22T07:24:57Z)
- Structure-aware Person Image Generation with Pose Decomposition and Semantic Correlation [29.727033198797518]
We propose a structure-aware flow-based method for high-quality person image generation.
We decompose the human body into different semantic parts and apply different networks to predict the flow fields for these parts separately.
Our method can generate high-quality results under large pose discrepancy and outperforms state-of-the-art methods in both qualitative and quantitative comparisons.
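A compact sketch of that part-wise decomposition (class and parameter names are assumptions): one small flow head per semantic part, with the final flow composited as a mask-weighted sum.

```python
import torch
import torch.nn as nn

class PartwiseFlowPredictor(nn.Module):
    """Illustrative only: one flow head per semantic body part; the final
    flow is the mask-weighted sum of the per-part predictions."""
    def __init__(self, c, num_parts):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Conv2d(c, 2, 3, padding=1) for _ in range(num_parts)
        )

    def forward(self, feat, part_masks):
        # feat: (B, C, H, W); part_masks: (B, P, H, W), soft, summing to ~1
        flows = torch.stack([h(feat) for h in self.heads], dim=1)  # (B, P, 2, H, W)
        return (part_masks.unsqueeze(2) * flows).sum(dim=1)        # (B, 2, H, W)

pred = PartwiseFlowPredictor(c=32, num_parts=4)
feat = torch.randn(1, 32, 24, 24)
masks = torch.softmax(torch.randn(1, 4, 24, 24), dim=1)
print(pred(feat, masks).shape)  # torch.Size([1, 2, 24, 24])
```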
arXiv Detail & Related papers (2021-02-05T03:07:57Z)
- PoNA: Pose-guided Non-local Attention for Human Pose Transfer [105.14398322129024]
We propose a new human pose transfer method using a generative adversarial network (GAN) with simplified cascaded blocks.
Our model generates sharper and more realistic images with rich details while using fewer parameters and running faster.
arXiv Detail & Related papers (2020-12-13T12:38:29Z)
- Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis [58.05389586712485]
We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis.
In this paper, we propose a 3D body mesh recovery module to disentangle the pose and shape.
We also build a new dataset, namely iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis.
arXiv Detail & Related papers (2020-11-18T02:57:47Z)
- Neural Pose Transfer by Spatially Adaptive Instance Normalization [73.04483812364127]
We propose the first neural pose transfer model that solves pose transfer via the latest techniques in image style transfer.
Our model does not require any correspondences between the source and target meshes.
Experiments show that the proposed model can effectively transfer deformation from source to target meshes, and has good generalization ability to deal with unseen identities or poses of meshes.
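Roughly, the style-transfer-style conditioning can be pictured as a spatially adaptive instance normalization over mesh vertices: normalize the identity-mesh features, then modulate them with per-vertex scale and shift predicted from the pose-mesh features. A toy 1D sketch (not the paper's code):

```python
import torch
import torch.nn as nn

class SPAdaIN(nn.Module):
    """Toy spatially adaptive instance norm over mesh vertices:
    normalize identity features, then modulate them with per-vertex
    scale/shift predicted from the pose features."""
    def __init__(self, c):
        super().__init__()
        self.norm = nn.InstanceNorm1d(c, affine=False)
        self.gamma = nn.Conv1d(c, c, 1)
        self.beta = nn.Conv1d(c, c, 1)

    def forward(self, f_id, f_pose):
        # f_id, f_pose: (B, C, V) features over V mesh vertices
        return self.gamma(f_pose) * self.norm(f_id) + self.beta(f_pose)

layer = SPAdaIN(64)
f_id = torch.randn(1, 64, 1000)
f_pose = torch.randn(1, 64, 1000)
print(layer(f_id, f_pose).shape)  # torch.Size([1, 64, 1000])
```

Because the modulation is computed per vertex rather than globally, no explicit correspondences between the source and target meshes are needed, consistent with the summary above.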
arXiv Detail & Related papers (2020-03-16T14:33:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.