GLocal: Global Graph Reasoning and Local Structure Transfer for Person
Image Generation
- URL: http://arxiv.org/abs/2112.00263v1
- Date: Wed, 1 Dec 2021 03:54:30 GMT
- Title: GLocal: Global Graph Reasoning and Local Structure Transfer for Person
Image Generation
- Authors: Liyuan Ma, Kejie Huang, Dongxu Wei, Haibin Shen
- Abstract summary: We focus on person image generation, namely, generating person images under various conditions, e.g., corrupted texture or different pose.
We present a GLocal framework to improve the occlusion-aware texture estimation by globally reasoning the style inter-correlations among different semantic regions.
For local structural information preservation, we further extract the local structure of the source image and regain it in the generated image via local structure transfer.
- Score: 2.580765958706854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we focus on person image generation, namely, generating person
images under various conditions, e.g., corrupted texture or different pose. To
address texture occlusion and large pose misalignment in this task, previous
works just use the corresponding region's style to infer the occluded area and
rely on point-wise alignment to reorganize the context texture information,
lacking the ability to globally correlate the region-wise style codes and
preserve the local structure of the source. To tackle these problems, we
present a GLocal framework to improve the occlusion-aware texture estimation by
globally reasoning the style inter-correlations among different semantic
regions, which can also be employed to recover the corrupted images in texture
inpainting. For local structural information preservation, we further extract
the local structure of the source image and regain it in the generated image
via local structure transfer. We benchmark our method to fully characterize its
performance on the DeepFashion dataset and present extensive ablation studies that
highlight the novelty of our method.
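The abstract describes inferring the style of occluded regions by reasoning over the correlations among all semantic regions. The following is a minimal sketch of that general idea only, not the authors' implementation: one round of masked message passing in which each occluded region's style code becomes a softmax-weighted average of the visible regions' codes. The function name `reason_over_regions`, the `affinity` matrix, and the toy data are all assumptions made for illustration; a real model would learn the edge scores.

```python
import math


def reason_over_regions(codes, visible, affinity):
    """One round of masked message passing over region style codes.

    codes:    K style vectors (lists of floats), one per semantic region
    visible:  K booleans; False marks a region occluded in the source image
    affinity: K x K raw edge scores between regions (hypothetical input;
              a real model would learn these)
    Returns K codes; each occluded region's code is replaced by a
    softmax-weighted average of the visible regions' codes.
    """
    sources = [j for j, v in enumerate(visible) if v]
    if not sources:
        raise ValueError("at least one region must be visible")
    dim = len(codes[0])
    result = []
    for i in range(len(codes)):
        if visible[i]:
            result.append(list(codes[i]))  # keep observed styles unchanged
            continue
        # Softmax over the edges that point to visible regions only.
        peak = max(affinity[i][j] for j in sources)
        weights = [math.exp(affinity[i][j] - peak) for j in sources]
        total = sum(weights)
        mixed = [0.0] * dim
        for w, j in zip(weights, sources):
            for d in range(dim):
                mixed[d] += (w / total) * codes[j][d]
        result.append(mixed)
    return result


# Toy example: three regions, the third one is occluded.
codes = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
visible = [True, True, False]
affinity = [[0.0, 0.0, 0.0] for _ in range(3)]  # uniform edge scores
filled = reason_over_regions(codes, visible, affinity)
```

With uniform edge scores, the occluded region simply receives the mean of the two visible codes; non-uniform scores would bias it toward the more strongly related region.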
Related papers
- GeoCLIP: Clip-Inspired Alignment between Locations and Images for
Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
- Learning with Multi-modal Gradient Attention for Explainable Composed
Image Retrieval [15.24270990274781]
We propose a new gradient-attention-based learning objective that explicitly forces the model to focus on the local regions of interest being modified in each retrieval step.
We show how MMGrad can be incorporated into an end-to-end model training strategy with a new learning objective that explicitly forces these MMGrad attention maps to highlight the correct local regions corresponding to the modifier text.
arXiv Detail & Related papers (2023-08-31T11:46:27Z)
- Semantic Image Translation for Repairing the Texture Defects of Building
Models [16.764719266178655]
We introduce a novel approach for synthesizing façade texture images that authentically reflect the architectural style from a structured label map.
Our proposed method is also capable of synthesizing texture images with specific styles for façades that lack pre-existing textures.
arXiv Detail & Related papers (2023-03-30T14:38:53Z)
- Arbitrary Style Transfer with Structure Enhancement by Combining the
Global and Local Loss [51.309905690367835]
We introduce a novel arbitrary style transfer method with structure enhancement by combining the global and local loss.
Experimental results demonstrate that our method can generate higher-quality images with impressive visual effects.
arXiv Detail & Related papers (2022-07-23T07:02:57Z)
- Image Harmonization by Matching Regional References [10.249228010611617]
Recent image harmonization methods typically summarize the appearance pattern of global background and apply it to the global foreground without location discrepancy.
For a real image, the appearances (illumination, color temperature, saturation, hue, texture, etc.) of different regions can vary significantly.
Previous methods, which transfer the appearance globally, are not optimal.
arXiv Detail & Related papers (2022-04-10T16:23:06Z)
- Context-Aware Image Inpainting with Learned Semantic Priors [100.99543516733341]
We introduce pretext tasks that are semantically meaningful to estimating the missing contents.
We propose a context-aware image inpainting model, which adaptively integrates global semantics and local features.
arXiv Detail & Related papers (2021-06-14T08:09:43Z)
- Controllable Person Image Synthesis with Spatially-Adaptive Warped
Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
- Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed
Scenes [54.836331922449666]
We propose a Semantic Guidance and Evaluation Network (SGE-Net) to update the structural priors and the inpainted image.
It uses a semantic segmentation map as guidance at each scale of inpainting, under which location-dependent inferences are re-evaluated.
Experiments on real-world images of mixed scenes demonstrated the superiority of our proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-15T17:49:20Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.