Global Context-Aware Person Image Generation
- URL: http://arxiv.org/abs/2302.14728v1
- Date: Tue, 28 Feb 2023 16:34:55 GMT
- Title: Global Context-Aware Person Image Generation
- Authors: Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein
- Abstract summary: We propose a data-driven approach for context-aware person image generation.
In our method, the position, scale, and appearance of the generated person are semantically conditioned on the existing persons in the scene.
- Score: 24.317541784957285
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a data-driven approach for context-aware person image generation.
Specifically, we attempt to generate a person image such that the synthesized
instance can blend into a complex scene. In our method, the position, scale,
and appearance of the generated person are semantically conditioned on the
existing persons in the scene. The proposed technique is divided into three
sequential steps. At first, we employ a Pix2PixHD model to infer a coarse
semantic mask that represents the new person's spatial location, scale, and
potential pose. Next, we use a data-centric approach to select the closest
representation from a precomputed cluster of fine semantic masks. Finally, we
adopt a multi-scale, attention-guided architecture to transfer the appearance
attributes from an exemplar image. The proposed strategy enables us to
synthesize semantically coherent realistic persons that can blend into an
existing scene without altering the global context. We conclude our findings
with relevant qualitative and quantitative evaluations.
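The second step above, selecting the closest precomputed fine semantic mask for a coarse inferred mask, can be sketched as a simple nearest-neighbour lookup. The abstract does not specify the similarity metric, so intersection-over-union on binary masks is assumed here purely for illustration; the function names are hypothetical.

```python
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union between two binary masks (assumed metric)."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union > 0 else 0.0

def select_closest_mask(coarse: np.ndarray, fine_masks: list) -> int:
    """Return the index of the precomputed fine mask most similar
    to the coarse mask produced in step one."""
    scores = [mask_iou(coarse, m) for m in fine_masks]
    return int(np.argmax(scores))
```

In the paper's pipeline the returned fine mask would then condition the multi-scale, attention-guided appearance-transfer stage; that stage is a trained network and is not sketched here.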
Related papers
- MagicFace: Training-free Universal-Style Human Image Customized Synthesis [13.944050414488911]
MagicFace is a training-free method for universal-style human image personalized synthesis.
It integrates reference concept features into their corresponding latent generated regions at the pixel level.
Experiments demonstrate the superiority of MagicFace in both human-centric subject-to-image synthesis and multi-concept human image customization.
arXiv Detail & Related papers (2024-08-14T10:08:46Z)
- StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images [5.529078451095096]
Understanding the semantics of visual scenes is a fundamental challenge in Computer Vision.
Recent advancements in text-to-image frameworks have led to models that implicitly capture natural scene statistics.
Our work presents StableSemantics, a dataset comprising 224 thousand human-curated prompts, processed natural language captions, over 2 million synthetic images, and 10 million attention maps corresponding to individual noun chunks.
arXiv Detail & Related papers (2024-06-19T17:59:40Z)
- Synthesizing Environment-Specific People in Photographs [57.962139271004325]
ESP is a novel method for context-aware full-body generation.
ESP is conditioned on a 2D pose and contextual cues that are extracted from the photograph of the scene.
We show that ESP outperforms the state-of-the-art on the task of contextual full-body generation.
arXiv Detail & Related papers (2023-12-22T10:15:15Z)
- Scene Aware Person Image Generation through Global Contextual Conditioning [24.317541784957285]
We propose a novel pipeline to generate and insert contextually relevant person images into an existing scene.
More specifically, we aim to insert a person such that the location, pose, and scale of the person being inserted blend in with the existing persons in the scene.
arXiv Detail & Related papers (2022-06-06T16:18:15Z)
- Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid [102.24539566851809]
Restoring reasonable and realistic content for arbitrary missing regions in images is an important yet challenging task.
Recent image inpainting models have made significant progress in generating vivid visual details, but they can still lead to texture blurring or structural distortions.
We propose the Semantic Pyramid Network (SPN) motivated by the idea that learning multi-scale semantic priors can greatly benefit the recovery of locally missing content in images.
arXiv Detail & Related papers (2021-12-08T04:33:33Z)
- Learned Spatial Representations for Few-shot Talking-Head Synthesis [68.3787368024951]
We propose a novel approach for few-shot talking-head synthesis.
We show that this disentangled representation leads to a significant improvement over previous methods.
arXiv Detail & Related papers (2021-04-29T17:59:42Z)
- Generating Person Images with Appearance-aware Pose Stylizer [66.44220388377596]
We present a novel end-to-end framework to generate realistic person images based on given person poses and appearances.
The core of our framework is a novel generator called Appearance-aware Pose Stylizer (APS) which generates human images by coupling the target pose with the conditioned person appearance progressively.
arXiv Detail & Related papers (2020-07-17T15:58:05Z)
- Wish You Were Here: Context-Aware Human Generation [100.51309746913512]
We present a novel method for inserting objects, specifically humans, into existing images.
Our method involves three networks: the first generates the semantic map of the new person, given the pose of the other persons in the scene.
The second network renders the pixels of the novel person and its blending mask, based on specifications in the form of multiple appearance components.
A third network refines the generated face in order to match those of the target person.
arXiv Detail & Related papers (2020-05-21T14:09:14Z)
- Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-Supervision [83.33283892171562]
Example-guided image synthesis has recently been attempted to synthesize an image from a semantic label map and an exemplary image.
In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically different from the given label map.
We propose an end-to-end network for joint global and local feature alignment and synthesis.
arXiv Detail & Related papers (2020-04-18T18:17:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.