Synthesizing Anyone, Anywhere, in Any Pose
- URL: http://arxiv.org/abs/2304.03164v2
- Date: Sun, 5 Nov 2023 12:09:54 GMT
- Title: Synthesizing Anyone, Anywhere, in Any Pose
- Authors: Håkon Hukkelås, Frank Lindseth
- Abstract summary: TriA-GAN is a keypoint-guided GAN that can synthesize Anyone, Anywhere, in Any given pose.
We show that TriA-GAN significantly improves over previous in-the-wild full-body synthesis methods.
We also show that the latent space of TriA-GAN is compatible with standard unconditional editing techniques.
- Score: 0.7252027234425334
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We address the task of in-the-wild human figure synthesis, where the primary
goal is to synthesize a full body given any region in any image. In-the-wild
human figure synthesis has long been a challenging and under-explored task,
where current methods struggle to handle extreme poses, occluding objects, and
complex backgrounds.
Our main contribution is TriA-GAN, a keypoint-guided GAN that can synthesize
Anyone, Anywhere, in Any given pose. Key to our method is projected GANs
combined with a well-crafted training strategy, where our simple generator
architecture can successfully handle the challenges of in-the-wild full-body
synthesis. We show that TriA-GAN significantly improves over previous
in-the-wild full-body synthesis methods, all while requiring less conditional
information for synthesis (keypoints vs. DensePose). Finally, we show that the
latent space of TriA-GAN is compatible with standard unconditional editing
techniques, enabling text-guided editing of generated human figures.
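As a rough, hypothetical illustration of keypoint-guided conditioning: a common scheme rasterizes body keypoints into per-joint Gaussian heatmaps and stacks them with the masked input image, giving an inpainting-style generator its conditional input. The sketch below follows that convention; the heatmap encoding, the sigma value, and the tensor layout are assumptions, not TriA-GAN's documented interface.

```python
import numpy as np

def keypoint_heatmaps(keypoints, H, W, sigma=4.0):
    """Rasterize (K, 2) pixel-space keypoints into K Gaussian heatmaps."""
    ys, xs = np.mgrid[0:H, 0:W]
    maps = []
    for (x, y) in keypoints:
        d2 = (xs - x) ** 2 + (ys - y) ** 2
        maps.append(np.exp(-d2 / (2.0 * sigma ** 2)))
    return np.stack(maps, axis=0)  # (K, H, W)

def conditional_input(image, mask, keypoints):
    """Stack masked RGB, the mask, and keypoint heatmaps into one
    (3 + 1 + K, H, W) conditioning tensor for an inpainting-style generator.
    image: (3, H, W); mask: (H, W), 1 marks the region to synthesize."""
    H, W = mask.shape
    masked = image * (1.0 - mask)[None]   # zero out the region to synthesize
    heat = keypoint_heatmaps(keypoints, H, W)
    return np.concatenate([masked, mask[None], heat], axis=0)
```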
Related papers
- Abstract Art Interpretation Using ControlNet [0.0]
We empower users with finer control over the synthesis process, enabling enhanced manipulation of synthesized imagery.
Inspired by the minimalist forms found in abstract artworks, we introduce a novel condition crafted from geometric primitives such as triangles.
arXiv Detail & Related papers (2024-08-23T06:25:54Z)
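As a toy illustration of a condition built from geometric primitives such as triangles, the sketch below rasterizes filled triangles into a grayscale map of the kind a ControlNet branch could consume. The Pillow backend and the single-channel format are assumptions, not the paper's pipeline.

```python
from PIL import Image, ImageDraw

def triangle_condition(triangles, size=(512, 512)):
    """Rasterize a list of triangles, each three (x, y) vertices,
    into a grayscale condition image for a ControlNet-style branch."""
    canvas = Image.new("L", size, 0)
    draw = ImageDraw.Draw(canvas)
    for tri in triangles:
        draw.polygon([tuple(p) for p in tri], fill=255)
    return canvas

# Example: an abstract composition of two overlapping triangles.
cond = triangle_condition([[(50, 400), (250, 60), (450, 400)],
                           [(150, 480), (300, 250), (470, 460)]])
```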
- Survey on Controllable Image Synthesis with Deep Learning [15.29961293132048]
We present a survey of recent works on 3D controllable image synthesis using deep learning.
We first introduce the datasets and evaluation indicators for 3D controllable image synthesis.
Photometrically controllable image synthesis approaches are also reviewed for 3D re-lighting research.
arXiv Detail & Related papers (2023-07-18T07:02:51Z)
- Novel View Synthesis of Humans using Differentiable Rendering [50.57718384229912]
We present a new approach for synthesizing novel views of people in new poses.
Our synthesis makes use of diffuse Gaussian primitives that represent the underlying skeletal structure of a human.
Rendering these primitives yields a high-dimensional latent image, which is then transformed into an RGB image by a decoder network.
arXiv Detail & Related papers (2023-03-28T10:48:33Z)
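A minimal sketch of the Gaussian-primitive rendering step described in the entry above: each projected skeleton joint contributes an isotropic 2D Gaussian carrying a learned feature vector, and their accumulation forms the high-dimensional latent image that a decoder network would map to RGB. The isotropy, the additive accumulation, and the shapes are assumptions.

```python
import numpy as np

def render_gaussian_primitives(centers, features, sigma, H, W):
    """Splat N isotropic 2D Gaussians into a (C, H, W) latent image.
    centers: (N, 2) projected joint positions in pixels.
    features: (N, C) per-primitive latent feature vectors."""
    ys, xs = np.mgrid[0:H, 0:W]
    latent = np.zeros((features.shape[1], H, W))
    for (x, y), f in zip(centers, features):
        w = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        latent += f[:, None, None] * w[None]   # additive splatting
    return latent
```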
- Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes [68.14127205949073]
We propose a novel Global-Local training framework for synthesizing a 3D scene using object proxies.
We show that using proxies allows a wide variety of editing options, such as adjusting the placement of each independent object.
Our results show that Set-the-Scene offers a powerful solution for scene synthesis and manipulation.
arXiv Detail & Related papers (2023-03-23T17:17:29Z)
- InsetGAN for Full-Body Image Generation [90.71033704904629]
We propose a novel method to combine multiple pretrained GANs.
One GAN generates a global canvas (e.g., a human body), while a set of specialized GANs, or insets, focuses on different parts.
We demonstrate the setup by combining a full body GAN with a dedicated high-quality face GAN to produce plausible-looking humans.
arXiv Detail & Related papers (2022-03-14T17:01:46Z)
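A schematic of the InsetGAN-style composition described above, assuming StyleGAN-like generators: one latent drives a fixed body canvas while the face latent is optimized so the inset agrees with the canvas at the crop. The generator interfaces, the plain L1 objective, and optimizing only one latent are simplifying assumptions; the actual method refines both latents with several feature-space losses.

```python
import torch

def compose_inset(body_G, face_G, w_body, w_face, face_box, steps=200, lr=0.01):
    """Optimize the face latent so the face GAN's output blends into the
    body GAN's canvas at face_box = (top, left, size). body_G and face_G are
    hypothetical stand-ins for pretrained StyleGAN-like generators."""
    w_face = w_face.clone().requires_grad_(True)
    opt = torch.optim.Adam([w_face], lr=lr)
    t, l, s = face_box
    canvas = body_G(w_body).detach()          # fixed global canvas, (1, 3, H, W)
    crop = canvas[:, :, t:t + s, l:l + s]
    for _ in range(steps):
        face = face_G(w_face)                 # (1, 3, s, s) inset
        loss = torch.nn.functional.l1_loss(face, crop)  # appearance match
        opt.zero_grad()
        loss.backward()
        opt.step()
    out = canvas.clone()
    out[:, :, t:t + s, l:l + s] = face_G(w_face).detach()
    return out
```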
- Human View Synthesis using a Single Sparse RGB-D Input [16.764379184593256]
We present a novel view synthesis framework that generates realistic renders from unseen views of any human captured by a single-view sensor providing sparse RGB-D input.
An enhancer network improves the overall fidelity, even in areas occluded in the original view, producing crisp renders with fine details.
arXiv Detail & Related papers (2021-12-27T20:13:53Z)
- PISE: Person Image Synthesis and Editing with Decoupled GAN [64.70360318367943]
We propose PISE, a novel two-stage generative model for Person Image Synthesis and Editing.
For human pose transfer, we first synthesize a human parsing map aligned with the target pose to represent the shape of clothing.
To decouple the shape and style of clothing, we propose joint global and local per-region encoding and normalization.
arXiv Detail & Related papers (2021-03-06T04:32:06Z)
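A minimal sketch of per-region style injection in the spirit of PISE's joint global and local per-region encoding and normalization: features are normalized, then modulated by a separate (gamma, beta) pair inside each clothing-parsing region. The shapes and the instance-norm choice are assumptions.

```python
import torch

def per_region_modulation(feat, parsing, gammas, betas, eps=1e-5):
    """feat: (B, C, H, W) decoder features; parsing: (B, R, H, W) one-hot
    region masks; gammas, betas: (B, R, C) per-region style parameters."""
    mu = feat.mean(dim=(2, 3), keepdim=True)
    var = feat.var(dim=(2, 3), keepdim=True)
    norm = (feat - mu) / torch.sqrt(var + eps)   # instance normalization
    out = torch.zeros_like(feat)
    for r in range(parsing.shape[1]):
        mask = parsing[:, r:r + 1]                       # (B, 1, H, W)
        g = gammas[:, r].unsqueeze(-1).unsqueeze(-1)     # (B, C, 1, 1)
        b = betas[:, r].unsqueeze(-1).unsqueeze(-1)
        out = out + mask * (g * norm + b)                # modulate inside region
    return out
```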
- Semantic View Synthesis [56.47999473206778]
We tackle a new problem of semantic view synthesis: generating a free-viewpoint rendering of a synthesized scene using a semantic label map as input.
First, we focus on synthesizing the color and depth of the visible surface of the 3D scene.
We then use the synthesized color and depth to impose explicit constraints on the multiple-plane image (MPI) representation prediction process.
arXiv Detail & Related papers (2020-08-24T17:59:46Z)
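For reference, a multiplane image (MPI) is rendered by compositing its fronto-parallel RGBA planes back to front with the standard over operator. A minimal NumPy sketch, with the layer layout assumed:

```python
import numpy as np

def composite_mpi(colors, alphas):
    """Back-to-front over-compositing of MPI layers.
    colors: (D, H, W, 3) per-plane RGB, farthest plane first.
    alphas: (D, H, W, 1) per-plane opacity in [0, 1]."""
    out = np.zeros_like(colors[0])
    for rgb, a in zip(colors, alphas):
        out = a * rgb + (1.0 - a) * out   # the "over" operator
    return out
```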
- Hierarchy Composition GAN for High-fidelity Image Synthesis [57.32311953820988]
This paper presents an innovative Hierarchical Composition GAN (HIC-GAN).
HIC-GAN incorporates image synthesis in geometry and appearance domains into an end-to-end trainable network.
Experiments on scene text image synthesis, portrait editing and indoor rendering tasks show that the proposed HIC-GAN achieves superior synthesis performance qualitatively and quantitatively.
arXiv Detail & Related papers (2019-05-12T11:11:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.