Synthesizing Anyone, Anywhere, in Any Pose
- URL: http://arxiv.org/abs/2304.03164v2
- Date: Sun, 5 Nov 2023 12:09:54 GMT
- Title: Synthesizing Anyone, Anywhere, in Any Pose
- Authors: Håkon Hukkelås, Frank Lindseth
- Abstract summary: TriA-GAN is a keypoint-guided GAN that can synthesize Anyone, Anywhere, in Any given pose.
We show that TriA-GAN significantly improves over previous in-the-wild full-body synthesis methods.
We also show that the latent space of TriA-GAN is compatible with standard unconditional editing techniques.
- Score: 0.7252027234425334
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We address the task of in-the-wild human figure synthesis, where the primary
goal is to synthesize a full body given any region in any image. In-the-wild
human figure synthesis has long been a challenging and under-explored task,
where current methods struggle to handle extreme poses, occluding objects, and
complex backgrounds.
Our main contribution is TriA-GAN, a keypoint-guided GAN that can synthesize
Anyone, Anywhere, in Any given pose. Key to our method is projected GANs
combined with a well-crafted training strategy, where our simple generator
architecture can successfully handle the challenges of in-the-wild full-body
synthesis. We show that TriA-GAN significantly improves over previous
in-the-wild full-body synthesis methods, all while requiring less conditional
information for synthesis (keypoints vs. DensePose). Finally, we show that the
latent space of TriA-GAN is compatible with standard unconditional editing
techniques, enabling text-guided editing of generated human figures.
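The abstract names two key ingredients: keypoint conditioning and a projected-GAN discriminator. As a rough, non-authoritative sketch of the first ingredient only (keypoint-heatmap conditioning), here is a toy PyTorch generator; `NUM_KEYPOINTS`, `KeypointGenerator`, and all shapes are invented for illustration and do not reproduce TriA-GAN's actual architecture:

```python
# Toy sketch of keypoint-heatmap conditioning for a GAN generator.
# Assumption: COCO-style body keypoints; none of these names or shapes
# come from the TriA-GAN paper.
import torch
import torch.nn as nn

NUM_KEYPOINTS = 17  # assumption: COCO-style body keypoints


def keypoints_to_heatmaps(keypoints, size=64, sigma=2.0):
    """Rasterize (N, K, 2) keypoints in [0, 1] into (N, K, size, size) Gaussian heatmaps."""
    n, k, _ = keypoints.shape
    ys = torch.arange(size, dtype=torch.float32).view(1, 1, size, 1)
    xs = torch.arange(size, dtype=torch.float32).view(1, 1, 1, size)
    cx = keypoints[..., 0].view(n, k, 1, 1) * (size - 1)
    cy = keypoints[..., 1].view(n, k, 1, 1) * (size - 1)
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return torch.exp(-d2 / (2 * sigma**2))


class KeypointGenerator(nn.Module):
    """Toy generator: broadcasts the latent code spatially and concatenates keypoint heatmaps."""

    def __init__(self, z_dim=128, size=64):
        super().__init__()
        self.size = size
        self.net = nn.Sequential(
            nn.Conv2d(z_dim + NUM_KEYPOINTS, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, z, heatmaps):
        z_map = z.view(z.size(0), -1, 1, 1).expand(-1, -1, self.size, self.size)
        return self.net(torch.cat([z_map, heatmaps], dim=1))


z = torch.randn(1, 128)
keypoints = torch.rand(1, NUM_KEYPOINTS, 2)  # normalized (x, y) positions
image = KeypointGenerator()(z, keypoints_to_heatmaps(keypoints))
print(image.shape)  # torch.Size([1, 3, 64, 64])
```

The second ingredient, the projected-GAN discriminator (which judges images in the feature space of a frozen pretrained network rather than in pixel space), is not shown here.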
Related papers
- GAS: Generative Avatar Synthesis from a Single Image [54.95198111659466]
We introduce a generalizable and unified framework to synthesize view-consistent and temporally coherent avatars from a single image.
Our approach bridges this gap by combining the reconstruction power of regression-based 3D human reconstruction with the generative capabilities of a diffusion model.
arXiv Detail & Related papers (2025-02-10T19:00:39Z)
- Towards Affordance-Aware Articulation Synthesis for Rigged Objects [82.08199697616917]
A3Syn synthesizes articulation parameters for arbitrary and open-domain rigged objects obtained from the Internet.
A3Syn has stable convergence, completes in minutes, and synthesizes plausible affordance on different combinations of in-the-wild object rigs and scenes.
arXiv Detail & Related papers (2025-01-21T18:59:59Z)
- CFSynthesis: Controllable and Free-view 3D Human Video Synthesis [57.561237409603066]
CFSynthesis is a novel framework for generating high-quality human videos with customizable attributes.
Our method leverages a texture-SMPL-based representation to ensure consistent and stable character appearances across free viewpoints.
Results on multiple datasets show that CFSynthesis achieves state-of-the-art performance in complex human animations.
arXiv Detail & Related papers (2024-12-15T05:57:36Z)
- Survey on Controllable Image Synthesis with Deep Learning [15.29961293132048]
We present a survey of some recent works on 3D controllable image synthesis using deep learning.
We first introduce the datasets and evaluation indicators for 3D controllable image synthesis.
Photometrically controllable image synthesis approaches are also reviewed for 3D relighting research.
arXiv Detail & Related papers (2023-07-18T07:02:51Z)
- Novel View Synthesis of Humans using Differentiable Rendering [50.57718384229912]
We present a new approach for synthesizing novel views of people in new poses.
Our synthesis makes use of diffuse Gaussian primitives that represent the underlying skeletal structure of a human.
Rendering these primitives yields a high-dimensional latent image, which is then transformed into an RGB image by a decoder network (see the sketch after this entry).
arXiv Detail & Related papers (2023-03-28T10:48:33Z)
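A hedged sketch of the pipeline summarized in the entry above: Gaussian primitives splatted into a latent feature image, then decoded to RGB. All shapes, parameterizations, and the decoder below are invented for illustration and do not reproduce the paper's formulation:

```python
# Sketch only: splat per-primitive feature vectors as 2D Gaussians into a
# latent feature image, then decode it to RGB with a small CNN.
import torch
import torch.nn as nn


def splat_gaussians(centers, features, size=64, sigma=3.0):
    """Accumulate N Gaussian primitives with per-primitive feature vectors
    into a (C, size, size) latent feature image."""
    n, c = features.shape
    ys = torch.arange(size, dtype=torch.float32).view(size, 1)
    xs = torch.arange(size, dtype=torch.float32).view(1, size)
    out = torch.zeros(c, size, size)
    for i in range(n):
        cx, cy = centers[i] * (size - 1)
        weight = torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma**2))
        out += features[i].view(c, 1, 1) * weight
    return out


decoder = nn.Sequential(  # latent feature image -> RGB
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
)

centers = torch.rand(24, 2)     # e.g., projected 2D joint positions in [0, 1]
features = torch.randn(24, 16)  # learned per-primitive latent features
latent_image = splat_gaussians(centers, features)
rgb = decoder(latent_image.unsqueeze(0))
print(rgb.shape)  # torch.Size([1, 3, 64, 64])
```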
- InsetGAN for Full-Body Image Generation [90.71033704904629]
We propose a novel method to combine multiple pretrained GANs.
One GAN generates a global canvas (e.g., a human body), and a set of specialized GANs, or insets, focuses on different parts.
We demonstrate the setup by combining a full-body GAN with a dedicated high-quality face GAN to produce plausible-looking humans (see the compositing sketch after this entry).
arXiv Detail & Related papers (2022-03-14T17:01:46Z)
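A toy illustration of the canvas-plus-inset idea from the InsetGAN entry above. The actual method jointly optimizes the latent codes of both pretrained GANs for a seamless result; this sketch, with made-up shapes and helpers, shows only a naive compositing step:

```python
# Sketch only: blend a specialized "inset" (e.g., a face sample) into a
# global "canvas" (e.g., a full-body sample) at a bounding box, feathering
# the edges to hide the seam. Not the InsetGAN optimization pipeline.
import torch
import torch.nn.functional as F


def soft_mask(h, w, feather=4):
    """Soft rectangular mask that ramps up over `feather` pixels at the border."""
    ramp_h = torch.minimum(torch.arange(h) + 1, torch.arange(h - 1, -1, -1) + 1)
    ramp_w = torch.minimum(torch.arange(w) + 1, torch.arange(w - 1, -1, -1) + 1)
    mh = (ramp_h.float() / feather).clamp(max=1.0)
    mw = (ramp_w.float() / feather).clamp(max=1.0)
    return (mh.view(h, 1) * mw.view(1, w)).unsqueeze(0)  # (1, h, w)


def composite_inset(canvas, inset, box, feather=4):
    """Blend `inset` (3, hi, wi) into `canvas` (3, H, W) at box = (top, left, h, w)."""
    t, l, h, w = box
    patch = F.interpolate(inset.unsqueeze(0), size=(h, w),
                          mode="bilinear", align_corners=False)[0]
    m = soft_mask(h, w, feather)
    out = canvas.clone()
    out[:, t:t + h, l:l + w] = m * patch + (1 - m) * canvas[:, t:t + h, l:l + w]
    return out


body = torch.rand(3, 256, 128)  # stand-in for a full-body GAN sample
face = torch.rand(3, 64, 64)    # stand-in for a face GAN sample
result = composite_inset(body, face, box=(20, 40, 48, 48))
print(result.shape)  # torch.Size([3, 256, 128])
```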
- Human View Synthesis using a Single Sparse RGB-D Input [16.764379184593256]
We present a novel view synthesis framework to generate realistic renders from unseen views of any human captured from a single-view sensor with sparse RGB-D.
An enhancer network improves the overall fidelity, even in areas occluded from the original view, producing crisp renders with fine details.
arXiv Detail & Related papers (2021-12-27T20:13:53Z)
- Hierarchy Composition GAN for High-fidelity Image Synthesis [57.32311953820988]
This paper presents an innovative Hierarchical Composition GAN (HIC-GAN).
HIC-GAN incorporates image synthesis in geometry and appearance domains into an end-to-end trainable network.
Experiments on scene text image synthesis, portrait editing and indoor rendering tasks show that the proposed HIC-GAN achieves superior synthesis performance qualitatively and quantitatively.
arXiv Detail & Related papers (2019-05-12T11:11:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all listed content) and is not responsible for any consequences arising from its use.