LatentKeypointGAN: Controlling Images via Latent Keypoints -- Extended Abstract
- URL: http://arxiv.org/abs/2205.03448v1
- Date: Fri, 6 May 2022 19:00:07 GMT
- Title: LatentKeypointGAN: Controlling Images via Latent Keypoints -- Extended Abstract
- Authors: Xingzhe He, Bastian Wandt, Helge Rhodin
- Abstract summary: We introduce LatentKeypointGAN, a two-stage GAN conditioned on a set of keypoints and associated appearance embeddings.
LatentKeypointGAN provides an interpretable latent space that can be used to re-arrange the generated images.
- Score: 16.5436159805682
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative adversarial networks (GANs) can now generate photo-realistic
images. However, how to best control the image content remains an open
challenge. We introduce LatentKeypointGAN, a two-stage GAN internally
conditioned on a set of keypoints and associated appearance embeddings
providing control of the position and style of the generated objects and their
respective parts. A major difficulty that we address is disentangling the image
into spatial and appearance factors with little domain knowledge and
supervision signals. We demonstrate in a user study and quantitative
experiments that LatentKeypointGAN provides an interpretable latent space that
can be used to re-arrange the generated images by re-positioning and exchanging
keypoint embeddings, such as generating portraits by combining the eyes and
mouth from different images. Notably, our method does not require labels as it
is self-supervised and thereby applies to diverse application domains, such as
editing portraits, indoor rooms, and full-body human poses.
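As a rough illustration of how keypoints can condition a generator spatially (the function name, resolution, and sigma below are illustrative assumptions, not the paper's exact architecture), each latent keypoint can be rendered as a Gaussian heatmap that the generator consumes alongside its appearance embedding:

```python
import numpy as np

def keypoints_to_heatmaps(keypoints, size=64, sigma=3.0):
    """Render K latent keypoints (x, y in [0, 1]) as K Gaussian heatmaps.

    A keypoint-conditioned generator can combine these spatial maps with
    per-keypoint appearance embeddings, so moving a keypoint re-positions
    the corresponding part while its style embedding stays fixed.
    """
    ys, xs = np.mgrid[0:size, 0:size].astype(np.float64)
    maps = []
    for kx, ky in keypoints:
        cx, cy = kx * (size - 1), ky * (size - 1)
        d2 = (xs - cx) ** 2 + (ys - cy) ** 2
        maps.append(np.exp(-d2 / (2.0 * sigma**2)))
    return np.stack(maps)  # shape (K, size, size)

# Two keypoints, e.g. left and right eye; each map peaks at its keypoint.
hm = keypoints_to_heatmaps([(0.3, 0.5), (0.7, 0.5)])
```

Re-arranging the image then amounts to editing the `(x, y)` inputs or swapping the embeddings paired with each heatmap, which is what makes the latent space interpretable.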
Related papers
- Design and Identification of Keypoint Patches in Unstructured Environments [7.940068522906917]
Keypoint identification in an image allows direct mapping from raw images to 2D coordinates.
We propose four simple yet distinct designs that consider various scales, rotations, and camera projections.
We customize the SuperPoint network to ensure robust detection under various types of image degradation.
arXiv Detail & Related papers (2024-10-01T09:05:50Z)
- Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold [79.94300820221996]
DragGAN is a new way of controlling generative adversarial networks (GANs).
DragGAN allows anyone to deform an image with precise control over where pixels go, thus manipulating the pose, shape, expression, and layout of diverse categories such as animals, cars, humans, landscapes, etc.
Both qualitative and quantitative comparisons demonstrate the advantage of DragGAN over prior approaches in the tasks of image manipulation and point tracking.
arXiv Detail & Related papers (2023-05-18T13:41:25Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we encode randomly sampled Gaussian heatmaps into the intermediate layers of the generative model as a spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
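One plausible way to realize this heatmap-as-inductive-bias idea (the modulation rule and shapes here are illustrative assumptions, not the paper's exact design) is to modulate an intermediate feature map elementwise with a user-editable heatmap:

```python
import numpy as np

def inject_heatmap(features, heatmap):
    """Spatially modulate generator features with an editable heatmap.

    features: (C, H, W) intermediate activations of the generator
    heatmap:  (H, W) values in [0, 1]; moving its mass during inference
              moves or removes the corresponding content in the output
    """
    return features * (1.0 + heatmap[None, :, :])

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
hm = np.zeros((16, 16))
hm[4, 4] = 1.0  # user places scene content near (4, 4)
out = inject_heatmap(feat, hm)
```

Because the heatmap enters as a spatial signal rather than a latent direction, edits like dragging or deleting an object map directly to edits of `hm`.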
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- Weakly Supervised Keypoint Discovery [27.750244813890262]
We propose a method for keypoint discovery from a 2D image using image-level supervision.
Motivated by the weakly-supervised learning approach, our method exploits image-level supervision to identify discriminative parts.
Our approach achieves state-of-the-art performance for the task of keypoint estimation in limited-supervision scenarios.
arXiv Detail & Related papers (2021-09-28T01:26:53Z)
- Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN [88.62422914645066]
We present an algorithm for re-rendering a person from a single image under arbitrary poses.
Existing methods often have difficulties in hallucinating occluded contents photo-realistically while preserving the identity and fine details in the source image.
We show that our method compares favorably against the state-of-the-art algorithms in both quantitative evaluation and visual comparison.
arXiv Detail & Related papers (2021-09-13T17:59:33Z)
- Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
- LatentKeypointGAN: Controlling Images via Latent Keypoints [23.670795505376336]
We introduce LatentKeypointGAN, a two-stage GAN trained end-to-end on the classical GAN objective.
LatentKeypointGAN provides an interpretable latent space that can be used to re-arrange the generated images.
In addition, the explicit generation of keypoints and matching images enables a new, GAN-based method for unsupervised keypoint detection.
arXiv Detail & Related papers (2021-03-29T17:59:10Z)
- Semi-supervised Keypoint Localization [12.37129078618206]
We propose to simultaneously learn keypoint heatmaps and pose-invariant keypoint representations in a semi-supervised manner.
Our approach significantly outperforms previous methods on several benchmarks for human and animal body landmark localization.
arXiv Detail & Related papers (2021-01-20T06:23:08Z)
- PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity preservation energy term allows spatially coherent edits while maintaining facial integrity.
arXiv Detail & Related papers (2020-09-20T17:53:51Z)
- InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models.
We first find that GANs learn various semantics in some linear subspaces of the latent space.
We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection.
arXiv Detail & Related papers (2020-05-18T18:01:22Z)
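The linear-subspace view in the InterFaceGAN summary above can be sketched in a few lines. The direction vectors here are random stand-ins for semantic directions such a method would discover, and the projection step illustrates the kind of subspace projection the summary mentions for disentangling two correlated semantics:

```python
import numpy as np

def edit(z, direction, alpha):
    """Move a latent code along a unit-norm semantic direction."""
    return z + alpha * direction

def disentangle(n1, n2):
    """Project n1 onto the subspace orthogonal to unit vector n2, so
    editing along the result changes semantic 1 without moving along n2."""
    n1p = n1 - (n1 @ n2) * n2
    return n1p / np.linalg.norm(n1p)

rng = np.random.default_rng(1)
n_smile = rng.standard_normal(512)  # hypothetical "smile" direction
n_smile /= np.linalg.norm(n_smile)
n_age = rng.standard_normal(512)    # hypothetical "age" direction
n_age /= np.linalg.norm(n_age)
n_smile_only = disentangle(n_smile, n_age)
```

After projection, stepping a latent code along `n_smile_only` leaves its coordinate along `n_age` unchanged, which is the sense in which the two semantics are disentangled.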
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.