SwipeGANSpace: Swipe-to-Compare Image Generation via Efficient Latent Space Exploration
- URL: http://arxiv.org/abs/2404.19693v1
- Date: Tue, 30 Apr 2024 16:37:27 GMT
- Title: SwipeGANSpace: Swipe-to-Compare Image Generation via Efficient Latent Space Exploration
- Authors: Yuto Nakashima, Mingzhe Yang, Yukino Baba
- Abstract summary: We propose a novel approach that uses simple user-swipe interactions to generate preferred images for users.
To effectively explore the latent space with only swipe interactions, we apply principal component analysis to the latent space of the StyleGAN.
We use a multi-armed bandit algorithm to decide the dimensions to explore, focusing on the preferences of the user.
- Score: 3.864321514889098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating preferred images using generative adversarial networks (GANs) is challenging owing to the high-dimensional nature of latent space. In this study, we propose a novel approach that uses simple user-swipe interactions to generate preferred images for users. To effectively explore the latent space with only swipe interactions, we apply principal component analysis to the latent space of the StyleGAN, creating meaningful subspaces. We use a multi-armed bandit algorithm to decide the dimensions to explore, focusing on the preferences of the user. Experiments show that our method is more efficient in generating preferred images than the baseline methods. Furthermore, changes in preferred images during image generation or the display of entirely different image styles were observed to provide new inspirations, subsequently altering user preferences. This highlights the dynamic nature of user preferences, which our proposed approach recognizes and enhances.
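The approach in the abstract can be sketched in a few lines: run PCA over sampled latent codes to obtain meaningful directions, then let a bandit decide which direction to explore based on binary swipe feedback. The sketch below is illustrative only, assuming a plain NumPy stand-in for StyleGAN's latent space, a UCB1 bandit, and a simulated user with a hidden preferred direction; none of the names come from the authors' code.

```python
# Minimal sketch of swipe-driven latent exploration: PCA over sampled
# latents (stand-in for StyleGAN's W space), then a UCB1 bandit that
# picks which principal direction to perturb from swipe feedback.
# All names and constants here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# 1) PCA on sampled latent codes via SVD of the centered sample matrix.
W = rng.normal(size=(1000, 512))            # 1000 sampled 512-d latents
W_centered = W - W.mean(axis=0)
_, _, Vt = np.linalg.svd(W_centered, full_matrices=False)
components = Vt[:10]                        # top-10 principal directions

# 2) UCB1 bandit over the 10 directions: each "arm" is one PCA dimension.
n_arms = len(components)
counts = np.zeros(n_arms)
rewards = np.zeros(n_arms)

def select_arm(t):
    # Play each arm once, then follow the UCB1 index.
    for a in range(n_arms):
        if counts[a] == 0:
            return a
    ucb = rewards / counts + np.sqrt(2 * np.log(t + 1) / counts)
    return int(np.argmax(ucb))

# 3) Swipe loop: perturb the current latent along the chosen direction;
#    a right-swipe (user prefers the new image) counts as reward 1.
z = W.mean(axis=0).copy()                   # start from the mean latent
target_dir = components[3]                  # hidden "true" preference (simulated)

for t in range(200):
    arm = select_arm(t)
    candidate = z + 0.5 * components[arm]
    # Simulated user: swipes right when the move aligns with the preference.
    swipe_right = float(components[arm] @ target_dir > 0.5)
    counts[arm] += 1
    rewards[arm] += swipe_right
    if swipe_right:
        z = candidate                       # keep the preferred image's latent

best_arm = int(np.argmax(rewards / np.maximum(counts, 1)))
print(best_arm)                             # the bandit concentrates on dimension 3
```

In a real system the perturbed latent would be decoded by the generator into two candidate images shown side by side, and the swipe replaces the simulated reward; the bandit's role is only to allocate the limited swipe budget to the directions the user responds to.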
Related papers
- Learning User Embeddings from Human Gaze for Personalised Saliency Prediction [12.361829928359136]
We present a novel method to extract user embeddings from pairs of natural images and corresponding saliency maps.
At the core of our method is a Siamese convolutional neural encoder that learns the user embeddings by contrasting the image and personal saliency map pairs of different users.
arXiv Detail & Related papers (2024-03-20T14:58:40Z)
- Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis [60.260724486834164]
This paper introduces innovative solutions to enhance spatial controllability in diffusion models reliant on text queries.
We present two key innovations: Vision Guidance and the Layered Rendering Diffusion framework.
We apply our method to three practical applications: bounding box-to-image, semantic mask-to-image and image editing.
arXiv Detail & Related papers (2023-11-30T10:36:19Z)
- Manipulating Embeddings of Stable Diffusion Prompts [22.10069408287608]
We propose and analyze a new method to manipulate the embedding of a prompt instead of the prompt text.
Users found our method less tedious than editing prompt text, and they often preferred the resulting images.
arXiv Detail & Related papers (2023-08-23T10:59:41Z)
- FaIRCoP: Facial Image Retrieval using Contrastive Personalization [43.293482565385055]
Retrieving facial images from attributes plays a vital role in various systems such as face recognition and suspect identification.
Existing methods do so by comparing specific characteristics from the user's mental image against the suggested images.
We propose a method that uses the user's feedback to label images as either similar or dissimilar to the target image.
arXiv Detail & Related papers (2022-05-28T09:52:09Z)
- Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
- StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery [71.1862388442953]
We develop a text-based interface for StyleGAN image manipulation.
We first introduce an optimization scheme that utilizes a CLIP-based loss to modify an input latent vector in response to a user-provided text prompt.
Next, we describe a latent mapper that infers a text-guided latent manipulation step for a given input image, allowing faster and more stable text-based manipulation.
arXiv Detail & Related papers (2021-03-31T17:51:25Z)
- StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation [45.20783737095007]
We explore and analyze the latent style space of StyleGAN2, a state-of-the-art architecture for image generation.
StyleSpace is significantly more disentangled than the other intermediate latent spaces explored by previous works.
Our findings pave the way to semantically meaningful and well-disentangled image manipulations via simple and intuitive interfaces.
arXiv Detail & Related papers (2020-11-25T15:00:33Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Style Intervention: How to Achieve Spatial Disentanglement with Style-based Generators? [100.60938767993088]
We propose a lightweight optimization-based algorithm that adapts to arbitrary input images and renders natural translation effects under flexible objectives.
We verify the performance of the proposed framework in facial attribute editing on high-resolution images, where both photo-realism and consistency are required.
arXiv Detail & Related papers (2020-11-19T07:37:31Z)
- Sequential Gallery for Interactive Visual Design Optimization [51.52002870143971]
We propose a novel user-in-the-loop optimization method that allows users to efficiently find an appropriate parameter set.
We also propose using a gallery-based interface that provides options in the two-dimensional subspace arranged in an adaptive grid view.
Our experiment with synthetic functions shows that our sequential plane search can find satisfactory solutions in fewer iterations than baselines.
arXiv Detail & Related papers (2020-05-08T15:24:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.