Album cover art image generation with Generative Adversarial Networks
- URL: http://arxiv.org/abs/2212.04844v1
- Date: Fri, 9 Dec 2022 13:27:46 GMT
- Title: Album cover art image generation with Generative Adversarial Networks
- Authors: Felipe Perez Stoppa, Ester Vidaña-Vila, Joan Navarro
- Abstract summary: This dissertation covers the basics of neural networks and works its way up to the particular aspects of GANs.
The intention is to see whether state-of-the-art GANs can generate album cover art and whether it is possible to tailor it by genre.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Adversarial Networks (GANs) were introduced by Goodfellow in 2014,
and have since become popular for constructing generative artificial intelligence models.
However, such networks have numerous drawbacks: long training times, sensitivity to
hyperparameter tuning, a profusion of loss and optimization functions to choose from,
and failure modes such as mode collapse. Current applications of GANs include generating
photo-realistic human faces, animals, and objects. However, I wanted to explore the
artistic ability of GANs in more detail by using existing models and learning from
them. This dissertation covers the basics of neural networks and works its way up to
the particular aspects of GANs, together with experimentation on and modification of
existing available models, from least to most complex. The intention is to see whether
state-of-the-art GANs (specifically StyleGAN2) can generate album cover art and whether
it is possible to tailor the output by genre. This was attempted by first becoming
familiar with three existing GAN architectures, including the state-of-the-art StyleGAN2.
The StyleGAN2 code was then used to train a model on a dataset of 80K album cover
images, and the trained model was used to produce new covers by selecting curated
images and mixing their styles.
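To make the GAN background above concrete, here is a minimal sketch of the adversarial training step whose instability underlies problems like mode collapse. This is an illustrative PyTorch example using the standard non-saturating loss from the 2014 paper, not code from the dissertation; the generator `G`, discriminator `D`, optimizers, and latent size are assumed placeholders.

```python
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_g, opt_d, real, z_dim=512):
    """One adversarial update: D learns to separate real from fake,
    then G learns to fool D (non-saturating GAN loss)."""
    batch = real.size(0)
    ones = torch.ones(batch, 1, device=real.device)
    zeros = torch.zeros(batch, 1, device=real.device)

    # Discriminator update: push D(real) -> 1 and D(G(z)) -> 0.
    z = torch.randn(batch, z_dim, device=real.device)
    fake = G(z).detach()  # block gradients from flowing into G
    loss_d = (F.binary_cross_entropy_with_logits(D(real), ones)
              + F.binary_cross_entropy_with_logits(D(fake), zeros))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator update: maximize log D(G(z)) rather than minimizing
    # log(1 - D(G(z))) -- the "non-saturating" trick.
    z = torch.randn(batch, z_dim, device=real.device)
    loss_g = F.binary_cross_entropy_with_logits(D(G(z)), ones)
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```

If the generator starts producing near-identical outputs regardless of z, this loop has hit mode collapse, one of the failure modes the abstract mentions.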
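The style-mixing experiment described at the end of the abstract can be sketched as follows, assuming the NVlabs stylegan2-ada-pytorch API. The checkpoint name `album-covers.pkl` is a hypothetical stand-in for the model trained on the 80K-cover dataset, and the crossover index is a free choice, not a value from the thesis.

```python
import pickle
import torch

# Loading the pickle requires the stylegan2-ada-pytorch repo on the path.
with open('album-covers.pkl', 'rb') as f:  # hypothetical checkpoint name
    G = pickle.load(f)['G_ema'].cuda()     # trained generator

z_a = torch.randn(1, G.z_dim).cuda()       # "source" cover latent
z_b = torch.randn(1, G.z_dim).cuda()       # "style donor" latent
w_a = G.mapping(z_a, None)                 # (1, num_ws, w_dim)
w_b = G.mapping(z_b, None)

# Keep the coarse layers (layout, composition) from A and take the
# fine layers (color, texture) from B.
crossover = 6
w_mix = w_a.clone()
w_mix[:, crossover:] = w_b[:, crossover:]

img = G.synthesis(w_mix, noise_mode='const')  # NCHW image in [-1, 1]
```

Moving the crossover earlier hands more of the composition to the donor B; moving it later keeps A's structure and borrows only B's palette and texture, which is how curated covers can be blended.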
Related papers
- Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation [6.479933058008389]
Style-Extracting Diffusion Models generate images with unseen characteristics beneficial for downstream tasks.
In this work, we show the capability of our method on a natural image dataset as a proof-of-concept.
We verify the added value of the generated images by showing improved segmentation results and lower performance variability between patients.
arXiv Detail & Related papers (2024-03-21T14:36:59Z)
- Diffusion idea exploration for art generation [0.10152838128195467]
Diffusion models have recently outperformed other generative models in image generation tasks, using cross-modal data as guiding information.
The initial experiments for this task of novel image generation demonstrated promising qualitative results.
arXiv Detail & Related papers (2023-07-11T02:35:26Z)
- 3DAvatarGAN: Bridging Domains for Personalized Editable Avatars [75.31960120109106]
3D-GANs synthesize geometry and texture by training on large-scale datasets with a consistent structure.
We propose an adaptation framework, where the source domain is a pre-trained 3D-GAN, while the target domain is a 2D-GAN trained on artistic datasets.
We show a deformation-based technique for modeling exaggerated geometry of artistic domains, enabling -- as a byproduct -- personalized geometric editing.
arXiv Detail & Related papers (2023-01-06T19:58:47Z)
- Implementing and Experimenting with Diffusion Models for Text-to-Image Generation [0.0]
Two models, DALL-E 2 and Imagen, have demonstrated that highly photorealistic images could be generated from a simple textual description of an image.
Text-to-image models require exceptionally large amounts of computational resources to train, as well as huge datasets collected from the internet.
This thesis contributes by reviewing the different approaches and techniques used by these models, and then by proposing our own implementation of a text-to-image model.
arXiv Detail & Related papers (2022-09-22T12:03:33Z)
- 3DMM-RF: Convolutional Radiance Fields for 3D Face Modeling [111.98096975078158]
We introduce a style-based generative network that synthesizes in one pass all and only the required rendering samples of a neural radiance field.
We show that this model can be accurately fit to "in-the-wild" facial images of arbitrary pose and illumination, extract the facial characteristics, and be used to re-render the face under controllable conditions.
arXiv Detail & Related papers (2022-09-15T15:28:45Z)
- A Survey on Leveraging Pre-trained Generative Adversarial Networks for Image Editing and Restoration [72.17890189820665]
Generative adversarial networks (GANs) have drawn enormous attention due to the simple yet effective training mechanism and superior image generation quality.
Recent GAN models have greatly narrowed the gap between generated images and real ones.
Many recent works show growing interest in taking advantage of pre-trained GAN models by exploiting the well-disentangled latent space and the learned GAN priors.
arXiv Detail & Related papers (2022-07-21T05:05:58Z)
- Weakly Supervised High-Fidelity Clothing Model Generation [67.32235668920192]
We propose a cheap yet scalable weakly-supervised method called Deep Generative Projection (DGP) to address this specific scenario.
We show that projecting the rough alignment of clothing and body onto the StyleGAN space can yield photo-realistic wearing results.
arXiv Detail & Related papers (2021-12-14T07:15:15Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- MobileStyleGAN: A Lightweight Convolutional Neural Network for High-Fidelity Image Synthesis [0.0]
We focus on the performance optimization of style-based generative models.
We introduce the MobileStyleGAN architecture, which has 3.5x fewer parameters and is 9.5x less computationally complex than StyleGAN2.
arXiv Detail & Related papers (2021-04-10T13:46:49Z)
- Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs [57.90008929377144]
We show that state-of-the-art GAN models can be used for a range of applications beyond unconditional image generation.
We achieve this with an iterative scheme that also provides control over the image generation process.
arXiv Detail & Related papers (2020-11-28T11:07:36Z)