Album cover art image generation with Generative Adversarial Networks
- URL: http://arxiv.org/abs/2212.04844v1
- Date: Fri, 9 Dec 2022 13:27:46 GMT
- Title: Album cover art image generation with Generative Adversarial Networks
- Authors: Felipe Perez Stoppa, Ester Vidaña-Vila, Joan Navarro
- Abstract summary: This dissertation covers the basics of neural networks and works its way up to the particular aspects of GANs.
The intention is to see whether state-of-the-art GANs can generate album cover art and whether it is possible to tailor it by genre.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Adversarial Networks (GANs) were introduced by Goodfellow in 2014,
and have since become popular for constructing generative artificial intelligence models.
However, such networks have numerous drawbacks: long training times, sensitivity to
hyperparameter tuning, a profusion of loss and optimization functions to choose from,
and failure modes such as mode collapse. Current applications of GANs include generating
photo-realistic human faces, animals, and objects. However, I wanted to explore the
artistic ability of GANs in more detail by using existing models and learning from
them. This dissertation covers the basics of neural networks and works its way up to
the particular aspects of GANs, together with experimentation on and modification of
existing available models, from least to most complex. The intention is to see whether
state-of-the-art GANs (specifically StyleGAN2) can generate album cover art and whether
it is possible to tailor the output by genre. This was attempted by first becoming
familiar with three existing GAN architectures, including the state-of-the-art StyleGAN2.
The StyleGAN2 code was then used to train a model on a dataset of 80K album cover
images, and the trained model was used to produce new covers by selecting curated
images and mixing their styles.
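To make the GAN background above concrete, here is a minimal sketch of the adversarial training step whose instability underlies problems like mode collapse. This is an illustrative PyTorch example using the standard non-saturating loss from the 2014 paper, not code from the dissertation; the generator `G`, discriminator `D`, optimizers, and latent size are assumed placeholders.

```python
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_g, opt_d, real, z_dim=512):
    """One adversarial update: D learns to separate real from fake,
    then G learns to fool D (non-saturating GAN loss)."""
    batch = real.size(0)
    ones = torch.ones(batch, 1, device=real.device)
    zeros = torch.zeros(batch, 1, device=real.device)

    # Discriminator update: push D(real) -> 1 and D(G(z)) -> 0.
    z = torch.randn(batch, z_dim, device=real.device)
    fake = G(z).detach()  # block gradients from flowing into G
    loss_d = (F.binary_cross_entropy_with_logits(D(real), ones)
              + F.binary_cross_entropy_with_logits(D(fake), zeros))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator update: maximize log D(G(z)) rather than minimizing
    # log(1 - D(G(z))) -- the "non-saturating" trick.
    z = torch.randn(batch, z_dim, device=real.device)
    loss_g = F.binary_cross_entropy_with_logits(D(G(z)), ones)
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```

If the generator starts producing near-identical outputs regardless of z, this loop has hit mode collapse, one of the failure modes the abstract mentions.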
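The style-mixing experiment described at the end of the abstract can be sketched as follows, assuming the NVlabs stylegan2-ada-pytorch API. The checkpoint name `album-covers.pkl` is a hypothetical stand-in for the model trained on the 80K-cover dataset, and the crossover index is a free choice, not a value from the thesis.

```python
import pickle
import torch

# Loading the pickle requires the stylegan2-ada-pytorch repo on the path.
with open('album-covers.pkl', 'rb') as f:  # hypothetical checkpoint name
    G = pickle.load(f)['G_ema'].cuda()     # trained generator

z_a = torch.randn(1, G.z_dim).cuda()       # "source" cover latent
z_b = torch.randn(1, G.z_dim).cuda()       # "style donor" latent
w_a = G.mapping(z_a, None)                 # (1, num_ws, w_dim)
w_b = G.mapping(z_b, None)

# Keep the coarse layers (layout, composition) from A and take the
# fine layers (color, texture) from B.
crossover = 6
w_mix = w_a.clone()
w_mix[:, crossover:] = w_b[:, crossover:]

img = G.synthesis(w_mix, noise_mode='const')  # NCHW image in [-1, 1]
```

Moving the crossover earlier hands more of the composition to the donor B; moving it later keeps A's structure and borrows only B's palette and texture, which is how curated covers can be blended.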
Related papers
- Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation [6.479933058008389]
Style-Extracting Diffusion Models generate images with unseen characteristics beneficial for downstream tasks.
In this work, we show the capability of our method on a natural image dataset as a proof-of-concept.
We verify the added value of the generated images by showing improved segmentation results and lower performance variability between patients.
arXiv Detail & Related papers (2024-03-21T14:36:59Z)
- Diffusion idea exploration for art generation [0.10152838128195467]
Diffusion models have recently outperformed other generative models in image generation tasks, using cross-modal data as guiding information.
The initial experiments for this task of novel image generation demonstrated promising qualitative results.
arXiv Detail & Related papers (2023-07-11T02:35:26Z)
- 3DAvatarGAN: Bridging Domains for Personalized Editable Avatars [75.31960120109106]
3D-GANs synthesize geometry and texture by training on large-scale datasets with a consistent structure.
We propose an adaptation framework, where the source domain is a pre-trained 3D-GAN, while the target domain is a 2D-GAN trained on artistic datasets.
We show a deformation-based technique for modeling exaggerated geometry of artistic domains, enabling -- as a byproduct -- personalized geometric editing.
arXiv Detail & Related papers (2023-01-06T19:58:47Z)
- Implementing and Experimenting with Diffusion Models for Text-to-Image Generation [0.0]
Two models, DALL-E 2 and Imagen, have demonstrated that highly photorealistic images could be generated from a simple textual description of an image.
Text-to-image models require exceptionally large amounts of computational resources to train, as well as huge datasets collected from the internet.
This thesis contributes by reviewing the different approaches and techniques used by these models, and then by proposing our own implementation of a text-to-image model.
arXiv Detail & Related papers (2022-09-22T12:03:33Z)
- 3DMM-RF: Convolutional Radiance Fields for 3D Face Modeling [111.98096975078158]
We introduce a style-based generative network that synthesizes in one pass all and only the required rendering samples of a neural radiance field.
We show that this model can be accurately fit to "in-the-wild" facial images of arbitrary pose and illumination, extract the facial characteristics, and be used to re-render the face under controllable conditions.
arXiv Detail & Related papers (2022-09-15T15:28:45Z)
- A Survey on Leveraging Pre-trained Generative Adversarial Networks for Image Editing and Restoration [72.17890189820665]
Generative adversarial networks (GANs) have drawn enormous attention due to the simple yet effective training mechanism and superior image generation quality.
Recent GAN models have greatly narrowed the gap between generated images and real ones.
Many recent works show growing interest in taking advantage of pre-trained GAN models by exploiting the well-disentangled latent space and the learned GAN priors.
arXiv Detail & Related papers (2022-07-21T05:05:58Z)
- Weakly Supervised High-Fidelity Clothing Model Generation [67.32235668920192]
We propose a cheap yet scalable weakly-supervised method called Deep Generative Projection (DGP) to address this specific scenario.
We show that projecting the rough alignment of clothing and body onto the StyleGAN space can yield photo-realistic wearing results.
arXiv Detail & Related papers (2021-12-14T07:15:15Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- MobileStyleGAN: A Lightweight Convolutional Neural Network for High-Fidelity Image Synthesis [0.0]
We focus on the performance optimization of style-based generative models.
We introduce the MobileStyleGAN architecture, which has 3.5x fewer parameters and is 9.5x less computationally complex than StyleGAN2.
arXiv Detail & Related papers (2021-04-10T13:46:49Z)
- Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs [57.90008929377144]
We show that state-of-the-art GAN models can be used for a range of applications beyond unconditional image generation.
We achieve this with an iterative scheme that also provides control over the image generation process.
arXiv Detail & Related papers (2020-11-28T11:07:36Z)