Investigating GANsformer: A Replication Study of a State-of-the-Art
Image Generation Model
- URL: http://arxiv.org/abs/2303.08577v1
- Date: Wed, 15 Mar 2023 12:51:16 GMT
- Title: Investigating GANsformer: A Replication Study of a State-of-the-Art
Image Generation Model
- Authors: Giorgia Adorni, Felix Boelter, Stefano Carlo Lambertenghi
- Abstract summary: We reproduce and evaluate a novel variation of the original GAN network, the GANformer.
Due to resource and time limitations, we had to constrain the network's training times, dataset types, and sizes.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The field of image generation through generative modelling is widely
discussed today. It serves various applications, such as upscaling existing
images, creating images of non-existent objects (interior design scenes,
products, or even human faces), and supporting transfer learning. In this
context, Generative Adversarial Networks (GANs) are a class of widely studied
machine learning frameworks, first introduced in the paper "Generative
adversarial nets" by Goodfellow et al., that achieve these goals. In our work,
we reproduce and evaluate a novel variation of the original GAN network, the
GANformer, proposed in "Generative Adversarial Transformers" by Hudson and
Zitnick. This project aimed to recreate the methods presented in that paper in
order to reproduce the original results and comment on the authors' claims. Due
to resource and time limitations, we had to constrain the network's training
times, dataset types, and sizes. Our research successfully recreated both
variations of the proposed GANformer model and found differences between the
authors' results and ours. Moreover, discrepancies between the methodology
described in the publication and the one implemented in the publicly available
code allowed us to study two undisclosed variations of the presented procedures.
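As a rough illustration of the adversarial framework the abstract refers to, below is a minimal sketch of a GAN training loop in the spirit of Goodfellow et al. The tiny MLPs, dimensions, and hyperparameters are assumptions made for illustration only; this is not the GANformer architecture evaluated in the paper.
```python
# Minimal adversarial training loop sketch (Goodfellow et al., 2014).
# All sizes and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64

generator = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, data_dim),
)
discriminator = nn.Sequential(
    nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1),  # single real/fake logit
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def real_batch(n=32):
    # Stand-in for a real data loader; replace with actual training images.
    return torch.randn(n, data_dim) * 0.5 + 1.0

for step in range(1000):
    real = real_batch()
    fake = generator(torch.randn(real.size(0), latent_dim))
    ones = torch.ones(real.size(0), 1)
    zeros = torch.zeros(real.size(0), 1)

    # Discriminator step: push real logits toward 1, fake logits toward 0.
    opt_d.zero_grad()
    loss_d = (bce(discriminator(real), ones)
              + bce(discriminator(fake.detach()), zeros))
    loss_d.backward()
    opt_d.step()

    # Generator step: fool the discriminator (non-saturating loss).
    opt_g.zero_grad()
    loss_g = bce(discriminator(fake), ones)
    loss_g.backward()
    opt_g.step()
```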
Related papers
- RenAIssance: A Survey into AI Text-to-Image Generation in the Era of
Large Model [93.8067369210696]
Text-to-image generation (TTI) refers to models that process text input and generate high-fidelity images from text descriptions.
Diffusion models are one prominent type of generative model; they generate images by systematically introducing noise over repeated steps and learning to reverse the process (a minimal sketch of this noising step appears after this list).
In the era of large models, scaling up model size and integration with large language models have further improved the performance of TTI models.
arXiv Detail & Related papers (2023-09-02T03:27:20Z)
- Diffusion idea exploration for art generation [0.10152838128195467]
Diffusion models have recently outperformed other generative models in image generation tasks that use cross-modal data as guiding information.
The initial experiments for this task of novel image generation demonstrated promising qualitative results.
arXiv Detail & Related papers (2023-07-11T02:35:26Z)
- Textile Pattern Generation Using Diffusion Models [0.0]
This study presents a fine-tuned diffusion model specifically trained for text-guided textile pattern generation.
The proposed fine-tuned model outperforms the baseline models in terms of pattern quality and efficiency for text-guided textile pattern generation.
arXiv Detail & Related papers (2023-04-02T12:12:24Z)
- IRGen: Generative Modeling for Image Retrieval [82.62022344988993]
In this paper, we present a novel methodology, reframing image retrieval as a variant of generative modeling.
We develop our model, dubbed IRGen, to address the technical challenge of converting an image into a concise sequence of semantic units.
Our model achieves state-of-the-art performance on three widely-used image retrieval benchmarks and two million-scale datasets.
arXiv Detail & Related papers (2023-03-17T17:07:36Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework in which a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z)
- MOGAN: Morphologic-structure-aware Generative Learning from a Single Image [59.59698650663925]
Recently proposed generative models can complete training based on only one image.
We introduce a MOrphologic-structure-aware Generative Adversarial Network, named MOGAN, that produces random samples with diverse appearances.
Our approach focuses on internal features, including the maintenance of rational structures and variation in appearance.
arXiv Detail & Related papers (2021-03-04T12:45:23Z)
- Image Synthesis with Adversarial Networks: a Comprehensive Survey and Case Studies [41.00383742615389]
Generative Adversarial Networks (GANs) have been extremely successful in various application domains such as computer vision, medicine, and natural language processing.
GANs are powerful models for learning complex distributions to synthesize semantically meaningful samples.
Given the rapid development of GANs, in this survey we provide a comprehensive review of adversarial models for image synthesis.
arXiv Detail & Related papers (2020-12-26T13:30:42Z)
- XingGAN for Person Image Generation [149.54517767056382]
We propose a novel Generative Adversarial Network (XingGAN) for person image generation tasks.
XingGAN consists of two generation branches that model the person's appearance and shape information.
We show that the proposed XingGAN advances the state-of-the-art performance in terms of objective quantitative scores and subjective visual realism.
arXiv Detail & Related papers (2020-07-17T23:40:22Z)
- Network-to-Network Translation with Conditional Invertible Neural Networks [19.398202091883366]
Recent work suggests that the power of massive machine learning models is captured by the representations they learn.
We seek a model that can relate different existing representations to one another, and we propose to solve this task with a conditionally invertible network.
Our domain transfer network can translate between fixed representations without having to learn or finetune them.
arXiv Detail & Related papers (2020-05-27T18:14:22Z)
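The RenAIssance survey entry above describes diffusion models as generating images through the systematic introduction of noise over repeated steps. Below is a minimal sketch of that forward noising process; the linear variance schedule and step count are common defaults assumed for illustration, not parameters taken from any paper listed here.
```python
# Sketch of the forward noising process in diffusion models: noise is mixed
# in over repeated steps until the image is destroyed, and a separate model
# (not shown) is trained to reverse it. Schedule values are assumed defaults.
import torch

T = 1000                                # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)   # linear variance schedule (assumed)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def noise_image(x0: torch.Tensor, t: int) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) in closed form:
    sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = torch.randn_like(x0)
    return alphas_bar[t].sqrt() * x0 + (1.0 - alphas_bar[t]).sqrt() * eps

x0 = torch.rand(3, 64, 64)              # stand-in for a training image in [0, 1]
x_half = noise_image(x0, T // 2)        # partially destroyed image
x_final = noise_image(x0, T - 1)        # nearly pure Gaussian noise
```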