On Unsupervised Image-to-image translation and GAN stability
- URL: http://arxiv.org/abs/2403.09646v1
- Date: Wed, 18 Oct 2023 04:00:43 GMT
- Title: On Unsupervised Image-to-image translation and GAN stability
- Authors: BahaaEddin AlAila, Zahra Jandaghi, Abolfazl Farahani, Mohammad Ziad Al-Saad
- Abstract summary: We study some of the failure cases of a seminal work in the field, CycleGAN.
We propose two general models to try to alleviate these problems.
- Score: 0.5523170464803535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of image-to-image translation is intriguing and challenging at the same time, given the potential impact it can have on a wide variety of other computer vision applications such as colorization, inpainting, and segmentation. Given the high level of sophistication needed to extract patterns from one domain and successfully apply them to another, especially in a completely unsupervised (unpaired) manner, this problem has gained much attention over the last few years. It is one of the first problems where deep generative models, and especially Generative Adversarial Networks, achieved astounding results of real-world impact, rather than merely the show of theoretical prowess that has dominated the GAN world. In this work, we study some of the failure cases of a seminal work in the field, CycleGAN [1], hypothesize that they are related to GAN stability, and propose two general models to try to alleviate these problems. We also reach the conclusion, which has been circulating in the literature lately, that the problem is ill-posed.
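The abstract centers on CycleGAN, whose key training signal is cycle consistency: two mappings G: X → Y and F: Y → X are trained so that the round trip F(G(x)) reconstructs the input. A minimal sketch of that loss is shown below; the toy affine functions G and F are hypothetical stand-ins for the learned generators, used only to illustrate how the L1 cycle loss is computed (this is not the authors' code).

```python
# Cycle-consistency sketch: G maps X -> Y, F maps Y -> X, and the cycle
# loss penalizes the L1 distance between x and its round trip F(G(x)).

def G(x):
    # Toy forward mapping X -> Y (hypothetical stand-in for a generator).
    return [2.0 * v + 1.0 for v in x]

def F(y):
    # Toy inverse mapping Y -> X (hypothetical stand-in for a generator).
    return [(v - 1.0) / 2.0 for v in y]

def cycle_consistency_loss(x):
    """Mean L1 reconstruction error ||F(G(x)) - x||_1 over the sample."""
    recon = F(G(x))
    return sum(abs(r - v) for r, v in zip(recon, x)) / len(x)

x = [0.5, -1.0, 3.0]
loss = cycle_consistency_loss(x)  # exactly invertible toy maps, so 0.0
print(loss)
```

In the real model this loss is combined with adversarial losses on both domains and backpropagated through the learned generators; when the adversarial training destabilizes, the cycle term alone cannot prevent degenerate solutions, which is the failure mode the paper investigates.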
Related papers
- A Preliminary Exploration Towards General Image Restoration [48.02907312223344]
We present a new problem called general image restoration (GIR) which aims to address these challenges within a unified model.
GIR covers most individual image restoration tasks (e.g., image denoising, deblurring, deraining, and super-resolution) and their combinations for general purposes.
We conduct a comprehensive evaluation of existing approaches for tackling the GIR challenge, illuminating their strengths and pragmatic challenges.
arXiv Detail & Related papers (2024-08-27T15:31:45Z) - DualAnoDiff: Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation [40.257604426546216]
The performance of anomaly inspection in industrial manufacturing is constrained by the scarcity of anomaly data.
Existing anomaly generation methods suffer from limited diversity in the generated anomalies.
We propose DualAnoDiff, a novel diffusion-based few-shot anomaly image generation model.
arXiv Detail & Related papers (2024-08-24T08:09:32Z) - Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion
Models [58.46926334842161]
This work illuminates the fundamental reasons for such misalignment, pinpointing issues related to low attention activation scores and mask overlaps.
We propose two novel objectives, the Separate loss and the Enhance loss, that reduce object mask overlaps and maximize attention scores.
Our method diverges from conventional test-time-adaptation techniques, focusing on finetuning critical parameters, which enhances scalability and generalizability.
arXiv Detail & Related papers (2023-12-10T22:07:42Z) - GAN-based Algorithm for Efficient Image Inpainting [0.0]
The global pandemic has posed new challenges for facial recognition, as people have started to wear masks.
Under such conditions, the authors consider utilizing machine learning in image inpainting to tackle the problem.
In particular, autoencoders have great potential for retaining important, general features of the image.
arXiv Detail & Related papers (2023-09-13T20:28:54Z) - Causal Triplet: An Open Challenge for Intervention-centric Causal Representation Learning [98.78136504619539]
Causal Triplet is a causal representation learning benchmark featuring visually more complex scenes.
We show that models built with the knowledge of disentangled or object-centric representations significantly outperform their distributed counterparts.
arXiv Detail & Related papers (2023-01-12T17:43:38Z) - Generative Adversarial Networks [43.10140199124212]
Generative Adversarial Networks (GANs) are very popular frameworks for generating high-quality data.
This chapter gives an introduction to GANs, discussing their principal mechanism and presenting some of their inherent problems during training and evaluation.
arXiv Detail & Related papers (2022-03-01T18:37:48Z) - Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains [80.11169390071869]
Adversarial examples have posed a severe threat to deep neural networks due to their transferable nature.
We propose a Beyond ImageNet Attack (BIA) to investigate the transferability towards black-box domains.
Our methods outperform state-of-the-art approaches by up to 7.71% (towards coarse-grained domains) and 25.91% (towards fine-grained domains) on average.
arXiv Detail & Related papers (2022-01-27T14:04:27Z) - Unsupervised Image Generation with Infinite Generative Adversarial Networks [24.41144953504398]
We propose a new unsupervised non-parametric method named mixture of infinite conditional GANs or MIC-GANs.
We show that MIC-GANs are effective in structuring the latent space and avoiding mode collapse, and outperform state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T05:03:19Z) - Heterogeneous Face Frontalization via Domain Agnostic Learning [74.86585699909459]
We propose a domain agnostic learning-based generative adversarial network (DAL-GAN) which can synthesize frontal views in the visible domain from thermal faces with pose variations.
DAL-GAN consists of a generator with an auxiliary classifier and two discriminators which capture both local and global texture discriminations for better synthesis.
arXiv Detail & Related papers (2021-07-17T20:41:41Z) - Rethinking conditional GAN training: An approach using geometrically structured latent manifolds [58.07468272236356]
Conditional GANs (cGAN) suffer from critical drawbacks such as the lack of diversity in generated outputs.
We propose a novel training mechanism that increases both the diversity and the visual quality of a vanilla cGAN.
arXiv Detail & Related papers (2020-11-25T22:54:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.