Lipschitz Regularized CycleGAN for Improving Semantic Robustness in
Unpaired Image-to-image Translation
- URL: http://arxiv.org/abs/2012.04932v1
- Date: Wed, 9 Dec 2020 09:28:53 GMT
- Title: Lipschitz Regularized CycleGAN for Improving Semantic Robustness in
Unpaired Image-to-image Translation
- Authors: Zhiwei Jia, Bodi Yuan, Kangkang Wang, Hong Wu, David Clifford,
Zhiqiang Yuan, Hao Su
- Abstract summary: For unpaired image-to-image translation tasks, GAN-based approaches are susceptible to semantic flipping.
We propose a novel approach, Lipschitz regularized CycleGAN, for improving semantic robustness.
We evaluate our approach on multiple common datasets and compare with several existing GAN-based methods.
- Score: 19.083671868521918
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For unpaired image-to-image translation tasks, GAN-based approaches are
susceptible to semantic flipping, i.e., contents are not preserved
consistently. We argue that this is due to (1) the difference in semantic
statistics between source and target domains and (2) the learned generators
being non-robust. In this paper, we propose a novel approach, Lipschitz
regularized CycleGAN, for improving semantic robustness and thus alleviating
the semantic flipping issue. During training, we add a gradient penalty loss to
the generators, which encourages semantically consistent transformations. We
evaluate our approach on multiple common datasets and compare with several
existing GAN-based methods. Both quantitative and visual results suggest the
effectiveness and advantage of our approach in producing robust transformations
with less semantic flipping.
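The abstract mentions a gradient penalty on the generators but does not give its form. The following is only a minimal sketch of the general idea, assuming a one-sided Lipschitz penalty estimated by finite differences on a toy generator. The names `toy_generator` and `lipschitz_penalty`, the bound `k`, and the step size `eps` are hypothetical and are not taken from the paper.

```python
import math
import random


def toy_generator(x, weight):
    """Hypothetical stand-in for a generator: tanh applied to a linear map."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weight]


def l2_norm(v):
    return math.sqrt(sum(c * c for c in v))


def lipschitz_penalty(generator, x, k=1.0, eps=1e-3, n_dirs=8, rng=None):
    """Finite-difference estimate of a one-sided Lipschitz penalty at x.

    For random perturbations d with ||d|| = eps, this averages
    max(0, ||G(x + d) - G(x)|| / ||d|| - k) ** 2 over n_dirs directions.
    This form is an assumption, not the loss used in the paper.
    """
    rng = rng or random.Random(0)
    base = generator(x)
    total = 0.0
    for _ in range(n_dirs):
        d = [rng.gauss(0.0, 1.0) for _ in x]
        scale = eps / l2_norm(d)
        d = [scale * c for c in d]
        shifted = generator([xi + di for xi, di in zip(x, d)])
        ratio = l2_norm([a - b for a, b in zip(shifted, base)]) / eps
        total += max(0.0, ratio - k) ** 2
    return total / n_dirs


# Usage: a gently sloped map incurs no penalty, a steep one does.
x = [0.1, -0.2]
smooth = [[0.1, 0.0], [0.0, 0.1]]
steep = [[5.0, 0.0], [0.0, 5.0]]
penalty_smooth = lipschitz_penalty(lambda v: toy_generator(v, smooth), x)
penalty_steep = lipschitz_penalty(lambda v: toy_generator(v, steep), x)
```

In an actual training loop, a term of this kind would presumably be computed with automatic differentiation and added to the generator loss with a weighting coefficient; the finite-difference version here only illustrates what is being penalized.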
Related papers
- SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow [94.90853153808987]
We propose a unified diffusion-based framework (SemFlow) for semantic segmentation and semantic image synthesis.
As the training object is symmetric, samples belonging to the two distributions, images and semantic masks, can be effortlessly transferred reversibly.
Experiments show that our SemFlow achieves competitive results on semantic segmentation and semantic image synthesis tasks.
arXiv Detail & Related papers (2024-05-30T17:34:40Z)
- StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation [18.213286385769525]
CycleGAN-based methods are known to hide the mismatched information in the generated images to bypass cycle consistency objectives.
We introduce StegoGAN, a novel model that leverages steganography to prevent spurious features in generated images.
Our approach enhances the semantic consistency of the translated images without requiring additional postprocessing or supervision.
arXiv Detail & Related papers (2024-03-29T12:23:58Z)
- General Lipschitz: Certified Robustness Against Resolvable Semantic Transformations via Transformation-Dependent Randomized Smoothing [6.101765622702223]
We propose General Lipschitz (GL), a new framework to certify neural networks against composable resolvable semantic perturbations.
Our method performs comparably to state-of-the-art approaches on the ImageNet dataset.
arXiv Detail & Related papers (2023-08-17T14:39:24Z)
- Identical and Fraternal Twins: Fine-Grained Semantic Contrastive Learning of Sentence Representations [6.265789210037749]
We introduce a novel Identical and Fraternal Twins of Contrastive Learning framework, capable of simultaneously adapting to various positive pairs generated by different augmentation techniques.
We also present proof-of-concept experiments combined with the contrastive objective to prove the validity of the proposed Twins Loss.
arXiv Detail & Related papers (2023-07-20T15:02:42Z)
- Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance [51.188396199083336]
We present an approach that guides the reverse process of diffusion sampling by applying asymmetric gradient guidance.
Our model's adaptability allows it to be implemented with both image diffusion and latent diffusion models.
Experiments show that our method outperforms various state-of-the-art models in image translation tasks.
arXiv Detail & Related papers (2023-06-07T12:56:56Z)
- Towards More Robust Interpretation via Local Gradient Alignment [37.464250451280336]
We show that for every non-negative homogeneous neural network, a naive $\ell$-robust criterion for gradients is not normalization invariant.
We propose to combine both $\ell$ and cosine distance-based criteria as regularization terms to leverage the advantages of both in aligning the local gradient.
We experimentally show that models trained with our method produce much more robust interpretations on CIFAR-10 and ImageNet-100.
arXiv Detail & Related papers (2022-11-29T03:38:28Z)
- GSmooth: Certified Robustness against Semantic Transformations via Generalized Randomized Smoothing [40.38555458216436]
We propose a unified theoretical framework for certifying robustness against general semantic transformations.
Under the GSmooth framework, we present a scalable algorithm that uses a surrogate image-to-image network to approximate the complex transformation.
arXiv Detail & Related papers (2022-06-09T07:12:17Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z)
- FDA: Fourier Domain Adaptation for Semantic Segmentation [82.4963423086097]
We describe a simple method for unsupervised domain adaptation, whereby the discrepancy between the source and target distributions is reduced by swapping the low-frequency spectrum of one with the other.
We illustrate the method in semantic segmentation, where densely annotated images are aplenty in one domain, but difficult to obtain in another.
Our results indicate that even simple procedures can discount nuisance variability in the data that more sophisticated methods struggle to learn away.
arXiv Detail & Related papers (2020-04-11T22:20:48Z)
- Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading [96.48553941812366]
Lip-reading aims to infer the speech content from the lip movement sequence.
The traditional learning process of seq2seq models suffers from two problems.
We propose a novel pseudo-convolutional policy gradient (PCPG) based method to address these two problems.
arXiv Detail & Related papers (2020-03-09T09:12:26Z)
- Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation [77.7420231319632]
We adapt contextual generation of categorical sequences to a policy gradient estimator, which evaluates a set of correlated Monte Carlo (MC) rollouts for variance control.
We also demonstrate the use of correlated MC rollouts for binary-tree softmax models, which reduce the high generation cost in large vocabulary scenarios.
arXiv Detail & Related papers (2019-12-31T03:01:55Z)