Large Scale Image Completion via Co-Modulated Generative Adversarial
Networks
- URL: http://arxiv.org/abs/2103.10428v1
- Date: Thu, 18 Mar 2021 17:59:11 GMT
- Title: Large Scale Image Completion via Co-Modulated Generative Adversarial
Networks
- Authors: Shengyu Zhao, Jonathan Cui, Yilun Sheng, Yue Dong, Xiao Liang, Eric I
Chang, Yan Xu
- Abstract summary: We propose a generic new approach that bridges the gap between image-conditional and recent unconditional generative architectures.
Also, due to the lack of good quantitative metrics for image completion, we propose the new Paired/Unpaired Inception Discriminative Score (P-IDS/U-IDS).
Experiments demonstrate superior performance in terms of both quality and diversity over state-of-the-art methods in free-form image completion and easy generalization to image-to-image translation.
- Score: 18.312552957727828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Numerous task-specific variants of conditional generative adversarial
networks have been developed for image completion. Yet, a serious limitation
remains that all existing algorithms tend to fail when handling large-scale
missing regions. To overcome this challenge, we propose a generic new approach
that bridges the gap between image-conditional and recent modulated
unconditional generative architectures via co-modulation of both conditional
and stochastic style representations. Also, due to the lack of good
quantitative metrics for image completion, we propose the new Paired/Unpaired
Inception Discriminative Score (P-IDS/U-IDS), which robustly measures the
perceptual fidelity of inpainted images compared to real images via linear
separability in a feature space. Experiments demonstrate superior performance
in terms of both quality and diversity over state-of-the-art methods in
free-form image completion and easy generalization to image-to-image
translation. Code is available at https://github.com/zsyzzsoft/co-mod-gan.
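The abstract describes P-IDS/U-IDS only as measuring fidelity "via linear separability in a feature space". The sketch below is one hedged reading of that idea, not the authors' implementation. It assumes that U-IDS is the misclassification rate of a linear classifier trained to separate real from inpainted features, and that P-IDS is the fraction of pairs in which the inpainted image scores as more "real" than its paired real image. A simple perceptron stands in for whatever linear classifier the paper uses, the feature vectors are toy lists rather than Inception features, and all function names are hypothetical.

```python
def linear_score(w, b, x):
    """Signed score of feature vector x; positive means 'looks real'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear(real, fake, epochs=200, lr=0.1):
    """Fit w, b with the perceptron rule (an assumed stand-in classifier)."""
    w, b = [0.0] * len(real[0]), 0.0
    data = [(x, 1) for x in real] + [(x, -1) for x in fake]
    for _ in range(epochs):
        for x, y in data:
            # Update only on samples that are misclassified or on the boundary.
            if y * linear_score(w, b, x) <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def u_ids(real, fake, w, b):
    """Unpaired score (assumed): error rate averaged over real and fake sets."""
    real_err = sum(linear_score(w, b, x) <= 0 for x in real) / len(real)
    fake_err = sum(linear_score(w, b, x) > 0 for x in fake) / len(fake)
    return 0.5 * (real_err + fake_err)


def p_ids(real, fake, w, b):
    """Paired score (assumed): share of pairs where fake outscores its real."""
    wins = sum(linear_score(w, b, f) > linear_score(w, b, r)
               for r, f in zip(real, fake))
    return wins / len(real)


if __name__ == "__main__":
    # Toy features: real[i] and fake[i] are assumed to form a pair.
    real = [[1.0, 0.0], [2.0, 1.0]]
    fake = [[-1.0, 0.0], [-2.0, 1.0]]
    w, b = train_linear(real, fake)
    print("U-IDS:", u_ids(real, fake, w, b))
    print("P-IDS:", p_ids(real, fake, w, b))
```

Under this reading, higher values mean the inpainted features are harder to tell apart from real ones: perfectly separable sets give a U-IDS of 0, while sets the classifier cannot distinguish push U-IDS toward 0.5.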
Related papers
- Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting [0.17975553762582286]
Current image stitching methods produce noticeable seams in challenging scenarios such as uneven hue and large parallax.
We propose the Reference-Driven Inpainting Stitcher (RDIStitcher) to reformulate the image fusion and rectangling as a reference-based inpainting model.
We present Multimodal Large Language Model (MLLM)-based metrics, offering a new perspective on evaluating stitched image quality.
arXiv Detail & Related papers (2024-11-15T16:05:01Z)
- Unified-Width Adaptive Dynamic Network for All-In-One Image Restoration [50.81374327480445]
We introduce a novel concept positing that intricate image degradation can be represented in terms of elementary degradation.
We propose the Unified-Width Adaptive Dynamic Network (U-WADN), consisting of two pivotal components: a Width Adaptive Backbone (WAB) and a Width Selector (WS).
The proposed U-WADN achieves better performance while simultaneously reducing up to 32.3% of FLOPs and providing approximately 15.7% real-time acceleration.
arXiv Detail & Related papers (2024-01-24T04:25:12Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval [11.696941841000985]
Two-stage methods following the retrieve-and-rerank paradigm have achieved excellent performance, but their separate local and global modules are inefficient for real-world applications.
We propose a mechanism that attentively selects prominent local descriptors and infuses fine-grained semantic relations into the global representation.
Our method achieves state-of-the-art single-stage image retrieval performance on benchmarks such as Revisited Oxford and Revisited Paris.
arXiv Detail & Related papers (2023-08-08T03:06:10Z)
- Auto-regressive Image Synthesis with Integrated Quantization [55.51231796778219]
This paper presents a versatile framework for conditional image generation.
It incorporates the inductive bias of CNNs and powerful sequence modeling of auto-regression.
Our method achieves superior diverse image generation performance as compared with the state-of-the-art.
arXiv Detail & Related papers (2022-07-21T22:19:17Z)
- High-Quality Pluralistic Image Completion via Code Shared VQGAN [51.7805154545948]
We present a novel framework for pluralistic image completion that can achieve both high quality and diversity at much faster inference speed.
Our framework is able to learn semantically-rich discrete codes efficiently and robustly, resulting in much better image reconstruction quality.
arXiv Detail & Related papers (2022-04-05T01:47:35Z)
- CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training [112.96224800952724]
We propose cascaded modulation GAN (CM-GAN) to generate plausible image structures when dealing with large holes in complex images.
In each decoder block, global modulation is first applied to perform coarse, semantic-aware structure synthesis; spatial modulation is then applied to the output of global modulation to further adjust the feature map in a spatially adaptive fashion.
In addition, we design an object-aware training scheme to prevent the network from hallucinating new objects inside holes, fulfilling the needs of object removal tasks in real-world scenarios.
arXiv Detail & Related papers (2022-03-22T16:13:27Z)
- Cascading Modular Network (CAM-Net) for Multimodal Image Synthesis [7.726465518306907]
A persistent challenge has been to generate diverse versions of output images from the same input image.
We propose CAM-Net, a unified architecture that can be applied to a broad range of tasks.
It is capable of generating convincing high frequency details, achieving a reduction of the Frechet Inception Distance (FID) by up to 45.3% compared to the baseline.
arXiv Detail & Related papers (2021-06-16T17:58:13Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global content consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.