Arbitrary-Scale Image Synthesis
- URL: http://arxiv.org/abs/2204.02273v1
- Date: Tue, 5 Apr 2022 15:10:43 GMT
- Title: Arbitrary-Scale Image Synthesis
- Authors: Evangelos Ntavelis, Mohamad Shahbazi, Iason Kastanis, Radu Timofte,
Martin Danelljan, Luc Van Gool
- Abstract summary: Positional encodings have enabled recent works to train a single adversarial network that can generate images of different scales.
We propose the design of scale-consistent positional encodings invariant to our generator's transformation layers.
We show competitive results for a continuum of scales on various commonly used datasets for image synthesis.
- Score: 149.0290830305808
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Positional encodings have enabled recent works to train a single adversarial
network that can generate images of different scales. However, these approaches
are either limited to a set of discrete scales or struggle to maintain good
perceptual quality at the scales for which the model is not trained explicitly.
We propose the design of scale-consistent positional encodings invariant to our
generator's transformation layers. This enables the generation of
arbitrary-scale images even at scales unseen during training. Moreover, we
incorporate novel inter-scale augmentations into our pipeline and partial
generation training to facilitate the synthesis of consistent images at
arbitrary scales. Lastly, we show competitive results for a continuum of scales
on various commonly used datasets for image synthesis.
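The core idea behind scale-consistent positional encodings can be illustrated with a minimal sketch: sample Fourier features over a fixed continuous coordinate domain, so that a pixel's encoding depends only on its continuous position and not on the grid resolution. The `fourier_encoding` helper below is a hypothetical illustration of this principle, not the paper's exact construction:

```python
import numpy as np

def fourier_encoding(height, width, num_freqs=4):
    # Continuous coordinates in [0, 1): the encoding at a position is
    # independent of the resolution at which the grid is sampled.
    ys = np.arange(height) / height
    xs = np.arange(width) / width
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    feats = []
    for k in range(num_freqs):
        f = (2.0 ** k) * np.pi  # one octave of frequencies per step
        feats += [np.sin(f * yy), np.cos(f * yy),
                  np.sin(f * xx), np.cos(f * xx)]
    return np.stack(feats, axis=0)  # shape: (4 * num_freqs, H, W)

# Doubling the resolution samples the same continuous encoding field
# more densely, so encodings at shared positions agree across scales:
enc_lo = fourier_encoding(8, 8)
enc_hi = fourier_encoding(16, 16)
```

Because the position (3/8, 5/8) appears in both grids, `enc_lo[:, 3, 5]` equals `enc_hi[:, 6, 10]`: a generator conditioned on such encodings receives consistent inputs at any sampling resolution, including scales unseen during training.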
Related papers
- Learning Images Across Scales Using Adversarial Training [64.59447233902735]
We devise a novel paradigm for learning a representation that captures an orders-of-magnitude variety of scales from an unstructured collection of ordinary images.
We show that our generator can be used as a multiscale generative model, and for reconstructions of scale spaces from unstructured patches.
arXiv Detail & Related papers (2024-06-13T08:44:12Z)
- FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis [48.9652334528436]
We introduce an innovative, training-free approach FouriScale from the perspective of frequency domain analysis.
We replace the original convolutional layers in pre-trained diffusion models with a variant that incorporates a dilation technique and a low-pass operation.
Our method balances the structural integrity and fidelity of generated images, enabling arbitrary-size, high-resolution, high-quality generation.
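The two ingredients named in the summary can each be sketched in isolation; the helpers below (`low_pass`, `dilated_conv2d`) are illustrative, simplified stand-ins for what FouriScale actually does inside a pre-trained diffusion model:

```python
import numpy as np

def low_pass(img, keep_frac=0.5):
    """Zero out high frequencies of a 2-D array via the FFT.
    Illustrative of the low-pass step; not FouriScale's exact filter."""
    h, w = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))  # DC component moved to center
    mask = np.zeros_like(F)
    ch, cw = h // 2, w // 2
    rh, rw = int(h * keep_frac / 2), int(w * keep_frac / 2)
    mask[ch - rh:ch + rh, cw - rw:cw + rw] = 1  # keep only low frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

def dilated_conv2d(img, kernel, dilation=2):
    """Naive dilated convolution: the kernel's taps are spread `dilation`
    pixels apart, enlarging the receptive field without adding parameters
    (zero padding at the borders)."""
    h, w = img.shape
    k = kernel.shape[0]
    pad = dilation * (k // 2)
    padded = np.pad(img, pad)
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            ys, xs = dy * dilation, dx * dilation
            out += kernel[dy, dx] * padded[ys:ys + h, xs:xs + w]
    return out
```

Together, dilation lets a filter trained at one resolution cover a proportionally larger area at a higher resolution, while the low-pass step suppresses the aliasing frequencies that the original filter never saw during training.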
arXiv Detail & Related papers (2024-03-19T17:59:33Z)
- Scale-Equivariant UNet for Histopathology Image Segmentation [1.213915839836187]
Convolutional Neural Networks (CNNs) trained on such images at a given scale fail to generalise to those at different scales.
We propose the Scale-Equivariant UNet (SEUNet) for image segmentation by building on scale-space theory.
arXiv Detail & Related papers (2023-04-10T14:03:08Z)
- DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation [56.514462874501675]
We propose a dynamic sparse attention based Transformer model to achieve fine-level matching with favorable efficiency.
The heart of our approach is a novel dynamic-attention unit that adapts the number of tokens each position should attend to.
Experiments on three applications, pose-guided person image generation, edge-based face synthesis, and undistorted image style transfer, demonstrate that DynaST achieves superior performance in local details.
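A fixed top-k attention is a simple stand-in for the dynamic unit described above (DynaST additionally adapts the number of attended tokens per position). The sketch below assumes single-head attention over plain NumPy arrays:

```python
import numpy as np

def topk_sparse_attention(q, kmat, v, k=4):
    """Each query attends only to its k highest-scoring keys.
    Fixed-k sketch; the paper's unit chooses k dynamically."""
    scores = q @ kmat.T / np.sqrt(q.shape[-1])          # (Nq, Nk)
    # Mask out everything except the top-k keys per query.
    drop = np.argpartition(scores, -k, axis=-1)[:, :-k]
    np.put_along_axis(scores, drop, -np.inf, axis=-1)
    # Softmax over the surviving scores (-inf entries become weight 0).
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v
```

Sparsifying the score matrix before the softmax is what buys the efficiency: only k value rows per query contribute to the output, instead of all Nk.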
arXiv Detail & Related papers (2022-07-13T11:12:03Z)
- High-Resolution Complex Scene Synthesis with Transformers [6.445605125467574]
Coarse-grained synthesis of complex scene images via deep generative models has recently gained popularity.
We present an approach to this task, where the generative model is based on pure likelihood training without additional objectives.
We show that the resulting system is able to synthesize high-quality images consistent with the given layouts.
arXiv Detail & Related papers (2021-05-13T17:56:07Z) - Ensembling with Deep Generative Views [72.70801582346344]
generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z) - Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z) - Nested Scale Editing for Conditional Image Synthesis [19.245119912119947]
We propose an image synthesis approach that provides stratified navigation in the latent code space.
Given only a partial or very low-resolution image, our approach can consistently outperform state-of-the-art counterparts.
arXiv Detail & Related papers (2020-04-07T17:58:03Z)
- AugurOne: Training End-to-end Single Image Generators without GANs
AugurOne is a novel approach for training single image generative models.
Our approach trains an upscaling neural network using non-affine augmentations of the (single) input image.
A compact latent space is jointly learned allowing for controlled image synthesis.
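A non-affine augmentation of a single image can be sketched as a random elastic warp; `elastic_warp` below is a hypothetical, generic example of such a transform, not necessarily one AugurOne actually uses:

```python
import numpy as np

def elastic_warp(img, strength=2.0, smooth=3, seed=0):
    """Displace each pixel of a 2-D image by a smoothed random flow
    field and resample with nearest-neighbour lookup. Unlike an affine
    transform, the displacement varies smoothly across the image."""
    rng = np.random.default_rng(seed)
    h, w = img.shape

    def smooth_field():
        # Random noise, blurred with a separable box filter.
        f = rng.standard_normal((h, w))
        kernel = np.ones(smooth) / smooth
        f = np.apply_along_axis(lambda r: np.convolve(r, kernel, "same"), 1, f)
        f = np.apply_along_axis(lambda c: np.convolve(c, kernel, "same"), 0, f)
        return strength * f

    dy, dx = smooth_field(), smooth_field()
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.round(yy + dy), 0, h - 1).astype(int)
    src_x = np.clip(np.round(xx + dx), 0, w - 1).astype(int)
    return img[src_y, src_x]
```

Because the warp is non-affine, the augmented views expose the upscaling network to local deformations a single input image alone could never provide.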
arXiv Detail & Related papers (2020-04-07T17:58:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.