Learning Images Across Scales Using Adversarial Training
- URL: http://arxiv.org/abs/2406.08924v1
- Date: Thu, 13 Jun 2024 08:44:12 GMT
- Title: Learning Images Across Scales Using Adversarial Training
- Authors: Krzysztof Wolski, Adarsh Djeacoumar, Alireza Javanmardi, Hans-Peter Seidel, Christian Theobalt, Guillaume Cordonnier, Karol Myszkowski, George Drettakis, Xingang Pan, Thomas Leimkühler
- Abstract summary: We devise a novel paradigm for learning a representation that captures an orders-of-magnitude variety of scales from an unstructured collection of ordinary images.
We show that our generator can be used as a multiscale generative model, and for reconstructions of scale spaces from unstructured patches.
- Score: 64.59447233902735
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The real world exhibits rich structure and detail across many scales of observation. It is difficult, however, to capture and represent a broad spectrum of scales using ordinary images. We devise a novel paradigm for learning a representation that captures an orders-of-magnitude variety of scales from an unstructured collection of ordinary images. We treat this collection as a distribution of scale-space slices to be learned using adversarial training, and additionally enforce coherency across slices. Our approach relies on a multiscale generator with carefully injected procedural frequency content, which allows interactive exploration of the emerging continuous scale space. Training across vastly different scales poses challenges regarding stability, which we tackle using a supervision scheme that involves careful sampling of scales. We show that our generator can be used as a multiscale generative model, and for reconstructions of scale spaces from unstructured patches. Significantly outperforming the state of the art, we demonstrate zoom-in factors of up to 256x at high quality and scale consistency.
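The abstract credits training stability to a supervision scheme with careful sampling of scales, without spelling out the scheme. As a rough illustration of the idea only, the hypothetical helper below draws fixed-size patches from an image at log-uniformly sampled zoom factors, so every octave of the scale space is supervised about equally often (the function, its parameters, and the sampling law are assumptions, not the authors' code):

```python
import numpy as np

def sample_scale_space_slice(image, patch=32, min_zoom=1.0, max_zoom=8.0, rng=None):
    """Sample a fixed-size patch that observes `image` at a random scale.

    Zoom factors are drawn log-uniformly, so each octave of the scale
    space is supervised equally often -- an assumed stand-in for the
    paper's careful scale sampling.
    """
    rng = rng or np.random.default_rng()
    # log-uniform zoom factor: every octave equally likely
    zoom = np.exp(rng.uniform(np.log(min_zoom), np.log(max_zoom)))
    # footprint of the patch in the original image, in pixels
    span = int(round(patch * zoom))
    h, w = image.shape[:2]
    span = min(span, h, w)
    y = rng.integers(0, h - span + 1)
    x = rng.integers(0, w - span + 1)
    crop = image[y:y + span, x:x + span]
    # nearest-neighbour downsample to the fixed patch size (a real
    # pipeline would use an anti-aliased resampler)
    ys = np.arange(patch) * span // patch
    xs = np.arange(patch) * span // patch
    return crop[np.ix_(ys, xs)], zoom
```

Each sampled patch is then a "scale-space slice" of the kind the abstract describes, ready to serve as a real example for the discriminator.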
Related papers
- Cross-Scale Pansharpening via ScaleFormer and the PanScale Benchmark [39.78977567741962]
Pansharpening aims to generate high-resolution multi-spectral images by fusing the spatial detail of panchromatic images with the spectral richness of low-resolution MS data. Existing methods are evaluated under limited, low-resolution settings, limiting their generalization to real-world, high-resolution scenarios. We introduce PanScale, the first large-scale, cross-scale pansharpening dataset, accompanied by PanScale-Bench, a benchmark for evaluating generalization across varying resolutions and scales.
arXiv Detail & Related papers (2026-02-28T08:44:34Z)
- Progressive Checkerboards for Autoregressive Multiscale Image Generation [0.0]
A key challenge in autoregressive image generation is to efficiently sample independent locations in parallel. In this work we examine a flexible, fixed ordering based on progressive checkerboards for multiscale autoregressive image generation. We find evidence that in our balanced setting, a wide range of scale-up factors lead to similar results, so long as the total number of serial steps is constant.
arXiv Detail & Related papers (2026-02-03T18:15:27Z)
- FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis [48.9652334528436]
We introduce an innovative, training-free approach FouriScale from the perspective of frequency domain analysis.
We replace the original convolutional layers in pre-trained diffusion models by incorporating a dilation technique along with a low-pass operation.
Our method balances the structural integrity and fidelity of generated images, enabling arbitrary-size, high-resolution, high-quality generation.
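The two ingredients this summary names, dilating convolution kernels and applying a low-pass operation, can be sketched in isolation. The numpy helpers below are illustrative assumptions, not the FouriScale implementation (which modifies the convolutional layers of a pre-trained diffusion model):

```python
import numpy as np

def dilate_kernel(k, rate):
    """Insert zeros between kernel taps (atrous/dilated convolution),
    widening the receptive field without retraining the weights."""
    h, w = k.shape
    out = np.zeros((rate * (h - 1) + 1, rate * (w - 1) + 1), dtype=k.dtype)
    out[::rate, ::rate] = k
    return out

def low_pass(x, size=3):
    """Simple box-filter low-pass over a 2-D array, standing in for the
    frequency-domain low-pass operation the summary describes."""
    pad = size // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (size * size)
```

Dilation keeps the pre-trained weights but spreads them over a larger footprint; the low-pass removes the high frequencies that dilation would otherwise alias when synthesizing beyond the training resolution.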
arXiv Detail & Related papers (2024-03-19T17:59:33Z)
- Learned representation-guided diffusion models for large-image generation [58.192263311786824]
We introduce a novel approach that trains diffusion models conditioned on embeddings from self-supervised learning (SSL).
Our diffusion models successfully project these features back to high-quality histopathology and remote sensing images.
Augmenting real data by generating variations of real images improves downstream accuracy for patch-level and larger, image-scale classification tasks.
arXiv Detail & Related papers (2023-12-12T14:45:45Z)
- Generative Powers of Ten [60.6740997942711]
We present a method that uses a text-to-image model to generate consistent content across multiple image scales.
We achieve this through a joint multi-scale diffusion sampling approach.
Our method enables deeper levels of zoom than traditional super-resolution methods.
arXiv Detail & Related papers (2023-12-04T18:59:25Z)
- Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis [94.76988562653845]
The goal of semantic image synthesis is to generate photo-realistic images from semantic label maps.
Current state-of-the-art approaches, however, still struggle to generate realistic objects in images at various scales.
We propose a Dual Pyramid Generative Adversarial Network (DP-GAN) that learns the conditioning of spatially-adaptive normalization blocks at all scales jointly.
arXiv Detail & Related papers (2022-10-08T18:45:44Z)
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel architecture named SCale AttentioN Convolutional Neural Network (SCAN-CNN).
As a single-shot scheme, the inference is more efficient than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- Arbitrary-Scale Image Synthesis [149.0290830305808]
Positional encodings have enabled recent works to train a single adversarial network that can generate images of different scales.
We propose the design of scale-consistent positional encodings invariant to our generator's transformation layers.
We show competitive results for a continuum of scales on various commonly used datasets for image synthesis.
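The key property of a scale-consistent positional encoding is that grids of different resolutions sample the same underlying continuous signal. The numpy sketch below illustrates that property in one dimension using Fourier features; it is an assumed, minimal stand-in and does not reproduce the paper's generator-specific invariances:

```python
import numpy as np

def fourier_pe(coords, n_freqs=4):
    """Fourier positional encoding of continuous coordinates in [0, 1).

    Because the encoding is a function of world coordinates rather than
    pixel indices, grids of different resolutions sample the *same*
    continuous signal -- the scale-consistency property at stake here.
    """
    freqs = 2.0 ** np.arange(n_freqs) * np.pi   # octave-spaced frequencies
    ang = coords[..., None] * freqs             # (..., n_freqs)
    return np.concatenate([np.sin(ang), np.cos(ang)], axis=-1)

def grid_pe(resolution, n_freqs=4):
    """Encode a corner-aligned 1-D grid of `resolution` samples in [0, 1)."""
    x = np.arange(resolution) / resolution
    return fourier_pe(x, n_freqs)
```

With corner-aligned grids, every other sample of the double-resolution encoding coincides exactly with the coarser grid's encoding, so the generator sees consistent positional signals across scales.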
arXiv Detail & Related papers (2022-04-05T15:10:43Z)
- Nested Scale Editing for Conditional Image Synthesis [19.245119912119947]
We propose an image synthesis approach that provides stratified navigation in the latent code space.
Given only a partial or very low-resolution input image, our approach consistently outperforms state-of-the-art counterparts.
arXiv Detail & Related papers (2020-06-03T04:29:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences of its use.