OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution
- URL: http://arxiv.org/abs/2512.04699v1
- Date: Thu, 04 Dec 2025 11:50:17 GMT
- Title: OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution
- Authors: Xinning Chai, Zhengxue Cheng, Yuhong Zhang, Hengsheng Zhang, Yingsheng Qin, Yucai Yang, Rong Xie, Li Song
- Abstract summary: Arbitrary-scale super-resolution (ASSR) overcomes the limitation of traditional super-resolution (SR) methods that operate only at fixed scales (e.g., 4x). Recent diffusion-based realistic image super-resolution (Real-ISR) models leverage powerful pre-trained diffusion priors and show impressive results at the 4x setting. We propose OmniScaleSR, a diffusion-based realistic arbitrary-scale SR framework designed to achieve both high fidelity and high realism.
- Score: 25.29121925794166
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Arbitrary-scale super-resolution (ASSR) overcomes the limitation of traditional super-resolution (SR) methods that operate only at fixed scales (e.g., 4x), enabling a single model to handle arbitrary magnification. Most existing ASSR approaches rely on implicit neural representation (INR), but its regression-driven feature extraction and aggregation intrinsically limit the ability to synthesize fine details, leading to low realism. Recent diffusion-based realistic image super-resolution (Real-ISR) models leverage powerful pre-trained diffusion priors and show impressive results at the 4x setting. We observe that they can also achieve ASSR because the diffusion prior implicitly adapts to scale by encouraging high-realism generation. However, without explicit scale control, the diffusion process cannot be properly adjusted for different magnification levels, resulting in excessive hallucination or blurry outputs, especially under ultra-high scales. To address these issues, we propose OmniScaleSR, a diffusion-based realistic arbitrary-scale SR framework designed to achieve both high fidelity and high realism. We introduce explicit, diffusion-native scale control mechanisms that work synergistically with implicit scale adaptation, enabling scale-aware and content-aware modulation of the diffusion process. In addition, we incorporate multi-domain fidelity enhancement designs to further improve reconstruction accuracy. Extensive experiments on bicubic degradation benchmarks and real-world datasets show that OmniScaleSR surpasses state-of-the-art methods in both fidelity and perceptual realism, with particularly strong performance at large magnification factors. Code will be released at https://github.com/chaixinning/OmniScaleSR.
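The abstract does not detail how the explicit, diffusion-native scale control is implemented. One plausible ingredient, sketched below under stated assumptions, is a sinusoidal embedding of the continuous scale factor, analogous to the timestep embeddings diffusion models already use; the function name and dimensions are hypothetical, not taken from the paper.

```python
import math

def scale_embedding(scale: float, dim: int = 8) -> list[float]:
    # Sinusoidal embedding of a continuous magnification factor (e.g., 3.7x),
    # mirroring standard diffusion timestep embeddings.
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    angles = [scale * f for f in freqs]
    # Concatenate sin and cos components; distinct scales map to distinct,
    # smoothly varying vectors suitable for conditioning a denoiser.
    return [math.sin(a) for a in angles] + [math.cos(a) for a in angles]
```

In a real model such a vector would typically be projected by a small MLP and added to the timestep conditioning, letting the denoiser modulate generation strength per scale.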
Related papers
- InfScene-SR: Spatially Continuous Inference for Arbitrary-Size Image Super-Resolution [3.6762434952581713]
InfScene-SR is a framework enabling spatially continuous super-resolution for large, arbitrary scenes. We adapt the iterative refinement process of diffusion models with a novel guided and variance-corrected fusion mechanism.
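The abstract does not spell out the fusion mechanism; the sketch below illustrates only the variance-correction idea behind fusing overlapping diffusion tiles: averaging N independent unit-variance predictions shrinks variance to 1/N, so rescaling by sqrt(N) restores it. This is a simplified illustration, not the paper's actual algorithm, and the function name is hypothetical.

```python
import math

def variance_corrected_fuse(tile_outputs: list[list[float]]) -> list[float]:
    # Average per-pixel predictions from N overlapping tiles, then rescale
    # by sqrt(N): plain averaging of independent unit-variance noise would
    # shrink its variance to 1/N and over-smooth the diffusion trajectory;
    # the sqrt(N) factor restores unit variance in the overlap region.
    n = len(tile_outputs)
    mean = [sum(px) / n for px in zip(*tile_outputs)]
    return [v * math.sqrt(n) for v in mean]
```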
arXiv Detail & Related papers (2026-02-23T11:34:59Z)
- Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders [74.72147962028265]
Representation Autoencoders (RAEs) have shown distinct advantages in diffusion modeling on ImageNet. We investigate whether this framework can scale to large-scale, freeform text-to-image (T2I) generation.
arXiv Detail & Related papers (2026-01-22T18:58:16Z)
- Your Super Resolution Model is not Enough for Tackling Real-World Scenarios [2.101267270902429]
We propose a plug-in Scale-Aware Attention Module (SAAM) designed to retrofit modern fixed-scale SR models with the ability to perform arbitrary-scale SR. SAAM employs lightweight, scale-adaptive feature extraction and upsampling, incorporating the Simple parameter-free Attention Module (SimAM) for efficient guidance, together with a gradient variance loss. Our method integrates seamlessly into multiple state-of-the-art SR backbones, delivering competitive or superior performance across a wide range of integer and non-integer scale factors.
arXiv Detail & Related papers (2025-09-08T07:13:58Z)
- Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment [51.99765487172328]
Chain-of-Zoom (CoZ) is a framework that factorizes SISR into a chain of intermediate scale-states with multi-scale-aware prompts. Because visual cues diminish at high magnifications, we augment each zoom step with multi-scale-aware text prompts generated by a vision-language model (VLM). Experiments show that a standard 4x diffusion SR model wrapped in CoZ attains beyond 256x enlargement with high perceptual quality and fidelity.
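CoZ's full method also involves VLM-generated prompts and preference alignment; the sketch below illustrates only the scale-factorization arithmetic behind chaining a fixed-scale backbone, with a hypothetical function name and even per-step splitting assumed for simplicity.

```python
import math

def zoom_chain(target_scale: float, base_scale: float = 4.0) -> list[float]:
    # Factorize a target magnification into the fewest steps whose
    # per-step scale does not exceed the backbone's native scale.
    # The small epsilon guards against float error when target_scale
    # is an exact power of base_scale (e.g., 256 = 4^4).
    steps = max(1, math.ceil(math.log(target_scale) / math.log(base_scale) - 1e-9))
    per_step = target_scale ** (1.0 / steps)
    return [per_step] * steps
```

For example, a 256x target decomposes into four 4x steps, while a 6x target decomposes into two steps of about 2.45x each.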
arXiv Detail & Related papers (2025-05-24T08:50:08Z)
- $\text{S}^{3}$Mamba: Arbitrary-Scale Super-Resolution via Scaleable State Space Model [45.65903826290642]
ASSR aims to super-resolve low-resolution images to high-resolution images at any scale using a single model.
We propose a novel arbitrary-scale super-resolution method, called $\text{S}^{3}$Mamba, to construct a scalable continuous representation space.
arXiv Detail & Related papers (2024-11-16T11:13:02Z)
- Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images [7.920423405957888]
E$^{2}$DiffSR achieves superior objective metrics and visual quality compared to the state-of-the-art SR methods.
It reduces the inference time of diffusion-based SR methods to a level comparable to that of non-diffusion methods.
arXiv Detail & Related papers (2024-10-30T09:14:13Z)
- Diffusion Models Without Attention [110.5623058129782]
Diffusion State Space Model (DiffuSSM) is an architecture that supplants attention mechanisms with a more scalable state space model backbone.
Our focus on FLOP-efficient architectures in diffusion training marks a significant step forward.
arXiv Detail & Related papers (2023-11-30T05:15:35Z)
- Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
- Implicit Diffusion Models for Continuous Super-Resolution [65.45848137914592]
This paper introduces an Implicit Diffusion Model (IDM) for high-fidelity continuous image super-resolution.
IDM integrates an implicit neural representation and a denoising diffusion model in a unified end-to-end framework.
The scaling factor regulates the resolution and accordingly modulates the proportion of the LR information and generated features in the final output.
arXiv Detail & Related papers (2023-03-29T07:02:20Z)
- Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
In this work, a simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.