Zero-shot Adaptation of Stable Diffusion via Plug-in Hierarchical Degradation Representation for Real-World Super-Resolution
- URL: http://arxiv.org/abs/2512.10340v1
- Date: Thu, 11 Dec 2025 06:45:28 GMT
- Title: Zero-shot Adaptation of Stable Diffusion via Plug-in Hierarchical Degradation Representation for Real-World Super-Resolution
- Authors: Yi-Cheng Liao, Shyang-En Weng, Yu-Syuan Xu, Chi-Wei Hsiao, Wei-Chen Chiu, Ching-Chun Huang
- Abstract summary: Real-World Image Super-Resolution aims to recover high-quality images from low-quality inputs degraded by unknown and complex real-world factors. HD-CLIP (Hierarchical Degradation CLIP) decomposes a low-quality image into a semantic embedding and an ordinal degradation embedding. HD-CLIP can be seamlessly integrated into various super-resolution frameworks without training.
- Score: 22.82705867627899
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-World Image Super-Resolution (Real-ISR) aims to recover high-quality images from low-quality inputs degraded by unknown and complex real-world factors. Real-world scenarios involve diverse and coupled degradations, making it necessary to provide diffusion models with richer and more informative guidance. However, existing methods often assume known degradation severity and rely on CLIP text encoders that cannot capture numerical severity, limiting their generalization ability. To address this, we propose HD-CLIP (Hierarchical Degradation CLIP), which decomposes a low-quality image into a semantic embedding and an ordinal degradation embedding that captures ordered relationships and allows interpolation across unseen levels. Furthermore, we integrated it into diffusion models via classifier-free guidance (CFG) and proposed classifier-free projection guidance (CFPG). HD-CLIP leverages semantic cues to guide generative restoration while using degradation cues to suppress undesired hallucinations and artifacts. As a plug-and-play module, HD-CLIP can be seamlessly integrated into various super-resolution frameworks without training, significantly improving detail fidelity and perceptual realism across diverse real-world datasets.
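The abstract names two mechanisms that a short example can make concrete: classifier-free guidance (CFG) and interpolation of an ordinal degradation embedding across unseen levels. The following is a minimal sketch under stated assumptions, not the authors' implementation. `cfg_combine` uses the standard CFG formula; `interpolate_level` assumes, purely for illustration, that one embedding is stored per integer severity level and that unseen levels are reached by linear interpolation. All function and parameter names are hypothetical. CFPG is not sketched because the abstract does not define it.

```python
from typing import List

Vector = List[float]


def cfg_combine(eps_uncond: Vector, eps_cond: Vector, scale: float) -> Vector:
    """Standard CFG: move from the unconditional prediction toward the conditional one."""
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def interpolate_level(level_embeddings: List[Vector], severity: float) -> Vector:
    """Linear interpolation between embeddings of adjacent severity levels.

    Assumption (hypothetical, not from the paper): level_embeddings[i] is the
    embedding for integer severity i.
    """
    last = len(level_embeddings) - 1
    if last < 0 or not 0 <= severity <= last:
        raise ValueError("severity must lie within the known levels")
    if last == 0:
        return list(level_embeddings[0])
    # Clamp so that the top severity interpolates within the final interval.
    lower = min(int(severity), last - 1)
    t = severity - lower
    lo_vec, hi_vec = level_embeddings[lower], level_embeddings[lower + 1]
    return [(1.0 - t) * a + t * b for a, b in zip(lo_vec, hi_vec)]
```

With `scale=1.0` the combined prediction equals the conditional one, and with `scale=0.0` it equals the unconditional one; larger scales push further along the conditional direction. In a real diffusion pipeline the vectors would be noise-prediction tensors rather than Python lists.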
Related papers
- Mixture of Ranks with Degradation-Aware Routing for One-Step Real-World Image Super-Resolution [76.66229730098759]
In real-world image super-resolution (Real-ISR), existing approaches mainly rely on fine-tuning pre-trained diffusion models. We propose a Mixture-of-Ranks (MoR) architecture for single-step image super-resolution. We introduce a fine-grained expert partitioning strategy that treats each rank in LoRA as an independent expert.
arXiv Detail & Related papers (2025-11-20T04:11:44Z)
- Learning to Restore Multi-Degraded Images via Ingredient Decoupling and Task-Aware Path Adaptation [51.10017611491389]
Real-world images often suffer from multiple coexisting degradations, such as rain, noise, and haze in a single image. We propose an adaptive multi-degradation image restoration network that reconstructs images by leveraging decoupled representations of degradation ingredients. The resulting tightly integrated architecture, termed IMDNet, is extensively validated through experiments.
arXiv Detail & Related papers (2025-11-07T01:50:36Z)
- Unsupervised Image Super-Resolution Reconstruction Based on Real-World Degradation Patterns [4.977925450373957]
We propose a novel TripleGAN framework for training super-resolution reconstruction models. The framework learns real-world degradation patterns from LR observations and synthesizes datasets with corresponding degradation characteristics. Our method exhibits clear advantages in quantitative metrics while maintaining sharp reconstructions without over-smoothing artifacts.
arXiv Detail & Related papers (2025-06-20T14:24:48Z)
- Manifold-aware Representation Learning for Degradation-agnostic Image Restoration [135.90908995927194]
Image Restoration (IR) aims to recover high-quality images from degraded inputs affected by various corruptions such as noise, blur, haze, rain, and low-light conditions. We present MIRAGE, a unified framework for all-in-one IR that explicitly decomposes the input feature space into three semantically aligned parallel branches. This modular decomposition significantly improves generalization and efficiency across diverse degradations.
arXiv Detail & Related papers (2025-05-24T12:52:10Z)
- ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts [82.52042409680267]
Current image fusion methods struggle to address the composite degradations encountered in real-world imaging scenarios. We propose a controllable image fusion framework with language-vision prompts, termed ControlFusion. In experiments, ControlFusion outperforms SOTA fusion methods in fusion quality and degradation handling.
arXiv Detail & Related papers (2025-03-30T08:18:53Z)
- Content-decoupled Contrastive Learning-based Implicit Degradation Modeling for Blind Image Super-Resolution [33.16889233975723]
Implicit degradation modeling-based blind super-resolution (SR) has attracted increasing attention in the community. We propose a new Content-decoupled Contrastive Learning-based blind image super-resolution (CdCL) framework.
arXiv Detail & Related papers (2024-08-10T04:51:43Z)
- DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution [19.33582308829547]
This paper proposes to leverage degradation-aligned language prompt for accurate, fine-grained, and high-fidelity image restoration.
The proposed method achieves a new state-of-the-art perceptual quality level.
arXiv Detail & Related papers (2024-06-24T09:30:36Z)
- Suppressing Uncertainties in Degradation Estimation for Blind Super-Resolution [31.89605287039615]
The problem of blind image super-resolution aims to recover high-resolution (HR) images from low-resolution (LR) images with unknown degradation modes.
Most existing methods model the image degradation process using blur kernels.
We propose an Uncertainty-based degradation representation for blind Super-Resolution framework.
arXiv Detail & Related papers (2024-06-24T08:58:43Z)
- Towards Realistic Data Generation for Real-World Super-Resolution [58.99206459754721]
RealDGen is an unsupervised learning data generation framework designed for real-world super-resolution. We develop content and degradation extraction strategies, which are integrated into a novel content-degradation decoupled diffusion model. Experiments demonstrate that RealDGen excels in generating large-scale, high-quality paired data that mirrors real-world degradations.
arXiv Detail & Related papers (2024-06-11T13:34:57Z)
- DeeDSR: Towards Real-World Image Super-Resolution via Degradation-Aware Stable Diffusion [27.52552274944687]
We introduce a novel two-stage, degradation-aware framework that enhances the diffusion model's ability to recognize content and degradation in low-resolution images.
In the first stage, we employ unsupervised contrastive learning to obtain representations of image degradations.
In the second stage, we integrate a degradation-aware module into a simplified ControlNet, enabling flexible adaptation to various degradations.
arXiv Detail & Related papers (2024-03-31T12:07:04Z)
- Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution [81.74583887661794]
We build a new real-world super-resolution benchmark with both integer and non-integer scaling factors.
We propose a Dual-level Deformable Implicit Representation (DDIR) to solve real-world scale arbitrary super-resolution.
Our trained model achieves state-of-the-art performance on the RealArbiSR and RealSR benchmarks for real-world scale arbitrary super-resolution.
arXiv Detail & Related papers (2024-03-16T13:44:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.