Bidirectional Reward-Guided Diffusion for Real-World Image Super-Resolution
- URL: http://arxiv.org/abs/2602.07069v1
- Date: Thu, 05 Feb 2026 19:21:45 GMT
- Title: Bidirectional Reward-Guided Diffusion for Real-World Image Super-Resolution
- Authors: Zihao Fan, Xin Lu, Yidi Liu, Jie Huang, Dong Li, Xueyang Fu, Zheng-Jun Zha,
- Abstract summary: Diffusion-based super-resolution can synthesize rich details, but models trained on synthetic paired data often fail on real-world LR images.<n>We propose Bird-SR, a reward-guided diffusion framework that formulates super-resolution as trajectory-level preference optimization.<n>Experiments on real-world SR benchmarks demonstrate that Bird-SR consistently outperforms state-of-the-art methods in perceptual quality.
- Score: 79.35296000454694
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion-based super-resolution can synthesize rich details, but models trained on synthetic paired data often fail on real-world LR images due to distribution shifts. We propose Bird-SR, a bidirectional reward-guided diffusion framework that formulates super-resolution as trajectory-level preference optimization via reward feedback learning (ReFL), jointly leveraging synthetic LR-HR pairs and real-world LR images. For structural fidelity easily affected in ReFL, the model is directly optimized on synthetic pairs at early diffusion steps, which also facilitates structure preservation for real-world inputs under smaller distribution gap in structure levels. For perceptual enhancement, quality-guided rewards are applied at later sampling steps to both synthetic and real LR images. To mitigate reward hacking, the rewards for synthetic results are formulated in a relative advantage space bounded by their clean counterparts, while real-world optimization is regularized via a semantic alignment constraint. Furthermore, to balance structural and perceptual learning, we adopt a dynamic fidelity-perception weighting strategy that emphasizes structure preservation at early stages and progressively shifts focus toward perceptual optimization at later diffusion steps. Extensive experiments on real-world SR benchmarks demonstrate that Bird-SR consistently outperforms state-of-the-art methods in perceptual quality while preserving structural consistency, validating its effectiveness for real-world super-resolution.
Related papers
- AlignVAR: Towards Globally Consistent Visual Autoregression for Image Super-Resolution [16.90182090355781]
Visual autoregressive models offer stable training, non-iterative inference, and high-fidelity synthesis through next-scale prediction.<n>But their application remains underexplored and faces two critical challenges: locality-biased attention, and residual-only supervision.<n>We propose a globally consistent visual autoregressive framework tailored for image super-resolution.
arXiv Detail & Related papers (2026-02-28T10:39:06Z) - OSDEnhancer: Taming Real-World Space-Time Video Super-Resolution with One-Step Diffusion [64.10689934231165]
Video super-resolution models (DMs) have demonstrated exceptional success in video super-resolution (VSR)<n>Their potential for space-time video super-resolution (STVSR) necessitates recovering realistic visual content from low to high-resolution but also improving the frame rate with coherent dynamics.<n>We propose OSDEnhancer, a framework that represents the first method to initialize real-world STVSR through an efficient one-step diffusion process.<n> Experiments demonstrate that the proposed method achieves state-of-the-art performance while maintaining superior capability in real-world scenarios.
arXiv Detail & Related papers (2026-01-28T06:59:55Z) - Dual-domain Adaptation Networks for Realistic Image Super-resolution [81.34345637776408]
Realistic image super-resolution (SR) focuses on transforming real-world low-resolution (LR) images into high-resolution (HR) ones.<n>Current methods struggle with limited real-world LR-HR data, impacting the learning of basic image features.<n>We introduce a novel approach, which is able to efficiently adapt pre-trained image SR models from simulated to real-world datasets.
arXiv Detail & Related papers (2025-11-21T12:57:23Z) - Mixture of Ranks with Degradation-Aware Routing for One-Step Real-World Image Super-Resolution [76.66229730098759]
In real-world image super-resolution (Real-ISR), existing approaches mainly rely on fine-tuning pre-trained diffusion models.<n>We propose a Mixture-of-Ranks (MoR) architecture for single-step image super-resolution.<n>We introduce a fine-grained expert partitioning strategy that treats each rank in LoRA as an independent expert.
arXiv Detail & Related papers (2025-11-20T04:11:44Z) - Unsupervised Real-World Super-Resolution via Rectified Flow Degradation Modelling [16.485914675883972]
Unsupervised real-world super-resolution (SR) faces critical challenges due to the complex, unknown degradation distributions.<n>Existing methods struggle to generalize from synthetic low-resolution (LR) and high-resolution (HR) image pairs to real-world data.<n>We propose an unsupervised real-world SR method based on rectified flow to effectively capture and model real-world degradation.
arXiv Detail & Related papers (2025-08-10T07:27:28Z) - Unsupervised Image Super-Resolution Reconstruction Based on Real-World Degradation Patterns [4.977925450373957]
We propose a novel TripleGAN framework for training super-resolution reconstruction models.<n>The framework learns real-world degradation patterns from LR observations and synthesizes datasets with corresponding degradation characteristics.<n>Our method exhibits clear advantages in quantitative metrics while maintaining sharp reconstructions without over-smoothing artifacts.
arXiv Detail & Related papers (2025-06-20T14:24:48Z) - AdaptSR: Low-Rank Adaptation for Efficient and Scalable Real-World Super-Resolution [50.584551250242235]
AdaptSR is a low-rank adaptation framework that efficiently repurposes bi-cubic-trained SR models for real-world tasks.<n>Our experiments demonstrate that AdaptSR outperforms GAN and diffusion-based SR methods by up to 4 dB in PSNR and 2% in perceptual scores on real SR benchmarks.
arXiv Detail & Related papers (2025-03-10T18:03:18Z) - Co-learning Single-Step Diffusion Upsampler and Downsampler with Two Discriminators and Distillation [28.174638880324014]
Super-resolution (SR) aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts.<n>We propose a co-learning framework that jointly optimize a single-step diffusion-based upsampler and a learnable downsampler.
arXiv Detail & Related papers (2024-10-10T07:12:46Z) - ICF-SRSR: Invertible scale-Conditional Function for Self-Supervised
Real-world Single Image Super-Resolution [60.90817228730133]
Single image super-resolution (SISR) is a challenging problem that aims to up-sample a given low-resolution (LR) image to a high-resolution (HR) counterpart.
Recent approaches are trained on simulated LR images degraded by simplified down-sampling operators.
We propose a novel Invertible scale-Conditional Function (ICF) which can scale an input image and then restore the original input with different scale conditions.
arXiv Detail & Related papers (2023-07-24T12:42:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.