Related papers: LinearSR: Unlocking Linear Attention for Stable and Efficient Image Super-Resolution

URL: http://arxiv.org/abs/2510.08771v2
Date: Thu, 30 Oct 2025 14:46:21 GMT
Title: LinearSR: Unlocking Linear Attention for Stable and Efficient Image Super-Resolution
Authors: Xiaohui Li, Shaobin Zhuang, Shuo Cao, Yang Yang, Yuandong Pu, Qi Qin, Siqi Luo, Bin Fu, Yihao Liu,
Abstract summary: Generative models for Image Super-Resolution (SR) are increasingly powerful, yet their reliance on self-attention's quadratic complexity (O(N^2)) creates a major computational bottleneck. Linear Attention offers an O(N) solution, but its promise for photorealistic SR has remained largely untapped. This paper introduces LinearSR, a holistic framework that, for the first time, systematically overcomes these critical hurdles.
Score: 24.44080642253128
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative models for Image Super-Resolution (SR) are increasingly powerful, yet their reliance on self-attention's quadratic complexity (O(N^2)) creates a major computational bottleneck. Linear Attention offers an O(N) solution, but its promise for photorealistic SR has remained largely untapped, historically hindered by a cascade of interrelated and previously unsolved challenges. This paper introduces LinearSR, a holistic framework that, for the first time, systematically overcomes these critical hurdles. Specifically, we resolve a fundamental training instability that causes catastrophic model divergence using our novel "knee point"-based Early-Stopping Guided Fine-tuning (ESGF) strategy. Furthermore, we mitigate the classic perception-distortion trade-off with a dedicated SNR-based Mixture of Experts (MoE) architecture. Finally, we establish an effective and lightweight guidance paradigm, TAG, derived from our "precision-over-volume" principle. Our resulting LinearSR model simultaneously delivers state-of-the-art perceptual quality with exceptional efficiency. Its core diffusion forward pass (1-NFE) achieves SOTA-level speed, while its overall multi-step inference time remains highly competitive. This work provides the first robust methodology for applying Linear Attention in the photorealistic SR domain, establishing a foundational paradigm for future research in efficient generative super-resolution.

Related papers

Joint Geometric and Trajectory Consistency Learning for One-Step Real-World Super-Resolution [14.52346301984322]
Diffusion-based Real-World Image Super-Resolution (Real-ISR) achieves impressive perceptual quality but suffers from high computational costs due to iterative sampling. We propose GTASR (Geometric Trajectory Alignment Super-Resolution), a simple yet effective consistency training paradigm for Real-ISR.
arXiv Detail & Related papers (2026-02-27T18:13:31Z)
InfScene-SR: Spatially Continuous Inference for Arbitrary-Size Image Super-Resolution [3.6762434952581713]
InfScene-SR is a framework enabling spatially continuous super-resolution for large, arbitrary scenes. We adapt the iterative refinement process of diffusion models with a novel guided and variance-corrected fusion mechanism.
arXiv Detail & Related papers (2026-02-23T11:34:59Z)
OSDEnhancer: Taming Real-World Space-Time Video Super-Resolution with One-Step Diffusion [64.10689934231165]
Diffusion models (DMs) have demonstrated exceptional success in video super-resolution (VSR). Space-time video super-resolution (STVSR) requires not only recovering realistic visual content from low to high resolution but also improving the frame rate with coherent dynamics. We propose OSDEnhancer, a framework that represents the first method to initialize real-world STVSR through an efficient one-step diffusion process. Experiments demonstrate that the proposed method achieves state-of-the-art performance while maintaining superior capability in real-world scenarios.
arXiv Detail & Related papers (2026-01-28T06:59:55Z)
Dual-domain Adaptation Networks for Realistic Image Super-resolution [81.34345637776408]
Realistic image super-resolution (SR) focuses on transforming real-world low-resolution (LR) images into high-resolution (HR) ones. Current methods struggle with limited real-world LR-HR data, impacting the learning of basic image features. We introduce a novel approach, which is able to efficiently adapt pre-trained image SR models from simulated to real-world datasets.
arXiv Detail & Related papers (2025-11-21T12:57:23Z)
Consistency Trajectory Matching for One-Step Generative Super-Resolution [19.08324232157866]
Current diffusion-based super-resolution approaches achieve commendable performance at the cost of high inference overhead. We propose Consistency Trajectory Matching for Super-Resolution (CTMSR), a distillation-free strategy that is able to generate photo-realistic SR results in one step. We show that the proposed methods can attain comparable or even superior capabilities on both synthetic and real datasets.
arXiv Detail & Related papers (2025-03-26T09:20:42Z)
Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors [75.24313405671433]
Diffusion-based image super-resolution (SR) methods have achieved remarkable success by leveraging large pre-trained text-to-image diffusion models as priors. We introduce a novel one-step SR model, which significantly addresses the efficiency issue of diffusion-based SR methods. Unlike existing fine-tuning strategies, we designed a degradation-guided Low-Rank Adaptation (LoRA) module specifically for SR.
arXiv Detail & Related papers (2024-09-25T16:15:21Z)
RBSR: Efficient and Flexible Recurrent Network for Burst Super-Resolution [57.98314517861539]
Burst super-resolution (BurstSR) aims at reconstructing a high-resolution (HR) image from a sequence of low-resolution (LR) and noisy images. In this paper, we suggest fusing cues frame-by-frame with an efficient and flexible recurrent network.
arXiv Detail & Related papers (2023-06-30T12:14:13Z)
Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures. This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead. We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
DeepRLS: A Recurrent Network Architecture with Least Squares Implicit Layers for Non-blind Image Deconvolution [15.986942312624]
We study the problem of non-blind image deconvolution. We propose a novel recurrent network architecture that leads to very competitive restoration results of high image quality.
arXiv Detail & Related papers (2021-12-10T13:16:51Z)
Characteristic Regularisation for Super-Resolving Face Images [81.84939112201377]
Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery. Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data. This renders the model overstretched with two tasks: consistifying the visual characteristics and enhancing the image resolution. We formulate a method that joins the advantages of conventional SR and UDA models.
arXiv Detail & Related papers (2019-12-30T16:27:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.