Compressing Deep Image Super-resolution Models
- URL: http://arxiv.org/abs/2401.00523v2
- Date: Wed, 21 Feb 2024 20:25:53 GMT
- Title: Compressing Deep Image Super-resolution Models
- Authors: Yuxuan Jiang, Jakub Nawala, Fan Zhang, and David Bull
- Abstract summary: This work employs a three-stage workflow for compressing deep SR models which significantly reduces their memory requirement.
We have applied this approach to two popular image super-resolution networks, SwinIR and EDSR, to demonstrate its effectiveness.
The resulting compact models, SwinIRmini and EDSRmini, attain 89% and 96% reductions, respectively, in both model size and floating-point operations (FLOPs).
- Score: 2.895266689123347
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep learning techniques have been applied in the context of image
super-resolution (SR), achieving remarkable advances in terms of reconstruction
performance. Existing techniques typically employ highly complex model
structures which result in large model sizes and slow inference speeds. This
often leads to high energy consumption and restricts their adoption for
practical applications. To address this issue, this work employs a three-stage
workflow for compressing deep SR models which significantly reduces their
memory requirement. Restoration performance has been maintained through
teacher-student knowledge distillation using a newly designed distillation
loss. We have applied this approach to two popular image super-resolution
networks, SwinIR and EDSR, to demonstrate its effectiveness. The resulting
compact models, SwinIRmini and EDSRmini, attain an 89% and 96% reduction in
both model size and floating-point operations (FLOPs) respectively, compared to
their original versions. They also retain competitive super-resolution
performance compared to their original models and other commonly used SR
approaches. The source code and pre-trained models for these two lightweight SR
approaches are released at https://pikapi22.github.io/CDISM/.
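The abstract mentions teacher-student knowledge distillation and reports percentage reductions in model size and FLOPs. The sketch below illustrates both ideas in plain Python. It is not the paper's newly designed distillation loss, which the abstract does not specify. It assumes a generic weighted sum of two L1 terms, and the names `l1`, `distillation_loss`, `reduction_percent`, and the weight `alpha` are hypothetical, as are all numeric values.

```python
from typing import Sequence


def l1(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute error between two equally sized pixel sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def distillation_loss(
    student: Sequence[float],
    teacher: Sequence[float],
    target: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Generic distillation objective (an assumption, not the paper's loss).

    Blends the student's error against the ground truth with its
    deviation from the teacher's output; alpha weights the teacher term.
    """
    return (1.0 - alpha) * l1(student, target) + alpha * l1(student, teacher)


def reduction_percent(original: float, compact: float) -> float:
    """Relative reduction of a cost (parameters or FLOPs), in percent."""
    return 100.0 * (original - compact) / original


if __name__ == "__main__":
    # Hypothetical flattened outputs for a two-pixel image.
    student_out = [0.0, 1.0]
    teacher_out = [0.0, 0.5]
    ground_truth = [1.0, 1.0]
    print(distillation_loss(student_out, teacher_out, ground_truth))

    # Hypothetical costs: a compact model with 11 units versus 100 units.
    print(reduction_percent(100, 11))
```

In practice the same structure would operate on image tensors in a deep learning framework, with the teacher's weights frozen while only the student is updated.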
Related papers
- SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution [55.14432034345353]
We study key design principles for latter cascaded video super-resolution models, which are underexplored currently.
First, we propose two strategies to generate training pairs that better mimic the output characteristics of the base model, ensuring alignment between the VSR model and its upstream generator.
Second, we provide critical insights into VSR model behavior through systematic analysis of (1) timestep sampling strategies, (2) noise augmentation effects on low-resolution (LR) inputs.
arXiv Detail & Related papers (2025-06-24T17:57:26Z)
- Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution [19.22142805041799]
Convolutional neural networks (CNNs) have been widely used in efficient image super-resolution.
We propose Distillation-Supervised Convolutional Low-Rank Adaptation (DSCLoRA), which improves model performance without increasing architectural complexity or inference costs.
arXiv Detail & Related papers (2025-04-15T15:12:57Z)
- ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration [75.0053551643052]
We introduce ZipIR, a novel framework that enhances efficiency, scalability, and long-range modeling for high-res image restoration.
ZipIR employs a highly compressed latent representation that compresses image 32x, effectively reducing the number of spatial tokens.
ZipIR surpasses existing diffusion-based methods, offering unmatched speed and quality in restoring high-resolution images from severely degraded inputs.
arXiv Detail & Related papers (2025-04-11T14:49:52Z)
- TSD-SR: One-Step Diffusion with Target Score Distillation for Real-World Image Super-Resolution [25.994093587158808]
Pre-trained text-to-image diffusion models are increasingly applied to real-world image super-resolution (Real-ISR) tasks.
Given the iterative refinement nature of diffusion models, most existing approaches are computationally expensive.
We propose TSD-SR, a novel distillation framework specifically designed for real-world image super-resolution.
arXiv Detail & Related papers (2024-11-27T12:01:08Z)
- Adversarial Diffusion Compression for Real-World Image Super-Resolution [16.496532580598007]
Real-world image super-resolution aims to reconstruct high-resolution images from low-resolution inputs degraded by complex processes.
One-step diffusion networks like OSEDiff and S3Diff alleviate this issue but still incur high computational costs.
This paper proposes a novel Real-ISR method, AdcSR, by distilling the one-step diffusion network OSEDiff into a streamlined diffusion-GAN model.
arXiv Detail & Related papers (2024-11-20T15:13:36Z)
- Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors [75.24313405671433]
Diffusion-based image super-resolution (SR) methods have achieved remarkable success by leveraging large pre-trained text-to-image diffusion models as priors.
We introduce a novel one-step SR model, which significantly addresses the efficiency issue of diffusion-based SR methods.
Unlike existing fine-tuning strategies, we designed a degradation-guided Low-Rank Adaptation (LoRA) module specifically for SR.
arXiv Detail & Related papers (2024-09-25T16:15:21Z)
- One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z)
- Large Kernel Distillation Network for Efficient Single Image Super-Resolution [8.094254341695684]
Single-image super-resolution (SISR) has achieved remarkable performance in recent years.
Current state-of-the-art (SOTA) models still face problems such as high computational costs.
We propose the Large Kernel Distillation Network (LKDN) in this paper.
arXiv Detail & Related papers (2024-07-19T14:21:56Z)
- A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization [54.113083217869516]
In this work, we first explore the computational redundancy part of the network.
We then prune the redundancy blocks of the model and maintain the network performance.
Thirdly, we propose a global-regional interactive (GRI) attention to speed up the computationally intensive attention part.
arXiv Detail & Related papers (2023-12-24T15:37:47Z)
- Towards Lightweight Super-Resolution with Dual Regression Learning [58.98801753555746]
Deep neural networks have exhibited remarkable performance in image super-resolution (SR) tasks.
The SR problem is typically an ill-posed problem and existing methods would come with several limitations.
We propose a dual regression learning scheme to reduce the space of possible SR mappings.
arXiv Detail & Related papers (2022-07-16T12:46:10Z)
- Towards Practical Lipreading with Distilled and Efficient Models [57.41253104365274]
Lipreading has witnessed a lot of progress due to the resurgence of neural networks.
Recent works have placed emphasis on aspects such as improving performance by finding the optimal architecture or improving generalization.
There is still a significant gap between the current methodologies and the requirements for an effective deployment of lipreading in practical scenarios.
We propose a series of innovations that significantly bridge that gap: first, we raise the state-of-the-art performance by a wide margin on LRW and LRW-1000 to 88.5% and 46.6%, respectively using self-distillation.
arXiv Detail & Related papers (2020-07-13T16:56:27Z)
- Characteristic Regularisation for Super-Resolving Face Images [81.84939112201377]
Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery.
Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data.
This renders the model overstretched with two tasks: consistifying the visual characteristics and enhancing the image resolution.
We formulate a method that joins the advantages of conventional SR and UDA models.
arXiv Detail & Related papers (2019-12-30T16:27:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.