PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
- URL: http://arxiv.org/abs/2510.03012v1
- Date: Fri, 03 Oct 2025 13:56:18 GMT
- Title: PocketSR: The Super-Resolution Expert in Your Pocket Mobiles
- Authors: Haoze Sun, Linfeng Jiang, Fan Li, Renjing Pei, Zhixin Wang, Yong Guo, Jiaqi Xu, Haoyu Chen, Jin Han, Fenglong Song, Yujiu Yang, Wenbo Li
- Abstract summary: Real-world image super-resolution (RealSR) aims to enhance the visual quality of in-the-wild images, such as those captured by mobile phones. Existing methods leveraging large generative models demonstrate impressive results, but the high computational cost and latency make them impractical for edge deployment. We introduce PocketSR, an ultra-lightweight, single-step model that brings generative modeling capabilities to RealSR while maintaining high fidelity.
- Score: 69.26751136689533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-world image super-resolution (RealSR) aims to enhance the visual quality of in-the-wild images, such as those captured by mobile phones. While existing methods leveraging large generative models demonstrate impressive results, the high computational cost and latency make them impractical for edge deployment. In this paper, we introduce PocketSR, an ultra-lightweight, single-step model that brings generative modeling capabilities to RealSR while maintaining high fidelity. To achieve this, we design LiteED, a highly efficient alternative to the original computationally intensive VAE in SD, reducing parameters by 97.5% while preserving high-quality encoding and decoding. Additionally, we propose online annealing pruning for the U-Net, which progressively shifts generative priors from heavy modules to lightweight counterparts, ensuring effective knowledge transfer and further optimizing efficiency. To mitigate the loss of prior knowledge during pruning, we incorporate a multi-layer feature distillation loss. Through an in-depth analysis of each design component, we provide valuable insights for future research. PocketSR, with a model size of 146M parameters, processes 4K images in just 0.8 seconds, achieving a remarkable speedup over previous methods. Notably, it delivers performance on par with state-of-the-art single-step and even multi-step RealSR models, making it a highly practical solution for edge-device applications.
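The abstract describes online annealing pruning only at a high level: generative priors shift progressively from heavy modules to lightweight counterparts, with a multi-layer feature distillation loss to limit the loss of prior knowledge. A minimal sketch of that idea follows. The cosine schedule, the linear blending of outputs, the mean-squared-error form of the loss, and all function names (`anneal_weight`, `blended_output`, `feature_distillation_loss`) are assumptions made for illustration; they are not taken from the paper.

```python
import math


def anneal_weight(step, total_steps):
    """Weight on the heavy module: 1.0 at step 0, 0.0 at total_steps.

    Assumption: a cosine schedule; the abstract does not specify one.
    """
    t = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * (1.0 + math.cos(math.pi * t))


def blended_output(heavy_out, light_out, step, total_steps):
    """Mix the heavy and light module outputs with the annealed weight.

    Early in training the heavy module dominates; by the end only the
    light module contributes, so the heavy one can be dropped.
    """
    w = anneal_weight(step, total_steps)
    return [w * h + (1.0 - w) * l for h, l in zip(heavy_out, light_out)]


def feature_distillation_loss(teacher_feats, student_feats):
    """Mean squared error between teacher and student features,
    computed per layer and then averaged over layers.

    Assumption: plain MSE with equal layer weights.
    """
    per_layer = []
    for t_feat, s_feat in zip(teacher_feats, student_feats):
        sq_err = sum((a - b) ** 2 for a, b in zip(t_feat, s_feat))
        per_layer.append(sq_err / len(t_feat))
    return sum(per_layer) / len(per_layer)


if __name__ == "__main__":
    # Toy 1-D "features" standing in for real activation tensors.
    heavy, light = [2.0, 2.0], [4.0, 0.0]
    for step in (0, 50, 100):
        print(step, blended_output(heavy, light, step, 100))
    print(feature_distillation_loss([[1.0, 2.0], [0.0]], [[1.0, 4.0], [1.0]]))
```

In a real training loop the lists would be activation tensors and the blend would sit inside the U-Net forward pass; the sketch only illustrates how a heavy module's contribution can be annealed to zero while a distillation term keeps the pruned network close to the original.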
Related papers
- Dual-domain Adaptation Networks for Realistic Image Super-resolution [81.34345637776408]
Realistic image super-resolution (SR) focuses on transforming real-world low-resolution (LR) images into high-resolution (HR) ones. Current methods struggle with limited real-world LR-HR data, impacting the learning of basic image features. We introduce a novel approach, which is able to efficiently adapt pre-trained image SR models from simulated to real-world datasets.
arXiv Detail & Related papers (2025-11-21T12:57:23Z)
- TinySR: Pruning Diffusion for Real-World Image Super-Resolution [35.07163534857897]
We present TinySR, a compact yet effective diffusion model specifically designed for Real-ISR. TinySR significantly reduces computational cost and model size, achieving up to 5.68x speedup and 83% parameter reduction compared to its teacher TSD-SR.
arXiv Detail & Related papers (2025-08-24T16:17:33Z)
- MambaLiteSR: Image Super-Resolution with Low-Rank Mamba using Knowledge Distillation [0.5243460995467893]
MambaLiteSR is a novel lightweight image Super-Resolution (SR) model that utilizes the architecture of Vision Mamba. We show that MambaLiteSR achieves performance comparable to both the baseline and other edge models while using 15% fewer parameters. It also improves power consumption by up to 58% compared to state-of-the-art SR edge models, all while maintaining low energy use during training.
arXiv Detail & Related papers (2025-02-19T20:32:03Z)
- Low-Resource Video Super-Resolution using Memory, Wavelets, and Deformable Convolutions [3.018928786249079]
Video super-resolution (VSR) remains a formidable challenge in its adoption for deployment on resource-constrained edge devices. We propose a novel lightweight and parameter-efficient neural architecture for VSR that achieves state-of-the-art reconstruction accuracy with just 2.3 million parameters.
arXiv Detail & Related papers (2025-02-03T20:46:15Z)
- Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient [52.96232442322824]
Collaborative Decoding (CoDe) is a novel efficient decoding strategy tailored for the Visual Auto-Regressive (VAR) framework. CoDe capitalizes on two critical observations: the substantially reduced parameter demands at larger scales and the exclusive generation patterns across different scales. CoDe achieves a 1.7x speedup, slashes memory usage by around 50%, and preserves image quality with only a negligible FID increase from 1.95 to 1.98.
arXiv Detail & Related papers (2024-11-26T15:13:15Z)
- Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR).
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
- A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization [54.113083217869516]
In this work, we first explore the computational redundancy part of the network.
We then prune the redundancy blocks of the model and maintain the network performance.
Thirdly, we propose a global-regional interactive (GRI) attention to speed up the computationally intensive attention part.
arXiv Detail & Related papers (2023-12-24T15:37:47Z)
- Generative Adversarial Super-Resolution at the Edge with Knowledge Distillation [1.3764085113103222]
Single-Image Super-Resolution can support robotic tasks in environments where a reliable visual stream is required.
We propose an efficient Generative Adversarial Network model for real-time Super-Resolution, called EdgeSRGAN.
arXiv Detail & Related papers (2022-09-07T10:58:41Z)
- Hybrid Pixel-Unshuffled Network for Lightweight Image Super-Resolution [64.54162195322246]
Convolutional neural networks (CNNs) have achieved great success in image super-resolution (SR).
Most deep CNN-based SR models take massive computations to obtain high performance.
We propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task.
arXiv Detail & Related papers (2022-03-16T20:10:41Z)
- Extremely Lightweight Quantization Robust Real-Time Single-Image Super Resolution for Mobile Devices [0.0]
Single-Image Super Resolution (SISR) is a classical computer vision problem that has been studied for decades.
Recent work on SISR focuses on deep learning solutions and achieves state-of-the-art results.
We propose a hardware (Synaptics Dolphin NPU) aware, extremely lightweight, quantization-robust, real-time super resolution network (XLSR).
arXiv Detail & Related papers (2021-05-21T11:29:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.