TSFormer: A Robust Framework for Efficient UHD Image Restoration
- URL: http://arxiv.org/abs/2411.10951v2
- Date: Tue, 19 Nov 2024 23:09:25 GMT
- Title: TSFormer: A Robust Framework for Efficient UHD Image Restoration
- Authors: Xin Su, Chen Wu, Zhuoran Zheng,
- Abstract summary: TSFormer is an all-in-one framework that integrates textbfTrusted learning with textbfSparsification.
Our model can run a 4K image in real time (40fps) with 3.38 M parameters.
- Score: 7.487270862599671
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ultra-high-definition (UHD) image restoration is vital for applications demanding exceptional visual fidelity, yet existing methods often face a trade-off between restoration quality and efficiency, limiting their practical deployment. In this paper, we propose TSFormer, an all-in-one framework that integrates \textbf{T}rusted learning with \textbf{S}parsification to boost both generalization capability and computational efficiency in UHD image restoration. The key is that only a small amount of token movement is allowed within the model. To efficiently filter tokens, we use Min-$p$ with random matrix theory to quantify the uncertainty of tokens, thereby improving the robustness of the model. Our model can run a 4K image in real time (40fps) with 3.38 M parameters. Extensive experiments demonstrate that TSFormer achieves state-of-the-art restoration quality while enhancing generalization and reducing computational demands. In addition, our token filtering method can be applied to other image restoration models to effectively accelerate inference and maintain performance.
Related papers
- UHD Image Dehazing via anDehazeFormer with Atmospheric-aware KV Cache [22.67146255766633]
We propose an efficient visual transformer framework for ultra-high-definition (UHD) image dehazing.<n>The proposed architecture improves the training convergence speed by textbf5 $times$ while reducing memory overhead.<n>Our approach maintains state-of-the-art dehazing quality while significantly improving computational efficiency for 4K/8K image restoration tasks.
arXiv Detail & Related papers (2025-05-20T07:04:34Z) - Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion [28.61304513668606]
ResULIC is a residual-guided ultra lowrate image compression system.<n>It incorporates residual signals into both semantic retrieval and the diffusion-based generation process.<n>It achieves superior objective and subjective performance compared to state-of-the-art diffusion-based methods.
arXiv Detail & Related papers (2025-05-13T06:51:23Z) - ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration [75.0053551643052]
We introduce ZipIR, a novel framework that enhances efficiency, scalability, and long-range modeling for high-res image restoration.
ZipIR employs a highly compressed latent representation that compresses image 32x, effectively reducing the number of spatial tokens.
ZipIR surpasses existing diffusion-based methods, offering unmatched speed and quality in restoring high-resolution images from severely degraded inputs.
arXiv Detail & Related papers (2025-04-11T14:49:52Z) - MambaIC: State Space Models for High-Performance Learned Image Compression [53.991726013454695]
A high-performance image compression algorithm is crucial for real-time information transmission across numerous fields.
Inspired by the effectiveness of state space models (SSMs) in capturing long-range dependencies, we leverage SSMs to address computational inefficiency in existing methods.
We propose an enhanced image compression approach through refined context modeling, which we term MambaIC.
arXiv Detail & Related papers (2025-03-16T11:32:34Z) - Striving for Faster and Better: A One-Layer Architecture with Auto Re-parameterization for Low-Light Image Enhancement [50.93686436282772]
We aim to delve into the limits of image enhancers both from visual quality and computational efficiency.
By rethinking the task demands, we build an explicit connection, i.e., visual quality and computational efficiency are corresponding to model learning and structure design.
Ultimately, this achieves efficient low-light image enhancement using only a single convolutional layer, while maintaining excellent visual quality.
arXiv Detail & Related papers (2025-02-27T08:20:03Z) - Directing Mamba to Complex Textures: An Efficient Texture-Aware State Space Model for Image Restoration [75.51789992466183]
TAMambaIR simultaneously perceives image textures achieves and a trade-off between performance and efficiency.
Extensive experiments on benchmarks for image super-resolution, deraining, and low-light image enhancement demonstrate that TAMambaIR achieves state-of-the-art performance with significantly improved efficiency.
arXiv Detail & Related papers (2025-01-27T23:53:49Z) - Assessing UHD Image Quality from Aesthetics, Distortions, and Saliency [51.36674160287799]
We design a multi-branch deep neural network (DNN) to assess the quality of UHD images from three perspectives.
aesthetic features are extracted from low-resolution images downsampled from the UHD ones.
Technical distortions are measured using a fragment image composed of mini-patches cropped from UHD images.
The salient content of UHD images is detected and cropped to extract quality-aware features from the salient regions.
arXiv Detail & Related papers (2024-09-01T15:26:11Z) - Review Learning: Advancing All-in-One Ultra-High-Definition Image Restoration Training Method [7.487270862599671]
We propose a new training paradigm for general image restoration models, which we name bfReview Learning.
This approach begins with sequential training of an image restoration model on several degraded datasets, combined with a review mechanism.
We design a lightweight all-purpose image restoration network that can efficiently reason about degraded images with 4K resolution on a single consumer-grade GPU.
arXiv Detail & Related papers (2024-08-13T08:08:45Z) - Efficient Degradation-aware Any Image Restoration [83.92870105933679]
We propose textitDaAIR, an efficient All-in-One image restorer employing a Degradation-aware Learner (DaLe) in the low-rank regime.
By dynamically allocating model capacity to input degradations, we realize an efficient restorer integrating holistic and specific learning.
arXiv Detail & Related papers (2024-05-24T11:53:27Z) - Image Inpainting via Tractable Steering of Diffusion Models [54.13818673257381]
This paper proposes to exploit the ability of Tractable Probabilistic Models (TPMs) to exactly and efficiently compute the constrained posterior.
Specifically, this paper adopts a class of expressive TPMs termed Probabilistic Circuits (PCs)
We show that our approach can consistently improve the overall quality and semantic coherence of inpainted images with only 10% additional computational overhead.
arXiv Detail & Related papers (2023-11-28T21:14:02Z) - HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691]
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-09-11T05:17:55Z) - MOFA: A Model Simplification Roadmap for Image Restoration on Mobile
Devices [17.54747506334433]
We propose a roadmap that can be applied to further accelerate image restoration models prior to deployment.
Our approach decreases runtime by up to 13% and reduces the number of parameters by up to 23%, while increasing PSNR and SSIM.
arXiv Detail & Related papers (2023-08-24T01:29:15Z) - Refusion: Enabling Large-Size Realistic Image Restoration with
Latent-Space Diffusion Models [9.245782611878752]
We enhance the diffusion model in several aspects such as network architecture, noise level, denoising steps, training image size, and perceptual/scheduler scores.
We also propose a U-Net based latent diffusion model which performs diffusion in a low-resolution latent space while preserving high-resolution information from the original input for the decoding process.
These modifications allow us to apply diffusion models to various image restoration tasks, including real-world shadow removal, HR non-homogeneous dehazing, stereo super-resolution, and bokeh effect transformation.
arXiv Detail & Related papers (2023-04-17T14:06:49Z) - Attentive Fine-Grained Structured Sparsity for Image Restoration [63.35887911506264]
N:M structured pruning has appeared as one of the effective and practical pruning approaches for making the model efficient with the accuracy constraint.
We propose a novel pruning method that determines the pruning ratio for N:M structured sparsity at each layer.
arXiv Detail & Related papers (2022-04-26T12:44:55Z) - Identity Preserving Loss for Learned Image Compression [0.0]
This work proposes an end-to-end image compression framework that learns domain-specific features to achieve higher compression ratios.
We present a novel Identity Preserving Reconstruction (IPR) loss function which achieves Bits-Per-Pixel (BPP) values that are 38% and 42% of CRF-23 HEVC compression.
We show at-par recognition performance on the LFW dataset with an unseen recognition model while retaining a lower BPP value of 38% of CRF-23 HEVC compression.
arXiv Detail & Related papers (2022-04-22T18:01:01Z) - PhotoWCT$^2$: Compact Autoencoder for Photorealistic Style Transfer
Resulting from Blockwise Training and Skip Connections of High-Frequency
Residuals [35.64625206673256]
Photorealistic style transfer is an image editing task with the goal to modify an image to match the style of another image while ensuring the result looks like a real photograph.
A limitation of existing models is that they have many parameters, which in turn prevents their use for larger image resolutions and leads to slower run-times.
We introduce two mechanisms that enable our design of a more compact model that preserves state-of-art stylization strength and photorealism.
arXiv Detail & Related papers (2021-10-22T18:20:41Z) - The Power of Triply Complementary Priors for Image Compressive Sensing [89.14144796591685]
We propose a joint low-rank deep (LRD) image model, which contains a pair of complementaryly trip priors.
We then propose a novel hybrid plug-and-play framework based on the LRD model for image CS.
To make the optimization tractable, a simple yet effective algorithm is proposed to solve the proposed H-based image CS problem.
arXiv Detail & Related papers (2020-05-16T08:17:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.