MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration
- URL: http://arxiv.org/abs/2407.10833v1
- Date: Mon, 15 Jul 2024 15:43:27 GMT
- Title: MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration
- Authors: Yulin Ren, Xin Li, Bingchen Li, Xingrui Wang, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen
- Abstract summary: MoE-DiffIR is an innovative universal compressed image restoration (CIR) method with task-customized diffusion priors.
MoE-DiffIR develops the powerful mixture-of-experts (MoE) prompt module.
The degradation-aware routing mechanism is proposed to enable the flexible assignment of basic prompts.
- Score: 16.482022642533448
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present MoE-DiffIR, an innovative universal compressed image restoration (CIR) method with task-customized diffusion priors. It addresses two pivotal challenges in existing CIR methods: (i) a lack of adaptability and universality across image codecs, e.g., JPEG and WebP; and (ii) poor texture generation capability, particularly at low bitrates. Specifically, MoE-DiffIR develops a mixture-of-experts (MoE) prompt module, in which a set of basic prompts cooperate to excavate task-customized diffusion priors from Stable Diffusion (SD) for each compression task. Moreover, a degradation-aware routing mechanism is proposed to enable the flexible assignment of basic prompts. To activate and reuse the cross-modality generation prior of SD, we design a visual-to-text adapter for MoE-DiffIR, which adapts the embedding of low-quality images from the visual domain to the textual domain as textual guidance for SD, enabling more consistent and reasonable texture generation. We also construct a comprehensive benchmark dataset for universal CIR, covering 21 types of degradations from 7 popular traditional and learned codecs. Extensive experiments on universal CIR demonstrate the robustness and texture restoration capability of MoE-DiffIR. The project can be found at https://renyulin-f.github.io/MoE-DiffIR.github.io/.
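The abstract describes a routing mechanism that assigns basic prompts according to the degradation. As a rough illustration only, the sketch below shows one common way such routing can work: score every prompt against a degradation embedding, keep the top-k, and blend them with softmax gates. The abstract does not specify the routing rule, so top-k softmax gating is an assumption here, and all names (`route_prompts`, `router_weights`, `top_k`) are hypothetical rather than taken from the MoE-DiffIR code.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_prompts(degradation_feature, router_weights, prompts, top_k=2):
    """Select and blend basic prompts for one degradation embedding.

    Assumption: top-k softmax gating, a generic mixture-of-experts routing
    rule. It is not confirmed to be the rule used by MoE-DiffIR.
    """
    # One logit per basic prompt: dot product of its router row and the feature.
    logits = [
        sum(w * f for w, f in zip(row, degradation_feature))
        for row in router_weights
    ]
    # Indices of the top_k highest-scoring prompts, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Normalise only over the selected prompts so the gates sum to 1.
    gates = softmax([logits[i] for i in top])
    # Gate-weighted sum of the selected prompt vectors.
    dim = len(prompts[0])
    mixed = [sum(g * prompts[i][d] for g, i in zip(gates, top)) for d in range(dim)]
    return top, gates, mixed


if __name__ == "__main__":
    # Toy setup: three basic prompts of dimension 3, a 2-d degradation feature.
    router_weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    prompts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    top, gates, mixed = route_prompts([2.0, 0.0], router_weights, prompts)
    print("selected prompts:", top)
    print("gates:", gates)
    print("mixed prompt:", mixed)
```

In this toy setup, different degradation features select different subsets of the shared prompts, which is the sense in which a small prompt pool can serve many compression tasks.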
Related papers
- UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior [56.35236964617809]
Image restoration aims to recover content from inputs degraded by various factors, such as adverse weather, blur, and noise.
This paper introduces UniRestore, a unified image restoration model that bridges the gap between perceptual image restoration (PIR) and task-oriented image restoration (TIR).
We propose a Complementary Feature Restoration Module (CFRM) to reconstruct degraded encoder features and a Task Feature Adapter (TFA) module to facilitate adaptive feature fusion in the decoder.
arXiv Detail & Related papers (2025-01-22T08:06:48Z)
- Adversarial Diffusion Compression for Real-World Image Super-Resolution [16.496532580598007]
Real-world image super-resolution aims to reconstruct high-resolution images from low-resolution inputs degraded by complex processes.
One-step diffusion networks like OSEDiff and S3Diff alleviate this issue but still incur high computational costs.
This paper proposes a novel Real-ISR method, AdcSR, by distilling the one-step diffusion network OSEDiff into a streamlined diffusion-GAN model.
arXiv Detail & Related papers (2024-11-20T15:13:36Z)
- UCIP: A Universal Framework for Compressed Image Super-Resolution using Dynamic Prompt [28.67147892614428]
Compressed Image Super-resolution (CSR) aims to simultaneously super-resolve the compressed images and tackle the challenging hybrid distortions caused by compression.
We propose the first universal CSR framework, dubbed UCIP, with dynamic prompt learning.
Experiments have shown the consistent and excellent performance of our UCIP on universal CSR tasks.
arXiv Detail & Related papers (2024-07-18T02:36:39Z)
- PromptCIR: Blind Compressed Image Restoration with Prompt Learning [19.06110655450585]
We propose a prompt-learning-based compressed image restoration network, dubbed PromptCIR.
PromptCIR exploits prompts to encode compression information implicitly, where prompts interact with soft weights generated from image features.
PromptCIR wins first place in the NTIRE 2024 challenge of blind compressed image enhancement track.
arXiv Detail & Related papers (2024-04-26T14:20:31Z)
- Unified-Width Adaptive Dynamic Network for All-In-One Image Restoration [50.81374327480445]
We introduce a novel concept positing that intricate image degradation can be represented in terms of elementary degradation.
We propose the Unified-Width Adaptive Dynamic Network (U-WADN), consisting of two pivotal components: a Width Adaptive Backbone (WAB) and a Width Selector (WS).
The proposed U-WADN achieves better performance while reducing FLOPs by up to 32.3% and providing approximately 15.7% real-time acceleration.
arXiv Detail & Related papers (2024-01-24T04:25:12Z)
- Multimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image Restoration [58.11518043688793]
MPerceiver is a novel approach to enhance adaptiveness, generalizability and fidelity for all-in-one image restoration.
MPerceiver is trained on 9 tasks for all-in-one IR and outperforms state-of-the-art task-specific methods across most tasks.
arXiv Detail & Related papers (2023-12-05T17:47:11Z)
- DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior [70.46245698746874]
We present DiffBIR, a general restoration pipeline that could handle different blind image restoration tasks.
DiffBIR decouples the blind image restoration problem into two stages: 1) degradation removal: removing image-independent content; 2) information regeneration: generating the lost image content.
In the first stage, we use restoration modules to remove degradations and obtain high-fidelity restored results.
For the second stage, we propose IRControlNet that leverages the generative ability of latent diffusion models to generate realistic details.
arXiv Detail & Related papers (2023-08-29T07:11:52Z)
- You Can Mask More For Extremely Low-Bitrate Image Compression [80.7692466922499]
Learned image compression (LIC) methods have experienced significant progress during recent years.
LIC methods fail to explicitly explore the image structure and texture components crucial for image compression.
We present DA-Mask that samples visible patches based on the structure and texture of original images.
We propose a simple yet effective masked compression model (MCM), the first framework that unifies LIC and masked image modeling (MIM) end-to-end for extremely low-bitrate compression.
arXiv Detail & Related papers (2023-06-27T15:36:22Z)
- The Devil Is in the Details: Window-based Attention for Image Compression [58.1577742463617]
Most existing learned image compression models are based on Convolutional Neural Networks (CNNs).
In this paper, we study the effects of multiple kinds of attention mechanisms for local features learning, then introduce a more straightforward yet effective window-based local attention block.
The proposed window-based attention is very flexible and can work as a plug-and-play component to enhance CNN and Transformer models.
arXiv Detail & Related papers (2022-03-16T07:55:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.