Dual Prompting Image Restoration with Diffusion Transformers
- URL: http://arxiv.org/abs/2504.17825v1
- Date: Thu, 24 Apr 2025 02:34:44 GMT
- Title: Dual Prompting Image Restoration with Diffusion Transformers
- Authors: Dehong Kong, Fan Li, Zhixin Wang, Jiaqi Xu, Renjing Pei, Wenbo Li, WenQi Ren,
- Abstract summary: DPIR (Dual Prompting Image Restoration) is a novel image restoration method that effectivly extracts conditional information of low-quality images from multiple perspectives.<n>The extracted global-local visual prompts as extra conditional control, alongside textual prompts to form dual prompts, greatly enhance the quality of the restoration.
- Score: 45.159373436771
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent state-of-the-art image restoration methods mostly adopt latent diffusion models with U-Net backbones, yet still facing challenges in achieving high-quality restoration due to their limited capabilities. Diffusion transformers (DiTs), like SD3, are emerging as a promising alternative because of their better quality with scalability. In this paper, we introduce DPIR (Dual Prompting Image Restoration), a novel image restoration method that effectivly extracts conditional information of low-quality images from multiple perspectives. Specifically, DPIR consits of two branches: a low-quality image conditioning branch and a dual prompting control branch. The first branch utilizes a lightweight module to incorporate image priors into the DiT with high efficiency. More importantly, we believe that in image restoration, textual description alone cannot fully capture its rich visual characteristics. Therefore, a dual prompting module is designed to provide DiT with additional visual cues, capturing both global context and local appearance. The extracted global-local visual prompts as extra conditional control, alongside textual prompts to form dual prompts, greatly enhance the quality of the restoration. Extensive experimental results demonstrate that DPIR delivers superior image restoration performance.
Related papers
- UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior [56.35236964617809]
Image restoration aims to recover content from inputs degraded by various factors, such as adverse weather, blur, and noise.
This paper introduces UniRestore, a unified image restoration model that bridges the gap between PIR and TIR.
We propose a Complementary Feature Restoration Module (CFRM) to reconstruct degraded encoder features and a Task Feature Adapter (TFA) module to facilitate adaptive feature fusion in the decoder.
arXiv Detail & Related papers (2025-01-22T08:06:48Z) - Improving Image Restoration through Removing Degradations in Textual
Representations [60.79045963573341]
We introduce a new perspective for improving image restoration by removing degradation in the textual representations of a degraded image.
To address the cross-modal assistance, we propose to map the degraded images into textual representations for removing the degradations.
In particular, We ingeniously embed an image-to-text mapper and text restoration module into CLIP-equipped text-to-image models to generate the guidance.
arXiv Detail & Related papers (2023-12-28T19:18:17Z) - SPIRE: Semantic Prompt-Driven Image Restoration [66.26165625929747]
We develop SPIRE, a Semantic and restoration Prompt-driven Image Restoration framework.
Our approach is the first framework that supports fine-level instruction through language-based quantitative specification of the restoration strength.
Our experiments demonstrate the superior restoration performance of SPIRE compared to the state of the arts.
arXiv Detail & Related papers (2023-12-18T17:02:30Z) - Prompt-In-Prompt Learning for Universal Image Restoration [38.81186629753392]
We propose novel Prompt-In-Prompt learning for universal image restoration, named PIP.
We present two novel prompts, a degradation-aware prompt to encode high-level degradation knowledge and a basic restoration prompt to provide essential low-level information.
By doing so, the resultant PIP works as a plug-and-play module to enhance existing restoration models for universal image restoration.
arXiv Detail & Related papers (2023-12-08T13:36:01Z) - Multimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image Restoration [58.11518043688793]
MPerceiver is a novel approach to enhance adaptiveness, generalizability and fidelity for all-in-one image restoration.
MPerceiver is trained on 9 tasks for all-in-one IR and outperforms state-of-the-art task-specific methods across most tasks.
arXiv Detail & Related papers (2023-12-05T17:47:11Z) - DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior [70.46245698746874]
We present DiffBIR, a general restoration pipeline that could handle different blind image restoration tasks.
DiffBIR decouples blind image restoration problem into two stages: 1) degradation removal: removing image-independent content; 2) information regeneration: generating the lost image content.
In the first stage, we use restoration modules to remove degradations and obtain high-fidelity restored results.
For the second stage, we propose IRControlNet that leverages the generative ability of latent diffusion models to generate realistic details.
arXiv Detail & Related papers (2023-08-29T07:11:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.