Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model
- URL: http://arxiv.org/abs/2403.11157v1
- Date: Sun, 17 Mar 2024 09:41:20 GMT
- Title: Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model
- Authors: Dian Zheng, Xiao-Ming Wu, Shuzhou Yang, Jian Zhang, Jian-Fang Hu, Wei-Shi Zheng
- Abstract summary: We propose an advanced selective hourglass mapping strategy based on diffusion model, DiffUIR.
We achieve state-of-the-art performance on five image restoration tasks, 22 benchmarks in the universal setting and zero-shot generalization setting.
- Score: 36.57703763466984
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Universal image restoration is a practical and promising computer vision task for real-world applications. The main challenge of this task is handling different degradation distributions at once. Existing methods mainly use task-specific conditions (e.g., prompts) to guide the model to learn different distributions separately, a strategy named multi-partite mapping. However, this strategy is not suitable for universal model learning, as it ignores the information shared between tasks. In this work, we propose an advanced selective hourglass mapping strategy based on the diffusion model, termed DiffUIR. Two novel considerations make DiffUIR non-trivial. First, we equip the model with strong condition guidance to obtain an accurate generation direction for the diffusion model (selective). More importantly, DiffUIR integrates a flexible shared distribution term (SDT) into the diffusion algorithm, which gradually maps different distributions into a shared one. In the reverse process, by combining SDT with strong condition guidance, DiffUIR iteratively guides the shared distribution to the task-specific distribution with high image quality (hourglass). By modifying only the mapping strategy, we achieve state-of-the-art performance on five image restoration tasks and 22 benchmarks in the universal and zero-shot generalization settings. Even a lightweight model (only 0.89M) achieves outstanding performance. The source code and pre-trained models are available at https://github.com/iSEE-Laboratory/DiffUIR
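The idea of mapping different degradation distributions into a shared one can be illustrated with a toy sketch. The code below is not the DiffUIR implementation: the function name `forward_with_sdt`, the linear schedules, the terminal weight `delta_T`, and the 1-D "images" are all assumptions made for illustration, loosely following a residual-style forward process with an extra term that subtracts a growing fraction of the degraded input. The linked repository holds the actual formulation.

```python
import random


def forward_with_sdt(clean, degraded, t, T, delta_T=0.9, noise_scale=1.0, rng=None):
    """Toy forward step (hypothetical form, not the paper's exact equation):

        x_t = x_0 + a_t * (y - x_0) - d_t * y + b_t * eps

    where y is the degraded input and d_t * y plays the role of a
    shared distribution term that pulls all tasks toward one distribution.
    """
    rng = rng or random.Random(0)
    a = t / T                # residual weight, grows from 0 to 1 (assumed linear)
    d = delta_T * t / T      # shared-term weight, grows from 0 to delta_T (assumed linear)
    b = noise_scale * t / T  # noise weight (assumed linear)
    out = []
    for x0, y in zip(clean, degraded):
        eps = rng.gauss(0.0, 1.0)
        out.append(x0 + a * (y - x0) - d * y + b * eps)
    return out


# Two hypothetical degradations of the same clean signal.
clean = [0.2, 0.5, 0.8]
rainy = [0.6, 0.9, 1.0]
dark = [0.05, 0.1, 0.2]

T = 10
# With noise switched off, the terminal state is (1 - delta_T) * degraded,
# so the gap between the two tasks shrinks by that same factor.
end_rainy = forward_with_sdt(clean, rainy, T, T, noise_scale=0.0)
end_dark = forward_with_sdt(clean, dark, T, T, noise_scale=0.0)

gap_before = max(abs(r - k) for r, k in zip(rainy, dark))
gap_after = max(abs(r - k) for r, k in zip(end_rainy, end_dark))
print("gap between tasks before:", gap_before)
print("gap between tasks after: ", gap_after)
```

In this sketch the two degraded inputs end up much closer together at the final step than they started, which is the "narrow waist" of the hourglass; a reverse process conditioned on the degraded input would then widen the mapping back out to each task-specific clean distribution.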
Related papers
- OneDiff: A Generalist Model for Image Difference Captioning [5.71214984158106]
Image Difference Captioning (IDC) is crucial for accurately describing variations between closely related images.
OneDiff is a novel generalist approach that utilizes a robust vision-language model architecture.
OneDiff consistently outperforms existing state-of-the-art models in accuracy and adaptability.
arXiv Detail & Related papers (2024-07-08T06:14:37Z)
- FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage different and relatively small-sized, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z)
- ZoDi: Zero-Shot Domain Adaptation with Diffusion-Based Image Transfer [13.956618446530559]
This paper proposes a zero-shot domain adaptation method based on diffusion models, called ZoDi.
First, we utilize an off-the-shelf diffusion model to synthesize target-like images by transferring the domain of source images to the target domain.
Secondly, we train the model using both source images and synthesized images with the original representations to learn domain-robust representations.
arXiv Detail & Related papers (2024-03-20T14:58:09Z)
- Denoising Diffusion Bridge Models [54.87947768074036]
Diffusion models are powerful generative models that map noise to data using stochastic processes.
For many applications such as image editing, the model input comes from a distribution that is not random noise.
In our work, we propose Denoising Diffusion Bridge Models (DDBMs).
arXiv Detail & Related papers (2023-09-29T03:24:24Z)
- Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification.
Our generative approach to classification attains strong results on a variety of benchmarks.
Our results are a step toward using generative over discriminative models for downstream tasks.
arXiv Detail & Related papers (2023-03-28T17:59:56Z)
- One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale [36.590918776922905]
This paper proposes a unified diffusion framework (dubbed UniDiffuser) to fit all distributions relevant to a set of multi-modal data in one model.
Inspired by the unified view, UniDiffuser learns all distributions simultaneously with a minimal modification to the original diffusion model.
arXiv Detail & Related papers (2023-03-12T03:38:39Z)
- Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion Models [54.1843419649895]
We propose a solution based on denoising diffusion probabilistic models (DDPMs).
Our motivation for choosing diffusion models over other generative models comes from the flexible internal structure of diffusion models.
Our method can unite multiple diffusion models trained on multiple sub-tasks and conquer the combined task.
arXiv Detail & Related papers (2022-12-01T18:59:55Z)
- Image Generation with Multimodal Priors using Denoising Diffusion Probabilistic Models [54.1843419649895]
A major challenge in using generative models to accomplish this task is the lack of paired data containing all modalities and corresponding outputs.
We propose a solution based on denoising diffusion probabilistic synthesis models to generate images under multimodal priors.
arXiv Detail & Related papers (2022-06-10T12:23:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.