SSP-IR: Semantic and Structure Priors for Diffusion-based Realistic Image Restoration
- URL: http://arxiv.org/abs/2407.03635v2
- Date: Thu, 13 Feb 2025 10:18:34 GMT
- Title: SSP-IR: Semantic and Structure Priors for Diffusion-based Realistic Image Restoration
- Authors: Yuhong Zhang, Hengsheng Zhang, Zhengxue Cheng, Rong Xie, Li Song, Wenjun Zhang,
- Abstract summary: SSP-IR aims to fully exploit semantic and structure priors from low-quality images.<n>Our method outperforms other state-of-the-art methods overall on both synthetic and real-world datasets.
- Score: 20.873676111265656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Realistic image restoration is a crucial task in computer vision, and diffusion-based models for image restoration have garnered significant attention due to their ability to produce realistic results. Restoration can be seen as a controllable generation conditioning on priors. However, due to the severity of image degradation, existing diffusion-based restoration methods cannot fully exploit priors from low-quality images and still have many challenges in perceptual quality, semantic fidelity, and structure accuracy. Based on the challenges, we introduce a novel image restoration method, SSP-IR. Our approach aims to fully exploit semantic and structure priors from low-quality images to guide the diffusion model in generating semantically faithful and structurally accurate natural restoration results. Specifically, we integrate the visual comprehension capabilities of Multimodal Large Language Models (explicit) and the visual representations of the original image (implicit) to acquire accurate semantic prior. To extract degradation-independent structure prior, we introduce a Processor with RGB and FFT constraints to extract structure prior from the low-quality images, guiding the diffusion model and preventing the generation of unreasonable artifacts. Lastly, we employ a multi-level attention mechanism to integrate the acquired semantic and structure priors. The qualitative and quantitative results demonstrate that our method outperforms other state-of-the-art methods overall on both synthetic and real-world datasets. Our project page is https://zyhrainbow.github.io/projects/SSP-IR.
Related papers
- Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models [33.76031793753807]
We adapt the autoregressive multimodal model Lumina-mGPT into a robust Real-ISR model, namely PURE.
PURE Perceives and Understands the input low-quality image, then REstores its high-quality counterpart.
Experimental results demonstrate that PURE preserves image content while generating realistic details.
arXiv Detail & Related papers (2025-03-14T04:33:59Z) - MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling [64.09238330331195]
We propose a novel Multi-Modal Auto-Regressive (MMAR) probabilistic modeling framework.
Unlike discretization line of method, MMAR takes in continuous-valued image tokens to avoid information loss.
We show that MMAR demonstrates much more superior performance than other joint multi-modal models.
arXiv Detail & Related papers (2024-10-14T17:57:18Z) - Multi-Scale Representation Learning for Image Restoration with State-Space Model [13.622411683295686]
We propose a novel Multi-Scale State-Space Model-based (MS-Mamba) for efficient image restoration.
Our proposed method achieves new state-of-the-art performance while maintaining low computational complexity.
arXiv Detail & Related papers (2024-08-19T16:42:58Z) - Timestep-Aware Diffusion Model for Extreme Image Rescaling [47.89362819768323]
We propose a novel framework called Timestep-Aware Diffusion Model (TADM) for extreme image rescaling.
TADM performs rescaling operations in the latent space of a pre-trained autoencoder.
It effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model.
arXiv Detail & Related papers (2024-08-17T09:51:42Z) - Diff-Restorer: Unleashing Visual Prompts for Diffusion-based Universal Image Restoration [19.87693298262894]
We propose Diff-Restorer, a universal image restoration method based on the diffusion model.
We utilize the pre-trained visual language model to extract visual prompts from degraded images.
We also design a Degradation-aware Decoder to perform structural correction and convert the latent code to the pixel domain.
arXiv Detail & Related papers (2024-07-04T05:01:10Z) - DiffLoss: unleashing diffusion model as constraint for training image restoration network [4.8677910801584385]
We introduce a new perspective that implicitly leverages the diffusion model to assist the training of image restoration network, called DiffLoss.
By combining these two designs, the overall loss function is able to improve the perceptual quality of image restoration, resulting in visually pleasing and semantically enhanced outcomes.
arXiv Detail & Related papers (2024-06-27T09:33:24Z) - DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution [19.33582308829547]
This paper proposes to leverage degradation-aligned language prompt for accurate, fine-grained, and high-fidelity image restoration.
The proposed method achieves a new state-of-the-art perceptual quality level.
arXiv Detail & Related papers (2024-06-24T09:30:36Z) - Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images [13.089550724738436]
Diffusion models (DMs) have revolutionized image generation, producing high-quality images with applications spanning various fields.
Their ability to create hyper-realistic images poses significant challenges in distinguishing between real and synthetic content.
This work introduces a robust detection framework that integrates image and text features extracted by CLIP model with a Multilayer Perceptron (MLP) classifier.
arXiv Detail & Related papers (2024-04-19T14:30:41Z) - Diffusion Model Based Visual Compensation Guidance and Visual Difference
Analysis for No-Reference Image Quality Assessment [82.13830107682232]
We propose a novel class of state-of-the-art (SOTA) generative model, which exhibits the capability to model intricate relationships.
We devise a new diffusion restoration network that leverages the produced enhanced image and noise-containing images.
Two visual evaluation branches are designed to comprehensively analyze the obtained high-level feature information.
arXiv Detail & Related papers (2024-02-22T09:39:46Z) - JoReS-Diff: Joint Retinex and Semantic Priors in Diffusion Model for Low-light Image Enhancement [69.6035373784027]
Low-light image enhancement (LLIE) has achieved promising performance by employing conditional diffusion models.
Previous methods may neglect the importance of a sufficient formulation of task-specific condition strategy.
We propose JoReS-Diff, a novel approach that incorporates Retinex- and semantic-based priors as the additional pre-processing condition.
arXiv Detail & Related papers (2023-12-20T08:05:57Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Diffusion Models for Image Restoration and Enhancement -- A
Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods on image restoration.
We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR.
We propose five potential and challenging directions for the future research of diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z) - Physics-Driven Turbulence Image Restoration with Stochastic Refinement [80.79900297089176]
Image distortion by atmospheric turbulence is a critical problem in long-range optical imaging systems.
Fast and physics-grounded simulation tools have been introduced to help the deep-learning models adapt to real-world turbulence conditions.
This paper proposes the Physics-integrated Restoration Network (PiRN) to help the network to disentangle theity from the degradation and the underlying image.
arXiv Detail & Related papers (2023-07-20T05:49:21Z) - PC-GANs: Progressive Compensation Generative Adversarial Networks for
Pan-sharpening [50.943080184828524]
We propose a novel two-step model for pan-sharpening that sharpens the MS image through the progressive compensation of the spatial and spectral information.
The whole model is composed of triple GANs, and based on the specific architecture, a joint compensation loss function is designed to enable the triple GANs to be trained simultaneously.
arXiv Detail & Related papers (2022-07-29T03:09:21Z) - Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z) - Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z) - Towards Unsupervised Deep Image Enhancement with Generative Adversarial
Network [92.01145655155374]
We present an unsupervised image enhancement generative network (UEGAN)
It learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner.
Results show that the proposed model effectively improves the aesthetic quality of images.
arXiv Detail & Related papers (2020-12-30T03:22:46Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration task.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.