Related papers: Training-Free Large Model Priors for Multiple-in-One Image Restoration

URL: http://arxiv.org/abs/2407.13181v1
Date: Thu, 18 Jul 2024 05:40:32 GMT
Title: Training-Free Large Model Priors for Multiple-in-One Image Restoration
Authors: Xuanhua He, Lang Li, Yingying Wang, Hui Zheng, Ke Cao, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, Man Zhou
Abstract summary: We propose the Large Model Driven Image Restoration framework (LMDIR). Our architecture includes a query-based prompt encoder and a degradation-aware transformer block that injects global degradation knowledge. This design facilitates a single-stage training paradigm that addresses various degradations while supporting both automatic and user-guided restoration.
Score: 24.230376300759573
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Image restoration aims to reconstruct the latent clear images from their degraded versions. Despite notable achievements, existing methods predominantly focus on handling specific degradation types and thus require specialized models, impeding real-world applications in dynamic degradation scenarios. To address this issue, we propose the Large Model Driven Image Restoration framework (LMDIR), a novel multiple-in-one image restoration paradigm that leverages the generic priors from large multi-modal language models (MMLMs) and pretrained diffusion models. In detail, LMDIR integrates three key types of prior knowledge: 1) global degradation knowledge from MMLMs, 2) scene-aware contextual descriptions generated by MMLMs, and 3) fine-grained high-quality reference images synthesized by diffusion models guided by MMLM descriptions. Building on these priors, our architecture comprises a query-based prompt encoder, a degradation-aware transformer block that injects global degradation knowledge, a content-aware transformer block that incorporates the scene description, and a reference-based transformer block that incorporates fine-grained image priors. This design facilitates a single-stage training paradigm that addresses various degradations while supporting both automatic and user-guided restoration. Extensive experiments demonstrate that our method outperforms state-of-the-art competitors on multiple evaluation benchmarks.

Related papers

RobustGS: Unified Boosting of Feedforward 3D Gaussian Splatting under Low-Quality Conditions [67.48495052903534]
We propose a general and efficient multi-view feature enhancement module, RobustGS. It substantially improves the robustness of feedforward 3DGS methods under various adverse imaging conditions. The RobustGS module can be seamlessly integrated into existing pretrained pipelines in a plug-and-play manner.
arXiv Detail & Related papers (2025-08-05T04:50:29Z)
UniLDiff: Unlocking the Power of Diffusion Priors for All-in-One Image Restoration [16.493990086330985]
UniLDiff is a unified framework enhanced with degradation- and detail-aware mechanisms. We introduce a Degradation-Aware Feature Fusion (DAFF) to dynamically inject low-quality features into each denoising step. We also design a Detail-Aware Expert Module (DAEM) in the decoder to enhance texture and fine-structure recovery.
arXiv Detail & Related papers (2025-07-31T16:02:00Z)
DPMambaIR: All-in-One Image Restoration via Degradation-Aware Prompt State Space Model [36.979833523678614]
All-in-One image restoration aims to address multiple image degradation problems. Existing approaches rely on degradation-specific models or coarse-grained degradation prompts to guide image restoration. We propose DPMambaIR, a novel All-in-One image restoration framework.
arXiv Detail & Related papers (2025-04-24T16:46:32Z)
CoLLM: A Large Language Model for Composed Image Retrieval [76.29725148964368]
Composed Image Retrieval (CIR) is a complex task that aims to retrieve images based on a multimodal query. We present CoLLM, a one-stop framework that generates triplets on-the-fly from image-caption pairs. We leverage Large Language Models (LLMs) to generate joint embeddings of reference images and modification texts.
arXiv Detail & Related papers (2025-03-25T17:59:50Z)
Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models [33.76031793753807]
We adapt the autoregressive multimodal model Lumina-mGPT into a robust Real-ISR model, namely PURE. PURE Perceives and Understands the input low-quality image, then REstores its high-quality counterpart. Experimental results demonstrate that PURE preserves image content while generating realistic details.
arXiv Detail & Related papers (2025-03-14T04:33:59Z)
A Progressive Image Restoration Network for High-order Degradation Imaging in Remote Sensing [5.6223397629993626]
We propose a novel progressive restoration network for high-order degradation imaging (HDI-PRNet). Our method achieves superior performance on both synthetic and real remote sensing images.
arXiv Detail & Related papers (2024-12-10T05:08:39Z)
FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration [66.61201445650323]
Existing methods suffer from a generalization bottleneck in real-world scenarios. We contribute a million-scale dataset with two notable advantages over existing training data. We propose a robust model, FoundIR, to better address a broader range of restoration tasks in real-world scenarios.
arXiv Detail & Related papers (2024-12-02T12:08:40Z)
Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding [67.57487747508179]
Multiple-in-one image restoration (IR) has made significant progress, aiming to handle all types of single degraded image restoration with a single model. In this paper, we propose a novel multiple-in-one IR model that can effectively restore images with both single and mixed degradations.
arXiv Detail & Related papers (2024-11-25T09:26:34Z)
UIR-LoRA: Achieving Universal Image Restoration through Multiple Low-Rank Adaptation [50.27688690379488]
Existing unified methods treat multi-degradation image restoration as a multi-task learning problem. We propose a universal image restoration framework based on multiple low-rank adapters (LoRA) from multi-domain transfer learning. Our framework leverages the pre-trained generative model as the shared component for multi-degradation restoration and transfers it to specific degradation image restoration tasks.
arXiv Detail & Related papers (2024-09-30T11:16:56Z)
Multi-Scale Representation Learning for Image Restoration with State-Space Model [13.622411683295686]
We propose a novel Multi-Scale State-Space Model-based approach (MS-Mamba) for efficient image restoration. Our proposed method achieves new state-of-the-art performance while maintaining low computational complexity.
arXiv Detail & Related papers (2024-08-19T16:42:58Z)
Review Learning: Advancing All-in-One Ultra-High-Definition Image Restoration Training Method [7.487270862599671]
We propose a new training paradigm for general image restoration models, which we name Review Learning. This approach begins with sequential training of an image restoration model on several degraded datasets, combined with a review mechanism. We design a lightweight all-purpose image restoration network that can efficiently reason about degraded images with 4K resolution on a single consumer-grade GPU.
arXiv Detail & Related papers (2024-08-13T08:08:45Z)
Diff-Restorer: Unleashing Visual Prompts for Diffusion-based Universal Image Restoration [19.87693298262894]
We propose Diff-Restorer, a universal image restoration method based on the diffusion model. We utilize the pre-trained visual language model to extract visual prompts from degraded images. We also design a Degradation-aware Decoder to perform structural correction and convert the latent code to the pixel domain.
arXiv Detail & Related papers (2024-07-04T05:01:10Z)
Many-to-many Image Generation with Auto-regressive Diffusion Models [59.5041405824704]
This paper introduces a domain-general framework for many-to-many image generation, capable of producing interrelated image series from a given set of images. We present MIS, a novel large-scale multi-image dataset, containing 12M synthetic multi-image samples, each with 25 interconnected images. We learn M2M, an autoregressive model for many-to-many generation, where each image is modeled within a diffusion framework.
arXiv Detail & Related papers (2024-04-03T23:20:40Z)
Multimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image Restoration [58.11518043688793]
MPerceiver is a novel approach to enhance adaptiveness, generalizability and fidelity for all-in-one image restoration. MPerceiver is trained on 9 tasks for all-in-one IR and outperforms state-of-the-art task-specific methods across most tasks.
arXiv Detail & Related papers (2023-12-05T17:47:11Z)
Prompt-based Ingredient-Oriented All-in-One Image Restoration [0.0]
We propose a novel data ingredient-oriented approach to tackle multiple image degradation tasks. Specifically, we utilize an encoder to capture features and introduce prompts with degradation-specific information to guide the decoder. Our method performs competitively with the state-of-the-art.
arXiv Detail & Related papers (2023-09-06T15:05:04Z)
PromptIR: Prompting for All-in-One Blind Image Restoration [64.02374293256001]
We present a prompt-based learning approach, PromptIR, for All-In-One image restoration. Our method uses prompts to encode degradation-specific information, which is then used to dynamically guide the restoration network. PromptIR offers a generic and efficient plugin module with few lightweight prompts.
arXiv Detail & Related papers (2023-06-22T17:59:52Z)
Multi-Stage Progressive Image Restoration [167.6852235432918]
We propose a novel synergistic design that can optimally balance these competing goals. Our main proposal is a multi-stage architecture that progressively learns restoration functions for the degraded inputs. The resulting tightly interlinked multi-stage architecture, named MPRNet, delivers strong performance gains on ten datasets.
arXiv Detail & Related papers (2021-02-04T18:57:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.