Textual Prompt Guided Image Restoration
- URL: http://arxiv.org/abs/2312.06162v1
- Date: Mon, 11 Dec 2023 06:56:41 GMT
- Title: Textual Prompt Guided Image Restoration
- Authors: Qiuhai Yan, Aiwen Jiang, Kang Chen, Long Peng, Qiaosi Yi and Chunjie Zhang
- Abstract summary: "All-in-one" models capable of blind image restoration have attracted growing attention in recent years.
Recent works focus on learning visual prompts from the data distribution to identify the degradation type.
In this paper, an effective textual prompt guided image restoration model is proposed.
- Score: 18.78902053873706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image restoration has always been a cutting-edge topic in the academic and
industrial fields of computer vision. Since degradation signals are often
random and diverse, "all-in-one" models capable of blind image restoration
have attracted growing attention in recent years. Early works require training
specialized heads and tails to handle each degradation of concern, which is
manually cumbersome. Recent works focus on learning visual prompts from the
data distribution to identify the degradation type. However, the prompts
employed in most models are non-textual, placing insufficient emphasis on the
importance of human-in-the-loop interaction. In this paper, an effective
textual prompt guided image restoration model is proposed. In this model, a
task-specific BERT is fine-tuned to accurately understand users' instructions
and generate textual prompt guidance. Depth-wise multi-head transposed
attention and gated convolution modules are designed to bridge the gap between
textual prompts and visual features. The proposed model innovatively
introduces semantic prompts into the low-level visual domain. It highlights
the potential to provide a natural, precise, and controllable way to perform
image restoration tasks. Extensive experiments have been conducted on public
denoising, dehazing and deraining datasets. The results demonstrate that,
compared with popular state-of-the-art methods, the proposed model achieves
superior performance, accurately recognizing and removing degradations without
increasing model complexity. Related source code and data will be made
publicly available at
https://github.com/MoTong-AI-studio/TextPromptIR.
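The bridging idea in the abstract can be illustrated with a minimal NumPy sketch of text-conditioned channel-wise ("transposed") attention. This is not the paper's implementation: the projection matrices, toy dimensions, and the choice to let the prompt embedding bias the attention keys are illustrative assumptions standing in for the learned depth-wise multi-head transposed attention module.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def transposed_attention(feat, text_emb, rng):
    """Channel-wise ("transposed") attention: the attention map is C x C
    rather than (H*W) x (H*W), so its cost scales with channel count, not
    spatial resolution. The text prompt embedding biases the keys so the
    instruction can steer how degradation-related channels are re-mixed.
    Random projections stand in for learned weights (hypothetical)."""
    C, HW = feat.shape
    d = text_emb.shape[0]
    Wq = rng.standard_normal((C, C)) / np.sqrt(C)
    Wk = rng.standard_normal((C, C)) / np.sqrt(C)
    Wt = rng.standard_normal((d, HW)) / np.sqrt(d)   # text -> spatial bias

    q = Wq @ feat                          # queries, shape (C, HW)
    k = Wk @ feat + text_emb @ Wt          # text-conditioned keys, (C, HW)
    attn = softmax(q @ k.T / np.sqrt(HW))  # channel-to-channel map, (C, C)
    return attn @ feat                     # re-mixed visual features, (C, HW)

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16))  # 8 channels over a flattened 4x4 map
text = rng.standard_normal(4)        # toy stand-in for a BERT prompt vector
out = transposed_attention(feat, text, rng)
print(out.shape)
```

The output keeps the visual feature shape, so such a block can be dropped between encoder stages; in the actual model the projections are learned and multi-head, and a gated convolution module further fuses the prompt with the features.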
Related papers
- Visual Style Prompt Learning Using Diffusion Models for Blind Face Restoration [16.67947885664477]
Blind face restoration aims to recover high-quality facial images from various unidentified sources of degradation.
Prior knowledge-based methods, leveraging geometric priors and facial features, have led to advancements in face restoration but often fall short of capturing fine details.
We introduce a visual style prompt learning framework that utilizes diffusion probabilistic models to explicitly generate visual prompts.
arXiv Detail & Related papers (2024-12-30T16:05:40Z)
- Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation [70.95783968368124]
We introduce a novel multi-modal autoregressive model, dubbed InstaManip.
We propose an innovative group self-attention mechanism to break down the in-context learning process into two separate stages.
Our method surpasses previous few-shot image manipulation models by a notable margin.
arXiv Detail & Related papers (2024-12-02T01:19:21Z)
- DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution [19.33582308829547]
This paper proposes to leverage degradation-aligned language prompt for accurate, fine-grained, and high-fidelity image restoration.
The proposed method achieves a new state-of-the-art perceptual quality level.
arXiv Detail & Related papers (2024-06-24T09:30:36Z)
- InstructIR: High-Quality Image Restoration Following Human Instructions [61.1546287323136]
We present the first approach that uses human-written instructions to guide the image restoration model.
Our method, InstructIR, achieves state-of-the-art results on several restoration tasks.
arXiv Detail & Related papers (2024-01-29T18:53:33Z)
- Image Captions are Natural Prompts for Text-to-Image Models [70.30915140413383]
We analyze the relationship between the training effect of synthetic data and the synthetic data distribution induced by prompts.
We propose a simple yet effective method that prompts text-to-image generative models to synthesize more informative and diverse training data.
Our method significantly improves the performance of models trained on synthetic training data.
arXiv Detail & Related papers (2023-07-17T14:38:11Z)
- PromptIR: Prompting for All-in-One Blind Image Restoration [64.02374293256001]
We present a prompt-based learning approach, PromptIR, for All-In-One image restoration.
Our method uses prompts to encode degradation-specific information, which is then used to dynamically guide the restoration network.
PromptIR offers a generic and efficient plugin module with few lightweight prompts.
arXiv Detail & Related papers (2023-06-22T17:59:52Z)
- Unleashing Text-to-Image Diffusion Models for Visual Perception [84.41514649568094]
VPD (Visual Perception with a pre-trained diffusion model) is a new framework that exploits the semantic information of a pre-trained text-to-image diffusion model in visual perception tasks.
We show that VPD can be quickly adapted to downstream visual perception tasks.
arXiv Detail & Related papers (2023-03-03T18:59:47Z)
- Learning to Prompt for Vision-Language Models [82.25005817904027]
Vision-language pre-training has emerged as a promising alternative for representation learning.
It shifts from the tradition of using images and discrete labels to learn a fixed set of weights, seen as visual concepts, to aligning images and raw text through two separate encoders.
Such a paradigm benefits from a broader source of supervision and allows zero-shot transfer to downstream tasks.
arXiv Detail & Related papers (2021-09-02T17:57:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.