Textual Prompt Guided Image Restoration
- URL: http://arxiv.org/abs/2312.06162v1
- Date: Mon, 11 Dec 2023 06:56:41 GMT
- Title: Textual Prompt Guided Image Restoration
- Authors: Qiuhai Yan and Aiwen Jiang and Kang Chen and Long Peng and Qiaosi Yi
and Chunjie Zhang
- Abstract summary: "All-in-one" models capable of blind image restoration have attracted growing attention in recent years.
Recent works focus on learning visual prompts from the data distribution to identify the degradation type.
In this paper, an effective textual-prompt-guided image restoration model is proposed.
- Score: 18.78902053873706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image restoration has always been a cutting-edge topic in the academic
and industrial fields of computer vision. Since degradation signals are often
random and diverse, "all-in-one" models capable of blind image restoration have
attracted growing attention in recent years. Early works require training
specialized heads and tails to handle each degradation of concern, which is
manually cumbersome. Recent works focus on learning visual prompts from the data
distribution to identify the degradation type. However, the prompts employed in
most models are non-textual, placing insufficient emphasis on the importance of
the human-in-the-loop. In this paper, an effective textual-prompt-guided image
restoration model is proposed. In this model, a task-specific BERT is fine-tuned
to accurately understand users' instructions and generate textual prompt
guidance. Depth-wise multi-head transposed attention and gated convolution
modules are designed to bridge the gap between textual prompts and visual
features. The proposed model innovatively introduces semantic prompts into the
low-level vision domain. It highlights the potential to provide a natural,
precise, and controllable way to perform image restoration tasks. Extensive
experiments have been conducted on public denoising, dehazing, and deraining
datasets. The results demonstrate that, compared with popular state-of-the-art
methods, the proposed model achieves superior performance, accurately
recognizing and removing degradation without increasing model complexity.
Related source code and data will be publicly available at
https://github.com/MoTong-AI-studio/TextPromptIR.
Related papers
- Review Learning: Advancing All-in-One Ultra-High-Definition Image Restoration Training Method [7.487270862599671]
We propose a new training paradigm for general image restoration models, which we name Review Learning.
This approach begins with sequential training of an image restoration model on several degraded datasets, combined with a review mechanism.
We design a lightweight all-purpose image restoration network that can efficiently reason about degraded images with 4K resolution on a single consumer-grade GPU.
arXiv Detail & Related papers (2024-08-13T08:08:45Z) - DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution [19.33582308829547]
This paper proposes to leverage degradation-aligned language prompt for accurate, fine-grained, and high-fidelity image restoration.
The proposed method achieves a new state-of-the-art perceptual quality level.
arXiv Detail & Related papers (2024-06-24T09:30:36Z) - Enhancing Large Vision Language Models with Self-Training on Image Comprehension [99.9389737339175]
We introduce Self-Training on Image Comprehension (STIC), a self-training approach aimed specifically at image comprehension.
First, the model self-constructs a preference for image descriptions using unlabeled images.
To further self-improve reasoning on the extracted visual information, we let the model reuse a small portion of existing instruction-tuning data.
arXiv Detail & Related papers (2024-05-30T05:53:49Z) - InstructIR: High-Quality Image Restoration Following Human Instructions [61.1546287323136]
We present the first approach that uses human-written instructions to guide the image restoration model.
Our method, InstructIR, achieves state-of-the-art results on several restoration tasks.
arXiv Detail & Related papers (2024-01-29T18:53:33Z) - Image Captions are Natural Prompts for Text-to-Image Models [70.30915140413383]
We analyze the relationship between the training effect of synthetic data and the synthetic data distribution induced by prompts.
We propose a simple yet effective method that prompts text-to-image generative models to synthesize more informative and diverse training data.
Our method significantly improves the performance of models trained on synthetic training data.
arXiv Detail & Related papers (2023-07-17T14:38:11Z) - PromptIR: Prompting for All-in-One Blind Image Restoration [64.02374293256001]
We present a prompt-based learning approach, PromptIR, for All-In-One image restoration.
Our method uses prompts to encode degradation-specific information, which is then used to dynamically guide the restoration network.
PromptIR offers a generic and efficient plugin module with few lightweight prompts.
arXiv Detail & Related papers (2023-06-22T17:59:52Z) - Unleashing Text-to-Image Diffusion Models for Visual Perception [84.41514649568094]
VPD (Visual Perception with a pre-trained diffusion model) is a new framework that exploits the semantic information of a pre-trained text-to-image diffusion model in visual perception tasks.
We show that VPD can be rapidly adapted to downstream visual perception tasks.
arXiv Detail & Related papers (2023-03-03T18:59:47Z) - Learning to Prompt for Vision-Language Models [82.25005817904027]
Vision-language pre-training has emerged as a promising alternative for representation learning.
It shifts from the tradition of using images and discrete labels to learn a fixed set of weights, seen as visual concepts, to aligning images and raw text with two separate encoders.
Such a paradigm benefits from a broader source of supervision and allows zero-shot transfer to downstream tasks.
arXiv Detail & Related papers (2021-09-02T17:57:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.