SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency
- URL: http://arxiv.org/abs/2303.07033v3
- Date: Fri, 15 Mar 2024 14:31:21 GMT
- Title: SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency
- Authors: Cong Wang, Jinshan Pan, Wanyu Lin, Jiangxin Dong, Xiao-Ming Wu
- Abstract summary: This work presents an effective depth-consistency self-prompt Transformer for image dehazing.
It is motivated by an observation that the estimated depths of an image with haze residuals and its clear counterpart vary.
By incorporating the prompt, prompt embedding, and prompt attention into an encoder-decoder network based on VQGAN, we can achieve better perception quality.
- Score: 51.92434113232977
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This work presents an effective depth-consistency self-prompt Transformer for image dehazing. It is motivated by an observation that the estimated depths of an image with haze residuals and its clear counterpart vary. Enforcing the depth consistency of dehazed images with clear ones, therefore, is essential for dehazing. For this purpose, we develop a prompt based on the features of depth differences between the hazy input images and corresponding clear counterparts that can guide dehazing models for better restoration. Specifically, we first apply deep features extracted from the input images to the depth difference features for generating the prompt that contains the haze residual information in the input. Then we propose a prompt embedding module that is designed to perceive the haze residuals, by linearly adding the prompt to the deep features. Further, we develop an effective prompt attention module to pay more attention to haze residuals for better removal. By incorporating the prompt, prompt embedding, and prompt attention into an encoder-decoder network based on VQGAN, we can achieve better perception quality. As the depths of clear images are not available at inference, and the dehazed images with one-time feed-forward execution may still contain a portion of haze residuals, we propose a new continuous self-prompt inference that can iteratively correct the dehazing model towards better haze-free image generation. Extensive experiments show that our method performs favorably against the state-of-the-art approaches on both synthetic and real-world datasets in terms of perception metrics including NIQE, PI, and PIQE.
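The prompt pipeline and the continuous self-prompt inference described in the abstract can be sketched roughly as follows. This is a minimal toy illustration, not the authors' implementation. Images, depth maps, and features are flat lists of floats. `dehaze` and `estimate_depth` are hypothetical callables standing in for the dehazing network and a depth estimator. Modeling prompt generation as an element-wise product is an assumption, as is the exact choice of which two images the depth difference is taken between at each step. Only the linear addition in `embed_prompt` and the iterative nature of the inference are stated in the abstract.

```python
from typing import Callable, List, Optional

Vector = List[float]  # toy stand-in for an image, depth map, or feature map


def depth_difference(depth_a: Vector, depth_b: Vector) -> Vector:
    """Element-wise absolute difference between two depth maps."""
    return [abs(a - b) for a, b in zip(depth_a, depth_b)]


def make_prompt(deep_features: Vector, diff_features: Vector) -> Vector:
    """Assumption: 'applying' deep features to depth-difference features
    is modeled here as an element-wise product."""
    return [f * d for f, d in zip(deep_features, diff_features)]


def embed_prompt(deep_features: Vector, prompt: Vector) -> Vector:
    """Prompt embedding: linearly add the prompt to the deep features."""
    return [f + p for f, p in zip(deep_features, prompt)]


def continuous_self_prompt(
    hazy: Vector,
    dehaze: Callable[[Vector, Optional[Vector]], Vector],
    estimate_depth: Callable[[Vector], Vector],
    num_steps: int = 3,
) -> Vector:
    """Iteratively refine a dehazed estimate without clear-image depths.

    Assumption: the previous output stands in for the unavailable clear
    image, and the depth difference between the current input and that
    output drives the next pass. In a real model, make_prompt and
    embed_prompt would run inside `dehaze` on learned features.
    """
    source = list(hazy)
    result = dehaze(source, None)  # first pass: no prompt is available yet
    for _ in range(num_steps):
        diff = depth_difference(estimate_depth(source), estimate_depth(result))
        # The last output becomes the next input, guided by the new difference.
        source, result = result, dehaze(result, diff)
    return result
```

With `num_steps=0` the loop reduces to a single feed-forward pass, which is the case the abstract says may still leave haze residuals.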
Related papers
- Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions [30.148969711689773]
We present a novel approach designed to address the complexities posed by challenging, out-of-distribution data in the single-image depth estimation task.
We systematically generate new, user-defined scenes with a comprehensive set of challenges and associated depth information.
This is achieved by leveraging cutting-edge text-to-image diffusion models with depth-aware control.
arXiv Detail & Related papers (2024-07-23T17:59:59Z)
- AccDiffusion: An Accurate Method for Higher-Resolution Image Generation [63.53163540340026]
We propose AccDiffusion, an accurate method for patch-wise higher-resolution image generation without training.
An in-depth analysis in this paper reveals that an identical text prompt for different patches causes repeated object generation.
Our AccDiffusion, for the first time, proposes to decouple the vanilla image-content-aware prompt into a set of patch-content-aware prompts.
arXiv Detail & Related papers (2024-07-15T14:06:29Z)
- Depth Information Assisted Collaborative Mutual Promotion Network for Single Image Dehazing [9.195173526948123]
We propose a dual-task collaborative mutual promotion framework to achieve the dehazing of a single image.
This framework integrates depth estimation and dehazing by a dual-task interaction mechanism.
We show that the proposed method can achieve better performance than that of the state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-02T06:29:44Z)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS).
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z)
- End-to-end Learning for Joint Depth and Image Reconstruction from Diffracted Rotation [10.896567381206715]
We propose a novel end-to-end learning approach for depth from diffracted rotation.
Our approach requires a significantly less complex model and less training data, yet it is superior to existing methods in the task of monocular depth estimation.
arXiv Detail & Related papers (2022-04-14T16:14:37Z)
- Robust Single Image Dehazing Based on Consistent and Contrast-Assisted Reconstruction [95.5735805072852]
We propose a novel density-variational learning framework to improve the robustness of the image dehazing model.
Specifically, the dehazing network is optimized under the consistency-regularized framework.
Our method significantly surpasses the state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-29T08:11:04Z)
- Contrastive Learning for Compact Single Image Dehazing [41.83007400559068]
We propose a novel contrastive regularization (CR) built upon contrastive learning to exploit both the information of hazy images and clear images as negative and positive samples.
CR ensures that the restored image is pulled closer to the clear image and pushed far away from the hazy image in the representation space.
Considering the trade-off between performance and memory storage, we develop a compact dehazing network based on an autoencoder-like framework.
arXiv Detail & Related papers (2021-04-19T14:56:21Z)
- Progressive Depth Learning for Single Image Dehazing [56.71963910162241]
Existing dehazing methods often ignore the depth cues and fail in distant areas where heavier haze disturbs the visibility.
We propose a deep end-to-end model that iteratively estimates image depths and transmission maps.
Our approach benefits from explicitly modeling the inner relationship of image depth and transmission map, which is especially effective for distant hazy areas.
arXiv Detail & Related papers (2021-02-21T05:24:18Z)
- Self-Supervised Linear Motion Deblurring [112.75317069916579]
Deep convolutional neural networks are state-of-the-art for image deblurring.
We present a differentiable reblur model for self-supervised motion deblurring.
Our experiments demonstrate that self-supervised single image deblurring is really feasible.
arXiv Detail & Related papers (2020-02-10T20:15:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.