Related papers: Prompt-based test-time real image dehazing: a novel pipeline

Prompt-based test-time real image dehazing: a novel pipeline

URL: http://arxiv.org/abs/2309.17389v5
Date: Sun, 27 Oct 2024 04:33:39 GMT
Title: Prompt-based test-time real image dehazing: a novel pipeline
Authors: Zixuan Chen, Zewei He, Ziqian Lu, Xuecheng Sun, Zhe-Ming Lu,
Abstract summary: We present Prompt-based Test-Time Dehazing (PTTD) to help generate visually pleasing results of real-captured hazy images. We experimentally observe that given a dehazing model trained on synthetic data, fine-tuning the statistics (ie, mean and standard deviation) of encoding features is able to narrow the domain gap. PTTD is model-agnostic and can be equipped with various state-of-the-art dehazing models trained on synthetic hazy-clean pairs to tackle the real image dehazing task.
Score: 9.229160145438469
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing methods attempt to improve models' generalization ability on real-world hazy images by exploring well-designed training schemes (\eg, CycleGAN, prior loss). However, most of them need very complicated training procedures to achieve satisfactory results. For the first time, we present a novel pipeline called Prompt-based Test-Time Dehazing (PTTD) to help generate visually pleasing results of real-captured hazy images during the inference phase. We experimentally observe that given a dehazing model trained on synthetic data, fine-tuning the statistics (\ie, mean and standard deviation) of encoding features is able to narrow the domain gap, boosting the performance of real image dehazing. Accordingly, we first apply a prompt generation module (PGM) to generate a visual prompt, which is the reference of appropriate statistical perturbations for mean and standard deviation. Then, we employ a feature adaptation module (FAM) into the existing dehazing models for adjusting the original statistics with the guidance of the generated prompt. PTTD is model-agnostic and can be equipped with various state-of-the-art dehazing models trained on synthetic hazy-clean pairs to tackle the real image dehazing task. Extensive experimental results demonstrate that our PTTD is effective, achieving superior performance against state-of-the-art dehazing methods in real-world scenarios. The code is available at \url{https://github.com/cecret3350/PTTD-Dehazing}.

Related papers

Solving Inverse Problems with FLAIR [59.02385492199431]
Flow-based latent generative models are able to generate images with remarkable quality, even enabling text-to-image generation.<n>We present FLAIR, a novel training free variational framework that leverages flow-based generative models as a prior for inverse problems.<n>Results on standard imaging benchmarks demonstrate that FLAIR consistently outperforms existing diffusion- and flow-based methods in terms of reconstruction quality and sample diversity.
arXiv Detail & Related papers (2025-06-03T09:29:47Z)
Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing [59.43187521828543]
We introduce a novel hazing-dehazing pipeline consisting of a Realistic Hazy Image Generation framework (HazeGen) and a Diffusion-based Dehazing framework (DiffDehaze) HazeGen harnesses robust generative diffusion priors of real-world hazy images embedded in a pre-trained text-to-image diffusion model. By employing specialized hybrid training and blended sampling strategies, HazeGen produces realistic and diverse hazy images as high-quality training data for DiffDehaze.
arXiv Detail & Related papers (2025-03-25T01:55:39Z)
Fine Tuning Text-to-Image Diffusion Models for Correcting Anomalous Images [0.0]
This study proposes a method to mitigate such issues by fine-tuning the Stable Diffusion 3 model using the DreamBooth technique. Experimental results targeting the prompt "lying on the grass/street" demonstrate that the fine-tuned model shows improved performance in visual evaluation and metrics such as Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), and Frechet Inception Distance (FID)
arXiv Detail & Related papers (2024-09-23T00:51:47Z)
AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines [0.0]
Anomaly detection in manufacturing pipelines remains a critical challenge, intensified by the complexity and variability of industrial environments. This paper introduces AssemAI, an interpretable image-based anomaly detection system tailored for smart manufacturing pipelines.
arXiv Detail & Related papers (2024-08-05T01:50:09Z)
Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adrial robustness has been conventionally believed as a challenging property to encode for neural networks. We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
HazeCLIP: Towards Language Guided Real-World Image Dehazing [62.4454483961341]
Existing methods have achieved remarkable performance in single image dehazing, particularly on synthetic datasets. This paper introduces HazeCLIP, a language-guided adaptation framework designed to enhance the real-world performance of pre-trained dehazing networks.
arXiv Detail & Related papers (2024-07-18T17:18:25Z)
Few-shot Online Anomaly Detection and Segmentation [29.693357653538474]
This paper focuses on addressing the challenging yet practical few-shot online anomaly detection and segmentation (FOADS) task. Under the FOADS framework, models are trained on a few-shot normal dataset, followed by inspection and improvement of their capabilities by leveraging unlabeled streaming data containing both normal and abnormal samples simultaneously. In order to achieve improved performance with limited training samples, we employ multi-scale feature embedding extracted from a CNN pre-trained on ImageNet to obtain a robust representation.
arXiv Detail & Related papers (2024-03-27T02:24:00Z)
Fine Structure-Aware Sampling: A New Sampling Training Scheme for Pixel-Aligned Implicit Models in Single-View Human Reconstruction [98.30014795224432]
We introduce Fine Structured-Aware Sampling (FSS) to train pixel-aligned implicit models for single-view human reconstruction. FSS proactively adapts to the thickness and complexity of surfaces. It also proposes a mesh thickness loss signal for pixel-aligned implicit models.
arXiv Detail & Related papers (2024-02-29T14:26:46Z)
Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation [14.71883381837561]
Cross-domain distribution shift is a significant obstacle to deploying the pre-trained semantic segmentation model in real-world applications. Test-time adaptation has proven its effectiveness in tackling the cross-domain distribution shift during inference. We propose the Visual Prompt-based Test-Time Adaptation (VPTTA) method to train a specific prompt for each test image to align the statistics in the batch normalization layers.
arXiv Detail & Related papers (2023-11-30T09:03:47Z)
Efficient Test-Time Adaptation for Super-Resolution with Second-Order Degradation and Reconstruction [62.955327005837475]
Image super-resolution (SR) aims to learn a mapping from low-resolution (LR) to high-resolution (HR) using paired HR-LR training images. We present an efficient test-time adaptation framework for SR, named SRTTA, which is able to quickly adapt SR models to test domains with different/unknown degradation types.
arXiv Detail & Related papers (2023-10-29T13:58:57Z)
Generalized Face Forgery Detection via Adaptive Learning for Pre-trained Vision Transformer [54.32283739486781]
We present a textbfForgery-aware textbfAdaptive textbfVision textbfTransformer (FA-ViT) under the adaptive learning paradigm. FA-ViT achieves 93.83% and 78.32% AUC scores on Celeb-DF and DFDC datasets in the cross-dataset evaluation.
arXiv Detail & Related papers (2023-09-20T06:51:11Z)
Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks. We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
Score-based diffusion models for accelerated MRI [35.3148116010546]
We introduce a way to sample data from a conditional distribution given the measurements, such that the model can be readily used for solving inverse problems in imaging. Our model requires magnitude images only for training, and yet is able to reconstruct complex-valued data, and even extends to parallel imaging.
arXiv Detail & Related papers (2021-10-08T08:42:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.