T2V-DDPM: Thermal to Visible Face Translation using Denoising Diffusion
Probabilistic Models
- URL: http://arxiv.org/abs/2209.08814v1
- Date: Mon, 19 Sep 2022 07:59:32 GMT
- Title: T2V-DDPM: Thermal to Visible Face Translation using Denoising Diffusion
Probabilistic Models
- Authors: Nithin Gopalakrishnan Nair and Vishal M. Patel
- Abstract summary: We propose a Denoising Diffusion Probabilistic Model (DDPM) based solution for Thermal-to-Visible (T2V) image translation.
During training, the model learns the conditional distribution of visible facial images given their corresponding thermal image.
We achieve state-of-the-art results on multiple datasets.
- Score: 71.94264837503135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern-day surveillance systems perform person recognition using deep
learning-based face verification networks. Most state-of-the-art facial
verification systems are trained using visible spectrum images. But, acquiring
images in the visible spectrum is impractical in scenarios of low-light and
nighttime conditions, and often images are captured in an alternate domain such
as the thermal infrared domain. Facial verification in thermal images is often
performed after retrieving the corresponding visible domain images. This is a
well-established problem often known as the Thermal-to-Visible (T2V) image
translation. In this paper, we propose a Denoising Diffusion Probabilistic
Model (DDPM) based solution for T2V translation specifically for facial images.
During training, the model learns the conditional distribution of visible
facial images given their corresponding thermal image through the diffusion
process. During inference, the visible domain image is obtained by starting
from Gaussian noise and performing denoising repeatedly. The existing inference
process for DDPMs is stochastic and time-consuming. Hence, we propose a novel
inference strategy for speeding up the inference time of DDPMs, specifically
for the problem of T2V image translation. We achieve state-of-the-art
results on multiple datasets. The code and pretrained models are publicly
available at http://github.com/Nithin-GK/T2V-DDPM
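The inference procedure described in the abstract (start from Gaussian noise, then denoise repeatedly while conditioning on the thermal image) can be illustrated with a minimal sketch of standard DDPM ancestral sampling. This is not the authors' accelerated inference strategy and not code from their repository. The function names, the linear noise schedule, the step count, and the toy noise predictor standing in for a trained network are all assumptions made for illustration.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product of alphas
    return betas, alphas, alpha_bars


def toy_noise_predictor(x_t, thermal, t):
    """Hypothetical stand-in for a trained network eps_theta(x_t, thermal, t).

    A real model would be a neural network conditioned on the thermal image;
    this placeholder only keeps the sketch runnable.
    """
    return x_t - thermal


def sample_visible(thermal, predict_noise, num_steps=50, seed=0):
    """Standard DDPM ancestral sampling conditioned on a thermal image."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)

    # Start from pure Gaussian noise with the same shape as the thermal input.
    x = rng.standard_normal(thermal.shape)

    for t in reversed(range(num_steps)):
        eps = predict_noise(x, thermal, t)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Stochastic step: add fresh noise scaled by sqrt(beta_t).
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(thermal.shape)
        else:
            # Final step is deterministic.
            x = mean
    return x


if __name__ == "__main__":
    thermal_image = np.zeros((8, 8))  # placeholder thermal input
    visible = sample_visible(thermal_image, toy_noise_predictor)
    print(visible.shape)
```

The loop makes the cost visible: one network evaluation per step, with fresh noise injected at every step except the last. That is the stochastic, time-consuming behavior the abstract says motivates a faster inference strategy.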
Related papers
- TC-PDM: Temporally Consistent Patch Diffusion Models for Infrared-to-Visible Video Translation [25.542902579879367]
This paper proposes a novel diffusion method, dubbed Temporally Consistent Patch Diffusion Models (TC-PDM).
Our method faithfully preserves the semantic structure of generated visible images.
Experiments show that TC-PDM outperforms state-of-the-art methods by 35.3% in FVD for infrared-to-visible video translation and by 6.1% in AP50 for day-to-night object detection.
arXiv Detail & Related papers (2024-08-26T12:43:48Z)
- Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model [31.70050311326183]
Diffusion models tend to generate videos with less motion than expected.
We address this issue from both inference and training aspects.
Our methods outperform baselines by producing higher motion scores with lower errors.
arXiv Detail & Related papers (2024-06-22T04:56:16Z)
- Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model [57.24046436423511]
Recently, the strong latent Diffusion Probabilistic Model (DPM) has been applied to high-quality Text-to-Image (T2I) generation.
We explore the mechanism behind DPM by examining the intermediate statuses during the gradual denoising generation process.
We propose to apply this observation to accelerate the process of T2I generation by properly removing text guidance.
arXiv Detail & Related papers (2024-05-24T08:12:41Z)
- Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation [72.90144343056227]
We explore the visual representations produced from a pre-trained text-to-video (T2V) diffusion model for video understanding tasks.
We introduce a novel framework, termed "VD-IT", tailored with dedicatedly designed components built upon a fixed T2V model.
Our VD-IT achieves highly competitive results, surpassing many existing state-of-the-art methods.
arXiv Detail & Related papers (2024-03-18T17:59:58Z)
- Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection [1.5550533143704954]
We present a novel and fast unsupervised anomaly detection approach based on latent Bernoulli diffusion models.
We achieve state-of-the-art performance compared to other diffusion-based unsupervised anomaly detection algorithms.
arXiv Detail & Related papers (2024-03-18T11:15:03Z)
- Visible to Thermal image Translation for improving visual task in low
light conditions [0.0]
We have collected images from two different locations using the Parrot Anafi Thermal drone.
We created a two-stream network, preprocessed and augmented the image data, and trained the generator and discriminator models from scratch.
The findings demonstrate that it is feasible to translate RGB training data to thermal data using GAN.
arXiv Detail & Related papers (2023-10-31T05:18:53Z)
- Gradpaint: Gradient-Guided Inpainting with Diffusion Models [71.47496445507862]
Denoising Diffusion Probabilistic Models (DDPMs) have recently achieved remarkable results in conditional and unconditional image generation.
We present GradPaint, which steers the generation towards a globally coherent image.
GradPaint generalizes well to diffusion models trained on various datasets, improving upon current state-of-the-art supervised and unsupervised methods.
arXiv Detail & Related papers (2023-09-18T09:36:24Z)
- Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation [71.24808323646167]
We propose DiffusionPose, a new scheme for learning keypoint heatmaps by a neural network.
During training, the keypoints are diffused to random distribution by adding noises and the diffusion model learns to recover ground-truth heatmaps from noised heatmaps.
Experiments show the prowess of our scheme with improvements of 1.6, 1.2, and 1.2 mAP on widely-used COCO, CrowdPose, and AI Challenge datasets.
arXiv Detail & Related papers (2023-06-29T16:24:32Z)
- Thermal to Visible Image Synthesis under Atmospheric Turbulence [67.99407460140263]
In biometrics and surveillance, thermal imaging modalities are often used to capture images in low-light and nighttime conditions.
Such imaging systems often suffer from atmospheric turbulence, which introduces severe blur and deformation artifacts to the captured images.
An end-to-end reconstruction method is proposed which can directly transform thermal images into visible-spectrum images.
arXiv Detail & Related papers (2022-04-06T19:47:41Z)
- UNIT-DDPM: UNpaired Image Translation with Denoising Diffusion
Probabilistic Models [19.499490172426427]
We propose a novel unpaired image-to-image translation method that uses denoising diffusion probabilistic models without requiring adversarial training.
Our method, UNpaired Image Translation with Denoising Diffusion Probabilistic Models (UNIT-DDPM), trains a generative model to infer the joint distribution of images over both domains as a Markov chain.
arXiv Detail & Related papers (2021-04-12T11:22:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.