PCSGAN: Perceptual Cyclic-Synthesized Generative Adversarial Networks
for Thermal and NIR to Visible Image Transformation
- URL: http://arxiv.org/abs/2002.07082v2
- Date: Thu, 6 Aug 2020 11:50:33 GMT
- Title: PCSGAN: Perceptual Cyclic-Synthesized Generative Adversarial Networks
for Thermal and NIR to Visible Image Transformation
- Authors: Kancharagunta Kishan Babu and Shiv Ram Dubey
- Abstract summary: The Perceptual Cyclic-Synthesized Generative Adversarial Network (PCSGAN) is introduced to address these issues.
PCSGAN uses a combination of perceptual (i.e., feature-based) losses along with the pixel-wise and adversarial losses.
The proposed PCSGAN outperforms state-of-the-art image transformation models, including Pix2pix, DualGAN, CycleGAN, PS2GAN, and PAN, in terms of the SSIM, MSE, PSNR and LPIPS evaluation measures.
- Score: 17.205434613674104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many real-world scenarios, it is difficult to capture images in the
visible light spectrum (VIS) due to bad lighting conditions. However, images
can be captured in such scenarios using Near-Infrared (NIR) and Thermal (THM)
cameras. The NIR and THM images contain limited details. Thus, there is a need
to transform images from THM/NIR to VIS for better understanding. However, this
is a non-trivial task due to the large domain discrepancy and the lack of
abundant datasets. Generative Adversarial Networks (GANs) can transform images
from one domain to another. Most of the available GAN-based methods use a
combination of adversarial and pixel-wise losses (such as $L_1$ or $L_2$) as
the objective function for training. With such objective functions, the quality
of the transformed images in THM/NIR to VIS transformation is still not up to
the mark. Thus, better objective functions are needed to improve the quality,
fine details, and realism of the transformed images. A new model for THM/NIR to
VIS image transformation, called the Perceptual Cyclic-Synthesized Generative
Adversarial Network (PCSGAN), is introduced to address these issues. PCSGAN
uses a combination of perceptual (i.e., feature-based) losses along with the
pixel-wise and adversarial losses. Both quantitative and qualitative measures
are used to judge the performance of the PCSGAN model on the WHU-IIP face and
RGB-NIR scene datasets. The proposed PCSGAN outperforms state-of-the-art image
transformation models, including Pix2pix, DualGAN, CycleGAN, PS2GAN, and PAN,
in terms of the SSIM, MSE, PSNR and LPIPS evaluation measures. The code is
available at https://github.com/KishanKancharagunta/PCSGAN.
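To make the objective concrete, here is a minimal PyTorch-style sketch of a generator loss that combines adversarial, pixel-wise, and perceptual terms. This is an illustrative reconstruction, not the authors' code: the loss weights, the VGG16 feature cut-off, and the least-squares adversarial form are assumptions, and PCSGAN's full objective additionally involves its cyclic-synthesized formulation.

```python
# Hypothetical sketch: adversarial + pixel-wise (L1) + perceptual (VGG
# feature) losses. Weights, VGG layer choice, and the least-squares
# adversarial form are illustrative assumptions, not PCSGAN's settings.
import torch
import torch.nn as nn
from torchvision import models

l1 = nn.L1Loss()
mse = nn.MSELoss()

# Frozen VGG16 (up to relu3_3) as the perceptual feature extractor;
# inputs are assumed to be normalized the way VGG expects.
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def generator_loss(discriminator, fake_vis, real_vis,
                   lambda_pix=10.0, lambda_per=1.0):
    pred = discriminator(fake_vis)
    adv = mse(pred, torch.ones_like(pred))  # fool the discriminator
    pix = l1(fake_vis, real_vis)            # pixel-wise fidelity
    per = l1(vgg(fake_vis), vgg(real_vis))  # feature-space fidelity
    return adv + lambda_pix * pix + lambda_per * per
```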
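The reported evaluation measures are standard. For quick reference, a self-contained sketch of MSE and the PSNR derived from it is given below, assuming 8-bit images; SSIM and LPIPS require dedicated implementations (e.g., scikit-image and the lpips package) and are omitted.

```python
# Standard MSE and PSNR definitions (8-bit images assumed).
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def psnr(a: np.ndarray, b: np.ndarray, max_val: float = 255.0) -> float:
    err = mse(a, b)
    if err == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / err)
```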
Related papers
- LapGSR: Laplacian Reconstructive Network for Guided Thermal Super-Resolution [1.747623282473278]
Fusing multiple modalities to produce high-resolution images often requires dense models with millions of parameters and a heavy computational load.
We propose LapGSR, a multimodal, lightweight, generative model incorporating Laplacian image pyramids for guided thermal super-resolution (a generic Laplacian-pyramid sketch appears after this list).
arXiv Detail & Related papers (2024-11-12T12:23:19Z)
- CSIM: A Copula-based similarity index sensitive to local changes for Image quality assessment [2.3874115898130865]
Image similarity metrics play an important role in image processing, computer vision, and machine learning applications.
Existing metrics, such as PSNR, MSE, SSIM, ISSM and FSIM, often face limitations in terms of either speed, complexity, or sensitivity to small changes in images.
This paper investigates CSIM, a novel image similarity metric that combines real-time computation with sensitivity to subtle image variations.
arXiv Detail & Related papers (2024-10-02T10:46:05Z) - DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image
Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z) - Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
However, they are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z) - Contextual Learning in Fourier Complex Field for VHR Remote Sensing
Images [64.84260544255477]
Transformer-based models have demonstrated outstanding potential for learning high-order contextual relationships from natural images at general resolution (224x224 pixels).
We propose a complex self-attention (CSA) mechanism to model high-order contextual information with less than half the computation of naive self-attention (SA).
By stacking various layers of CSA blocks, we propose the Fourier Complex Transformer (FCT) model to learn global contextual information from VHR aerial images.
arXiv Detail & Related papers (2022-10-28T08:13:33Z) - Optimizing Vision Transformers for Medical Image Segmentation and
Few-Shot Domain Adaptation [11.690799827071606]
We propose Convolutional Swin-Unet (CS-Unet) transformer blocks and optimise their settings in relation to patch embedding, projection, the feed-forward network, up-sampling, and skip connections.
CS-Unet can be trained from scratch and inherits the superiority of convolutions in each feature process phase.
Experiments show that CS-Unet without pre-training surpasses other state-of-the-art counterparts by large margins on two medical CT and MRI datasets with fewer parameters.
arXiv Detail & Related papers (2022-10-14T19:18:52Z) - Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations.
We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z) - Adversarial Generation of Continuous Images [31.92891885615843]
In this paper, we propose two novel architectural techniques for building INR-based image decoders (a minimal coordinate-MLP sketch of the INR idea appears after this list).
We use them to build a state-of-the-art continuous image GAN.
Our proposed INR-GAN architecture improves the performance of continuous image generators by several times.
arXiv Detail & Related papers (2020-11-24T11:06:40Z) - Conditional Sequential Modulation for Efficient Global Image Retouching [45.99310982782054]
Photo retouching aims at enhancing the aesthetic visual quality of images that suffer from photographic defects such as over/under exposure, poor contrast, and inharmonious saturation.
In this paper, we investigate some commonly-used retouching operations and mathematically find that these pixel-independent operations can be approximated or formulated by multi-layer perceptrons (MLPs); a per-pixel MLP sketch appears after this list.
We propose an extremely lightweight framework, the Conditional Sequential Retouching Network (CSRNet), for efficient global image retouching.
arXiv Detail & Related papers (2020-09-22T08:32:04Z) - Visual Transformers: Token-based Image Representation and Processing for
Computer Vision [67.55770209540306]
The Visual Transformer (VT) operates in a semantic token space, judiciously attending to different image parts based on context.
Using an advanced training recipe, our VTs significantly outperform their convolutional counterparts.
For semantic segmentation on LIP and COCO-stuff, VT-based feature pyramid networks (FPN) achieve 0.35 points higher mIoU while reducing the FPN module's FLOPs by 6.5x.
arXiv Detail & Related papers (2020-06-05T20:49:49Z)
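As noted in the LapGSR item above, here is the classical Laplacian image pyramid in a minimal OpenCV/NumPy form. This is the standard construction that the paper reportedly builds on, not LapGSR's actual network.

```python
# Generic Laplacian pyramid: each level stores the high-frequency
# detail lost by downsampling; the final level is the low-pass base.
import cv2
import numpy as np

def laplacian_pyramid(img: np.ndarray, levels: int = 3):
    pyramid, current = [], img.astype(np.float32)
    for _ in range(levels):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)  # high-frequency residual
        current = down
    pyramid.append(current)           # low-frequency base
    return pyramid

def reconstruct(pyramid):
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = cv2.pyrUp(current, dstsize=(residual.shape[1], residual.shape[0]))
        current = current + residual  # add detail back level by level
    return current
```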
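As noted in the "Adversarial Generation of Continuous Images" item, the idea behind INR-based image decoders is to parameterize an image as an MLP from pixel coordinates to RGB. The sketch below shows the generic concept only; the network sizes are assumptions and this is not the paper's INR-GAN architecture.

```python
# Minimal implicit neural representation (INR): the "image" is a
# network mapping (x, y) coordinates to RGB values.
import torch
import torch.nn as nn

class CoordinateMLP(nn.Module):
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        return self.net(coords)

# Render a 32x32 image by querying the network on a coordinate grid;
# any resolution works, since the representation is continuous.
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 32),
                        torch.linspace(-1, 1, 32), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
image = CoordinateMLP()(coords).reshape(32, 32, 3)
```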
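As noted in the CSRNet item, a pixel-independent retouching operation can be expressed as an MLP applied identically at every pixel, which is equivalent to a stack of 1x1 convolutions. The layer sizes below are illustrative assumptions, not CSRNet's configuration.

```python
# Per-pixel MLP via 1x1 convolutions: the same small network is
# applied to every pixel's RGB value independently.
import torch
import torch.nn as nn

class PerPixelMLP(nn.Module):
    def __init__(self, channels: int = 3, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

retouch = PerPixelMLP()
out = retouch(torch.rand(1, 3, 64, 64))  # same op at every pixel
```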