StawGAN: Structural-Aware Generative Adversarial Networks for Infrared
Image Translation
- URL: http://arxiv.org/abs/2305.10882v1
- Date: Thu, 18 May 2023 11:22:33 GMT
- Title: StawGAN: Structural-Aware Generative Adversarial Networks for Infrared
Image Translation
- Authors: Luigi Sigillo, Eleonora Grassucci, Danilo Comminiello
- Abstract summary: We introduce a novel model that focuses on enhancing the quality of the target generation without merely colorizing it.
We test our model on aerial images of the DroneVehicle dataset containing paired RGB-IR images.
- Score: 7.098759778181621
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the problem of translating night-time thermal infrared
images, the most widely adopted modality for analyzing night-time scenes, into
daytime color images (NTIT2DC), which provide a better perception of objects.
We introduce a novel model that focuses on enhancing the quality of the target
generation without merely colorizing it. The proposed structure-aware GAN
(StawGAN) enables the translation of better-shaped and higher-definition
objects in the target domain. We test our model on aerial images of the
DroneVehicle dataset, which contains paired RGB-IR images. The proposed
approach produces a more accurate translation than other state-of-the-art
image translation models. The source code is available at
https://github.com/LuigiSigillo/StawGAN
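Since this page only carries the abstract, the sketch below shows a minimal paired, pix2pix-style training step in PyTorch that illustrates the general NTIT2DC setup the paper addresses: thermal input, daytime RGB target, adversarial plus reconstruction losses. All module names, layer sizes, and the L1 weight are assumptions for illustration; StawGAN's structure-aware components are not reproduced here.

```python
# Minimal paired IR -> RGB translation sketch (NOT the StawGAN architecture).
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    """Toy encoder-decoder: 1-channel thermal in, 3-channel RGB out."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class TinyDiscriminator(nn.Module):
    """PatchGAN-style critic over concatenated (IR, RGB) pairs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )

    def forward(self, ir, rgb):
        return self.net(torch.cat([ir, rgb], dim=1))

G, D = TinyGenerator(), TinyDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

ir = torch.randn(8, 1, 64, 64)   # stand-in for thermal crops
rgb = torch.randn(8, 3, 64, 64)  # stand-in for paired daytime RGB

# Discriminator step: score real pairs against generated pairs.
fake = G(ir).detach()
pred_real, pred_fake = D(ir, rgb), D(ir, fake)
d_loss = bce(pred_real, torch.ones_like(pred_real)) + \
         bce(pred_fake, torch.zeros_like(pred_fake))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: fool D, plus an L1 term (weight of 100 is assumed).
fake = G(ir)
pred_fake = D(ir, fake)
g_loss = bce(pred_fake, torch.ones_like(pred_fake)) + 100.0 * l1(fake, rgb)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```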
Related papers
- Supervised Image Translation from Visible to Infrared Domain for Object Detection [1.7851018240619703]
This study aims to learn a translation from visible to infrared imagery, bridging the domain gap between the two modalities.
We adopt a two-stage training strategy with a Generative Adversarial Network and an object detection model.
The generated images are then used to train standard object detection frameworks, including YOLOv5, Mask R-CNN, and Faster R-CNN, as sketched below.
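A loose sketch of that two-stage strategy, with all names hypothetical and the actual training loops elided:

```python
# Stage 1: train a visible->IR translation GAN.
# Stage 2: train a detector on the IR images it synthesizes.
import torch
import torch.nn as nn

def train_translation_gan(paired_visible_data):
    """Stage 1: learn G: visible -> infrared (adversarial loop elided)."""
    G = nn.Conv2d(3, 1, 3, padding=1)  # stand-in for a real generator
    # ... optimize G against a discriminator on paired_visible_data ...
    return G

def synthesize_ir_dataset(G, visible_dataset):
    """Translate each visible image; detection boxes carry over unchanged,
    since the translation preserves geometry."""
    with torch.no_grad():
        return [(G(img.unsqueeze(0)).squeeze(0), boxes)
                for img, boxes in visible_dataset]

def train_detector(ir_dataset):
    """Stage 2: fit a standard detector (e.g., a YOLO or R-CNN variant)
    on the synthesized IR images; training loop elided."""
    for ir_img, boxes in ir_dataset:
        pass

visible = [(torch.randn(3, 64, 64), [[10, 10, 30, 30]]) for _ in range(4)]
G = train_translation_gan(visible)
train_detector(synthesize_ir_dataset(G, visible))
```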
arXiv Detail & Related papers (2024-08-03T18:51:04Z)
- Visible to Thermal image Translation for improving visual task in low light conditions [0.0]
We have collected images from two different locations using the Parrot Anafi Thermal drone.
We created a two-stream network, preprocessed and augmented the image data, and trained the generator and discriminator models from scratch.
The findings demonstrate that it is feasible to translate RGB training data to thermal data using a GAN.
arXiv Detail & Related papers (2023-10-31T05:18:53Z)
- Nighttime Thermal Infrared Image Colorization with Feedback-based Object Appearance Learning [27.58748298687474]
We propose a generative adversarial network incorporating feedback-based object appearance learning (FoalGAN).
FoalGAN is not only effective for the appearance learning of small objects, but also outperforms other image translation methods in terms of semantic preservation and edge consistency.
arXiv Detail & Related papers (2023-10-24T09:59:55Z)
- Guided Image-to-Image Translation by Discriminator-Generator Communication [71.86347329356244]
The goal of image-to-image (I2I) translation is to transfer an image from a source domain to a target domain.
One major branch of this research formulates I2I translation using Generative Adversarial Networks (GANs).
arXiv Detail & Related papers (2023-03-07T02:29:36Z)
- Depth- and Semantics-aware Multi-modal Domain Translation: Generating 3D Panoramic Color Images from LiDAR Point Clouds [0.7234862895932991]
This work presents a new conditional generative model, named TITAN-Next, for cross-domain image-to-image translation in a multi-modal setup between LiDAR and camera sensors.
We claim that this is the first framework of its kind, with practical applications in autonomous vehicles such as providing a fail-safe mechanism and augmenting available data in the target image domain.
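TITAN-Next's architecture is not detailed in this summary, but frameworks of this kind typically start from the standard spherical projection that flattens a LiDAR point cloud into a panoramic 2D grid. A minimal version is sketched below; the field-of-view values are assumptions, not the paper's settings.

```python
# Spherical projection of a LiDAR point cloud to a panoramic range image.
import numpy as np

def spherical_projection(points, h=64, w=2048, fov_up=3.0, fov_down=-25.0):
    """points: (N, 3) array of x, y, z. Returns an (h, w) range image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                                  # horizontal angle
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-6), -1, 1))
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * w                       # column index
    v = (fov_up_r - pitch) / (fov_up_r - fov_down_r) * h    # row index
    u = np.clip(u.astype(int), 0, w - 1)
    v = np.clip(v.astype(int), 0, h - 1)
    image = np.zeros((h, w), dtype=np.float32)
    image[v, u] = r                                         # last point wins
    return image

pano = spherical_projection(np.random.randn(100_000, 3) * 20.0)
```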
arXiv Detail & Related papers (2023-02-15T13:48:10Z)
- Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding [53.170767750244366]
Imagen is a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding.
To assess text-to-image models in greater depth, we introduce DrawBench, a comprehensive and challenging benchmark for text-to-image models.
arXiv Detail & Related papers (2022-05-23T17:42:53Z)
- IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning [110.7118381246156]
The Increment Reasoning Generative Adversarial Network (IR-GAN) aims to reason about the consistency between the visual increment in images and the semantic increment in instructions.
First, we introduce word-level and instruction-level instruction encoders to learn the user's intention from history-correlated instructions as the semantic increment.
Second, we embed the representation of the semantic increment into that of the source image to generate the target image, where the source image plays the role of a referring auxiliary.
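As a rough illustration of that fusion, the sketch below encodes an instruction into a "semantic increment" vector and injects it into the source-image features before decoding; the dimensions, the GRU encoder, and the additive injection are all assumptions, not IR-GAN's actual modules.

```python
# Toy instruction-into-image fusion (illustrative only).
import torch
import torch.nn as nn

class IncrementFusion(nn.Module):
    def __init__(self, vocab=1000, txt_dim=128, img_ch=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, txt_dim)
        self.instr_enc = nn.GRU(txt_dim, txt_dim, batch_first=True)
        self.img_enc = nn.Conv2d(3, img_ch, 3, padding=1)
        self.project = nn.Linear(txt_dim, img_ch)
        self.decode = nn.Conv2d(img_ch, 3, 3, padding=1)

    def forward(self, src_img, instr_tokens):
        _, h = self.instr_enc(self.embed(instr_tokens))  # (1, B, txt_dim)
        inc = self.project(h[-1])                        # semantic increment
        feats = self.img_enc(src_img)                    # visual features
        fused = feats + inc[:, :, None, None]            # inject increment
        return self.decode(fused)                        # target image

out = IncrementFusion()(torch.randn(2, 3, 32, 32),
                        torch.randint(0, 1000, (2, 7)))
```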
arXiv Detail & Related papers (2022-04-02T07:48:39Z)
- Thermal Infrared Image Colorization for Nighttime Driving Scenes with Top-Down Guided Attention [14.527765677864913]
We propose a toP-down attEntion And gRadient aLignment based GAN, referred to as PearlGAN.
A top-down guided attention module and an elaborate attentional loss are first designed to reduce the semantic encoding ambiguity during translation.
In addition, pixel-level annotation is carried out on a subset of FLIR and KAIST datasets to evaluate the semantic preservation performance of multiple translation methods.
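A generic top-down attention gate in that spirit, not PearlGAN's exact module, might look like the following: a coarse, high-level feature map is upsampled into a spatial mask that re-weights finer, low-level features.

```python
# Generic top-down attention gate (sketch; sizes are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownAttention(nn.Module):
    def __init__(self, high_ch=128):
        super().__init__()
        self.to_mask = nn.Conv2d(high_ch, 1, 1)  # 1x1 conv -> attention logits

    def forward(self, low_feats, high_feats):
        mask = torch.sigmoid(self.to_mask(high_feats))        # (B, 1, h, w)
        mask = F.interpolate(mask, size=low_feats.shape[-2:],
                             mode="bilinear", align_corners=False)
        return low_feats * mask                               # gated features

low = torch.randn(2, 64, 32, 32)   # fine features
high = torch.randn(2, 128, 8, 8)   # coarse semantic features
gated = TopDownAttention()(low, high)
```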
arXiv Detail & Related papers (2021-04-29T14:35:25Z)
- Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images [69.5662419067878]
Grounding referring expressions in RGBD images is an emerging field.
We present a novel task of 3D visual grounding in single-view RGBD image where the referred objects are often only partially scanned due to occlusion.
Our approach first fuses the language and the visual features at the bottom level to generate a heatmap that localizes the relevant regions in the RGBD image.
Then our approach conducts an adaptive feature learning based on the heatmap and performs the object-level matching with another visio-linguistic fusion to finally ground the referred object.
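A toy version of that first, heatmap-producing stage: a sentence embedding is correlated with per-pixel RGBD features to score relevant regions. The dot-product scoring and feature sizes are assumptions for illustration, not the paper's design.

```python
# Language-visual fusion into a grounding heatmap (sketch).
import torch
import torch.nn as nn

class GroundingHeatmap(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.visual = nn.Conv2d(4, dim, 3, padding=1)  # RGB + depth in
        self.lang = nn.Linear(300, dim)                # e.g. pooled word vectors

    def forward(self, rgbd, sent_emb):
        v = self.visual(rgbd)                          # (B, dim, H, W)
        q = self.lang(sent_emb)[:, :, None, None]      # (B, dim, 1, 1)
        return torch.sigmoid((v * q).sum(1))           # (B, H, W) heatmap

heat = GroundingHeatmap()(torch.randn(2, 4, 48, 48), torch.randn(2, 300))
```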
arXiv Detail & Related papers (2021-03-14T11:18:50Z)
- Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network [73.5062435623908]
We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations.
By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain.
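The idea can be illustrated with a toy stand-in for StyleGAN2: derive a target-domain generator from the source-domain one, then feed both the same latent code so the two outputs correspond. Here the paper's series of model transformations is reduced to copy-and-fine-tune for brevity.

```python
# Shared-latent I2I with a toy generator standing in for StyleGAN2.
import copy
import torch
import torch.nn as nn

g_source = nn.Sequential(nn.Linear(128, 256), nn.ReLU(),
                         nn.Linear(256, 3 * 16 * 16))
g_target = copy.deepcopy(g_source)  # inherit source-domain weights
# ... fine-tune / transform g_target on target-domain data here ...

z = torch.randn(1, 128)                     # one shared latent code
img_src = g_source(z).view(1, 3, 16, 16)    # source-domain rendering
img_tgt = g_target(z).view(1, 3, 16, 16)    # its target-domain translation
```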
arXiv Detail & Related papers (2020-10-12T13:51:40Z)
- Structural-analogy from a Single Image Pair [118.61885732829117]
In this paper, we explore the capabilities of neural networks to understand image structure given only a single pair of images, A and B.
We generate an image that keeps the appearance and style of B, but has a structural arrangement that corresponds to A.
Our method can be used to generate high quality imagery in other conditional generation tasks utilizing images A and B only.
arXiv Detail & Related papers (2020-04-05T14:51:10Z)