Disentangled Contrastive Image Translation for Nighttime Surveillance
- URL: http://arxiv.org/abs/2307.05038v1
- Date: Tue, 11 Jul 2023 06:40:27 GMT
- Title: Disentangled Contrastive Image Translation for Nighttime Surveillance
- Authors: Guanzhou Lan, Bin Zhao, Xuelong Li
- Abstract summary: Nighttime surveillance suffers from degradation due to poor illumination and arduous human annotations.
Existing methods rely on multi-spectral images to perceive objects in the dark, but these are troubled by low resolution and color absence.
We argue that the ultimate solution for nighttime surveillance is night-to-day translation, or Night2Day.
This paper contributes a new surveillance dataset called NightSuR. It includes six scenes to support the study on nighttime surveillance.
- Score: 87.03178320662592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nighttime surveillance suffers from degradation due to poor illumination and arduous human annotations. It remains challenging and poses a security risk at night. Existing methods rely on multi-spectral images to perceive objects in the dark, but these are troubled by low resolution and color absence. We argue that the ultimate solution for nighttime surveillance is night-to-day translation, or Night2Day, which aims to translate a surveillance scene from nighttime to daytime while maintaining semantic consistency. To achieve this, this paper presents a Disentangled Contrastive (DiCo) learning method. Specifically, to address the poor and complex illumination of nighttime scenes, we propose a learnable physical prior, i.e., the color invariant, which provides a stable perception of a highly dynamic night environment and can be incorporated into the learning pipeline of neural networks. Targeting surveillance scenes, we develop a disentangled representation, an auxiliary pretext task that separates surveillance scenes into foreground and background with contrastive learning. This strategy extracts semantics without supervision and enables our model to achieve instance-aware translation. Finally, we incorporate all the modules above into generative adversarial networks and achieve high-fidelity translation. This paper also contributes a new surveillance dataset called NightSuR. It includes six scenes to support the study of nighttime surveillance, collecting nighttime images with different properties of nighttime environments, such as flare and extreme darkness. Extensive experiments demonstrate that our method significantly outperforms existing works. The dataset and source code will be released on GitHub soon.
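The abstract does not spell out the functional form of the learnable color invariant, but the underlying idea, a representation that stays stable while scene illumination changes, can be illustrated with a log-chromaticity transform whose channel-mixing weights are trainable. The PyTorch sketch below is an assumed, illustrative formulation (the class name and the log-ratio form are not from the paper): the output is unchanged by a global illumination gain whenever the mixing weights sum to one, which holds at initialization.

```python
# Minimal sketch of a learnable color-invariant front end (illustrative only).
# Assumption: an illumination-invariant representation in the spirit of
# log-chromaticity, with learnable channel-mixing weights so the prior can be
# trained jointly with the translation network, as the abstract describes.
import torch
import torch.nn as nn


class LearnableColorInvariant(nn.Module):
    """Map RGB to log-chromaticity-style channel ratios, invariant to a
    global illumination gain whenever the mixing weights sum to one."""

    def __init__(self, eps: float = 1e-6):
        super().__init__()
        # Learnable channel-mixing weights; uniform init recovers classic
        # log-chromaticity (subtracting the mean log intensity).
        self.mix = nn.Parameter(torch.full((3,), 1.0 / 3.0))
        self.eps = eps

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        # rgb: (B, 3, H, W), values in [0, 1]
        log_rgb = torch.log(rgb.clamp_min(self.eps))
        # Weighted log-reference, broadcast back over the channel axis.
        ref = (self.mix.view(1, 3, 1, 1) * log_rgb).sum(dim=1, keepdim=True)
        return log_rgb - ref  # the gain term log(g) cancels out


if __name__ == "__main__":
    x = torch.rand(2, 3, 64, 64).clamp_min(1e-3)
    inv = LearnableColorInvariant()
    # Simulate a darker exposure of the same scene; the invariant matches.
    print(torch.allclose(inv(0.2 * x), inv(x), atol=1e-5))  # True
```

A module like this could be prepended to the generator so the downstream network sees an illumination-stabilized input, which is the role the abstract assigns to the physical prior.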
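Similarly, the foreground/background disentanglement pretext task can be sketched as an InfoNCE-style contrastive loss: foreground embeddings from two photometrically augmented views of the same image (so one mask applies to both) act as a positive pair, while background embeddings supply the negatives. Everything here is a generic stand-in; the pooling scheme, the pseudo-mask source, and the temperature are assumptions, not the paper's actual loss. The mask might come from motion or saliency cues, since surveillance cameras are static.

```python
# Generic InfoNCE sketch for a foreground/background contrastive pretext task
# (an assumed formulation, not the paper's actual disentanglement loss).
import torch
import torch.nn.functional as F


def masked_pool(feat: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Average feature vectors over masked locations.
    feat: (B, C, H, W), mask: (B, 1, H, W) in {0, 1} -> (B, C)."""
    num = (feat * mask).sum(dim=(2, 3))
    den = mask.sum(dim=(2, 3)).clamp_min(1.0)
    return num / den


def fg_bg_info_nce(feat1: torch.Tensor, feat2: torch.Tensor,
                   fg_mask: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """Pull foreground embeddings of two views of the same image together
    while pushing them away from background embeddings (the negatives)."""
    fg1 = F.normalize(masked_pool(feat1, fg_mask), dim=1)        # (B, C)
    fg2 = F.normalize(masked_pool(feat2, fg_mask), dim=1)        # (B, C)
    bg1 = F.normalize(masked_pool(feat1, 1.0 - fg_mask), dim=1)  # (B, C)
    pos = (fg1 * fg2).sum(dim=1, keepdim=True) / tau             # (B, 1)
    neg = fg1 @ bg1.t() / tau                                    # (B, B)
    logits = torch.cat([pos, neg], dim=1)                        # (B, 1+B)
    # The positive always sits at column 0 of the logits.
    labels = torch.zeros(feat1.size(0), dtype=torch.long, device=feat1.device)
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    B, C, H, W = 4, 128, 32, 32
    f1, f2 = torch.randn(B, C, H, W), torch.randn(B, C, H, W)
    mask = (torch.rand(B, 1, H, W) > 0.8).float()  # stand-in foreground mask
    print(fg_bg_info_nce(f1, f2, mask))
```

Minimizing a loss of this shape makes foreground features self-consistent and dissimilar from background features, which matches the abstract's claim of extracting semantics without supervision for instance-aware translation.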
Related papers
- Night-to-Day Translation via Illumination Degradation Disentanglement [51.77716565167767]
Night-to-Day translation aims to achieve day-like vision for nighttime scenes.
However, processing night images with complex degradations remains a significant challenge under unpaired conditions.
We propose N2D3 to identify different degradation patterns in nighttime images.
arXiv Detail & Related papers (2024-11-21T08:51:32Z)
- Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation [58.180226179087086]
We propose a novel end-to-end optimized approach, named NightFormer, tailored for night-time semantic segmentation.
Specifically, we design a pixel-level texture enhancement module to acquire texture-aware features hierarchically with phase enhancement and amplified attention.
Our proposed method performs favorably against state-of-the-art night-time semantic segmentation methods.
arXiv Detail & Related papers (2024-08-25T13:59:31Z)
- PIG: Prompt Images Guidance for Night-Time Scene Parsing [48.35991796324741]
Unsupervised domain adaptation (UDA) has become the predominant method for studying night scenes.
We propose a Night-Focused Network (NFNet) to learn night-specific features from both target domain images and prompt images.
We conduct experiments on four night-time datasets: NightCity, NightCity+, Dark Zurich, and ACDC.
arXiv Detail & Related papers (2024-06-15T07:06:19Z)
- Nighttime Thermal Infrared Image Colorization with Feedback-based Object Appearance Learning [27.58748298687474]
We propose a generative adversarial network incorporating feedback-based object appearance learning (FoalGAN).
FoalGAN is not only effective for appearance learning of small objects, but also outperforms other image translation methods in terms of semantic preservation and edge consistency.
arXiv Detail & Related papers (2023-10-24T09:59:55Z)
- Boosting Night-time Scene Parsing with Learnable Frequency [53.05778451012621]
Night-Time Scene Parsing (NTSP) is essential to many vision applications, especially for autonomous driving.
Most of the existing methods are proposed for day-time scene parsing.
We show that our method performs favorably against the state-of-the-art methods on the NightCity, NightCity+ and BDD100K-night datasets.
arXiv Detail & Related papers (2022-08-30T13:09:59Z)
- Let There be Light: Improved Traffic Surveillance via Detail Preserving Night-to-Day Transfer [19.33490492872067]
We propose an image-translation framework to alleviate the accuracy decline that object detection suffers under adverse conditions.
To alleviate the detail corruption caused by Generative Adversarial Networks (GANs), we propose to utilize a Kernel Prediction Network (KPN) based method to refine the nighttime-to-daytime image translation.
arXiv Detail & Related papers (2021-05-11T13:18:50Z)
- Night-time Scene Parsing with a Large Real Dataset [67.11211537439152]
We aim to address the night-time scene parsing (NTSP) problem, which has two main challenges.
To tackle the scarcity of night-time data, we collect a novel labeled dataset, named NightCity, of 4,297 real night-time images.
We also propose an exposure-aware framework to address the NTSP problem through augmenting the segmentation process with explicitly learned exposure features.
arXiv Detail & Related papers (2020-03-15T18:11:34Z)
- Translating multispectral imagery to nighttime imagery via conditional generative adversarial networks [24.28488767429697]
This study explores the potential of conditional Generative Adversarial Networks (cGAN) in translating multispectral imagery to nighttime imagery.
A popular cGAN framework, pix2pix, was adopted and modified to facilitate this translation.
With the additional social media data, the generated nighttime imagery can be very similar to the ground-truth imagery.
arXiv Detail & Related papers (2019-12-28T03:20:29Z)