Test-Time Adaptation for Nighttime Color-Thermal Semantic Segmentation
- URL: http://arxiv.org/abs/2307.04470v2
- Date: Thu, 30 Nov 2023 03:04:02 GMT
- Title: Test-Time Adaptation for Nighttime Color-Thermal Semantic Segmentation
- Authors: Yexin Liu, Weiming Zhang, Guoyang Zhao, Jinjing Zhu, Athanasios
Vasilakos, and Lin Wang
- Abstract summary: We propose the first test-time adaptation framework, dubbed Night-TTA, to address the key problems of nighttime RGB-T semantic segmentation.
Our method achieves state-of-the-art (SoTA) performance with a 13.07% boost in mIoU.
- Score: 17.546960391700985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to understand scenes in adverse visual conditions, e.g.,
nighttime, has sparked active research on RGB-Thermal (RGB-T) semantic
segmentation. However, it is essentially hampered by two critical problems: 1)
the day-night gap of RGB images is larger than that of thermal images, and 2)
the class-wise performance of RGB images at night is not consistently higher or
lower than that of thermal images. To address these problems, we propose the
first test-time adaptation (TTA) framework, dubbed Night-TTA, for nighttime
RGB-T semantic segmentation without access to the source (daytime) data during
adaptation. Our method comprises three key technical parts. First, as one
modality (e.g., RGB) suffers from a larger domain gap than the other (e.g.,
thermal), Imaging Heterogeneity Refinement (IHR) employs an interaction branch
alongside the RGB and thermal branches to prevent cross-modal discrepancy and
performance degradation. Second, Class-Aware Refinement (CAR) is introduced to
obtain reliable ensemble logits based on pixel-level distribution aggregation
of the three branches. Third, we design a specific learning scheme for our TTA
framework, which enables the ensemble logits and the three student logits to
learn collaboratively, improving the quality of predictions during the testing
phase of Night-TTA. Extensive experiments show that our method achieves
state-of-the-art (SoTA) performance with a 13.07% boost in mIoU.
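One plausible reading of CAR's pixel-level distribution aggregation is a per-pixel, confidence-weighted average of the three branches' softmax distributions. The sketch below uses inverse prediction entropy as an illustrative confidence measure; the function names and weighting rule are assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax over the class axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def entropy(p, eps=1e-8):
    """Shannon entropy over the class axis; returns a (H, W) map."""
    return -(p * np.log(p + eps)).sum(axis=0)

def car_style_ensemble(logits_rgb, logits_thermal, logits_inter):
    """Per-pixel confidence-weighted ensemble of three branch predictions.

    Illustrative only: each branch's softmax distribution is weighted by
    the inverse of its per-pixel entropy (more confident branches count
    more); the actual CAR aggregation rule in Night-TTA may differ.
    Inputs are (C, H, W) logit maps; output is a fused (C, H, W)
    probability map.
    """
    probs = [softmax(l) for l in (logits_rgb, logits_thermal, logits_inter)]
    conf = np.stack([1.0 / (entropy(p) + 1.0) for p in probs])   # (3, H, W)
    weights = conf / conf.sum(axis=0, keepdims=True)             # sum to 1
    fused = sum(w[None] * p for w, p in zip(weights, probs))     # (C, H, W)
    return fused
```

Because the weights form a convex combination at every pixel, the fused output remains a valid per-pixel class distribution, which is what makes it usable as an ensemble "teacher" signal for the student branches during test-time learning.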
Related papers
- Alignment-Free RGBT Salient Object Detection: Semantics-guided Asymmetric Correlation Network and A Unified Benchmark [15.435695491233982]
RGB and Thermal (RGBT) Salient Object Detection (SOD) aims to achieve high-quality saliency prediction.
Existing methods are tailored to manually aligned image pairs, and such manual alignment is labor-intensive.
We make the first attempt to address RGBT SOD for initially captured RGB and thermal image pairs without manual alignment.
arXiv Detail & Related papers (2024-06-03T01:01:58Z)
- Ternary-type Opacity and Hybrid Odometry for RGB-only NeRF-SLAM [62.23809541385653]
We study why ternary-type opacity is well-suited and desired for the task at hand.
We propose a simple yet novel visual odometry scheme that uses a hybrid combination of volumetric and warping-based image renderings.
arXiv Detail & Related papers (2023-12-20T18:03:17Z)
- Residual Spatial Fusion Network for RGB-Thermal Semantic Segmentation [19.41334573257174]
Traditional methods mostly use RGB images, which are heavily affected by lighting conditions, e.g., darkness.
Recent studies show that thermal images are robust in nighttime scenarios, serving as a compensating modality for segmentation.
This work proposes a Residual Spatial Fusion Network (RSFNet) for RGB-T semantic segmentation.
arXiv Detail & Related papers (2023-06-17T14:28:08Z)
- Complementary Random Masking for RGB-Thermal Semantic Segmentation [63.93784265195356]
RGB-thermal semantic segmentation is a potential solution to achieve reliable semantic scene understanding in adverse weather and lighting conditions.
This paper proposes 1) a complementary random masking strategy of RGB-T images and 2) self-distillation loss between clean and masked input modalities.
We achieve state-of-the-art performance over three RGB-T semantic segmentation benchmarks.
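The complementary masking idea above can be sketched as a pair of binary patch masks where every patch hidden in the RGB image stays visible in the thermal image, and vice versa. This is a minimal illustration under assumed parameters (patch size, mask ratio), not the paper's implementation.

```python
import numpy as np

def complementary_patch_masks(h, w, patch=16, mask_ratio=0.5, seed=None):
    """Generate complementary binary patch masks for an RGB-T image pair.

    Hypothetical sketch: patches are dropped at random in the RGB stream,
    and the thermal stream receives the exact complement, so the pair
    jointly covers the full scene. All names and defaults here are
    illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    gh, gw = h // patch, w // patch
    # 1 = keep, 0 = mask out, decided per patch for the RGB stream
    keep_rgb = (rng.random((gh, gw)) > mask_ratio).astype(np.float32)
    keep_thermal = 1.0 - keep_rgb                     # exact complement
    # Upsample the patch grid to pixel resolution
    m_rgb = np.kron(keep_rgb, np.ones((patch, patch), np.float32))
    m_thermal = np.kron(keep_thermal, np.ones((patch, patch), np.float32))
    return m_rgb, m_thermal
```

Multiplying each modality by its mask forces the network to reconstruct a consistent prediction from partial, non-overlapping views, which is the premise behind the self-distillation loss between clean and masked inputs.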
arXiv Detail & Related papers (2023-03-30T13:57:21Z)
- Does Thermal Really Always Matter for RGB-T Salient Object Detection? [153.17156598262656]
This paper proposes a network named TNet to solve the RGB-T salient object detection (SOD) task.
We introduce a global illumination estimation module to predict the global illuminance score of the image.
On the other hand, we introduce a two-stage localization and complementation module in the decoding phase to transfer object localization cue and internal integrity cue in thermal features to the RGB modality.
arXiv Detail & Related papers (2022-10-09T13:50:12Z)
- Mirror Complementary Transformer Network for RGB-thermal Salient Object Detection [16.64781797503128]
RGB-thermal salient object detection (RGB-T SOD) aims to locate the common prominent objects in an aligned visible and thermal infrared image pair.
In this paper, we propose a novel mirror complementary Transformer network (MCNet) for RGB-T SOD.
Experiments on benchmark and VT723 datasets show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-07-07T20:26:09Z) - RGB-Multispectral Matching: Dataset, Learning Methodology, Evaluation [49.28588927121722]
We address the problem of registering synchronized color (RGB) and multi-spectral (MS) images with very different resolutions by solving for stereo matching correspondences.
We introduce a novel RGB-MS dataset framing 13 different scenes in indoor environments and providing a total of 34 image pairs annotated with semi-dense, high-resolution ground-truth labels.
To tackle the task, we propose a deep learning architecture trained in a self-supervised manner by exploiting a further RGB camera.
arXiv Detail & Related papers (2022-06-14T17:59:59Z) - Temporal Aggregation for Adaptive RGBT Tracking [14.00078027541162]
We propose an RGBT tracker that takes temporal clues into account for robust appearance model learning.
Unlike most existing RGBT trackers, which perform tracking using only spatial information, our method further incorporates temporal information.
arXiv Detail & Related papers (2022-01-22T02:31:56Z) - Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD
Images [69.5662419067878]
Grounding referring expressions in RGBD images is an emerging field.
We present a novel task of 3D visual grounding in single-view RGBD image where the referred objects are often only partially scanned due to occlusion.
Our approach first fuses the language and the visual features at the bottom level to generate a heatmap that localizes the relevant regions in the RGBD image.
Then our approach conducts an adaptive feature learning based on the heatmap and performs the object-level matching with another visio-linguistic fusion to finally ground the referred object.
arXiv Detail & Related papers (2021-03-14T11:18:50Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels, and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient cross-modality guided encoder that not only effectively recalibrates RGB feature responses, but also distills accurate depth information via multiple stages and aggregates the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
- HeatNet: Bridging the Day-Night Domain Gap in Semantic Segmentation with Thermal Images [26.749261270690425]
Real-world driving scenarios entail adverse environmental conditions such as nighttime illumination or glare.
We propose a multimodal semantic segmentation model that can be applied during daytime and nighttime.
Besides RGB images, we leverage thermal images, making our network significantly more robust.
arXiv Detail & Related papers (2020-03-10T11:36:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.