Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label
Diffusion
- URL: http://arxiv.org/abs/2206.04879v1
- Date: Fri, 10 Jun 2022 05:16:50 GMT
- Authors: Liang Liao, Wenyi Chen, Jing Xiao, Zheng Wang, Chia-Wen Lin, Shin'ichi
Satoh
- Abstract summary: We exploit the characteristics of the foggy image sequence of driving scenes to densify the confident pseudo labels.
Based on the two discoveries of local spatial similarity and adjacent temporal correspondence of the sequential image data, we propose a novel Target-Domain driven pseudo label Diffusion scheme.
Our scheme helps the adaptive model achieve 51.92% and 53.84% mean intersection-over-union (mIoU) on two publicly available natural foggy datasets.
- Score: 51.11295961195151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding foggy image sequences in driving scenes is critical for
autonomous driving, but it remains a challenging task due to the difficulty in
collecting and annotating real-world images of adverse weather. Recently, the
self-training strategy has been considered a powerful solution for unsupervised
domain adaptation, which iteratively adapts the model from the source domain to
the target domain by generating target pseudo labels and re-training the model.
However, the selection of confident pseudo labels inevitably suffers from the
conflict between sparsity and accuracy, both of which will lead to suboptimal
models. To tackle this problem, we exploit the characteristics of the foggy
image sequence of driving scenes to densify the confident pseudo labels.
Specifically, based on the two discoveries of local spatial similarity and
adjacent temporal correspondence of the sequential image data, we propose a
novel Target-Domain driven pseudo label Diffusion (TDo-Dif) scheme. It employs
superpixels and optical flows to identify the spatial similarity and temporal
correspondence, respectively, and then diffuses the confident but sparse pseudo
labels within a superpixel or a temporal corresponding pair linked by the flow.
Moreover, to ensure the feature similarity of the diffused pixels, we introduce
local spatial similarity loss and temporal contrastive loss in the model
re-training stage. Experimental results show that our TDo-Dif scheme helps the
adaptive model achieve 51.92% and 53.84% mean intersection-over-union (mIoU) on
two publicly available natural foggy datasets (Foggy Zurich and Foggy Driving),
which exceeds the state-of-the-art unsupervised domain adaptive semantic
segmentation methods. Models and data can be found at
https://github.com/velor2012/TDo-Dif.
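The spatial half of the diffusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a precomputed superpixel map and a sparse pseudo-label map (with `255` marking unlabeled pixels, a common segmentation convention), and simply spreads the majority confident label within each superpixel to its unlabeled pixels.

```python
import numpy as np

IGNORE = 255  # conventional "unlabeled" value for pseudo-label maps

def diffuse_labels(sparse_labels, superpixels):
    """Simplified spatial label diffusion: within each superpixel,
    assign the majority confident label to all unlabeled pixels."""
    dense = sparse_labels.copy()
    for sp_id in np.unique(superpixels):
        mask = superpixels == sp_id
        confident = sparse_labels[mask]
        confident = confident[confident != IGNORE]
        if confident.size == 0:
            continue  # no confident seed pixel in this superpixel
        majority = np.bincount(confident).argmax()
        region = dense[mask]
        region[region == IGNORE] = majority
        dense[mask] = region
    return dense

# Toy example: a 4x4 image split into two superpixels (left/right halves),
# each holding a single confident seed pixel.
superpixels = np.array([[0, 0, 1, 1]] * 4)
labels = np.full((4, 4), IGNORE, dtype=np.int64)
labels[0, 0] = 3  # confident seed in superpixel 0
labels[0, 3] = 7  # confident seed in superpixel 1
dense = diffuse_labels(labels, superpixels)
```

In the paper the superpixels come from an actual oversegmentation and a second, temporal diffusion step propagates labels across frames via optical flow; this sketch only shows the within-superpixel step.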
Related papers
- Multi-Modality Driven LoRA for Adverse Condition Depth Estimation [61.525312117638116]
We propose Multi-Modality Driven LoRA (MMD-LoRA) for Adverse Condition Depth Estimation.
It consists of two core components: Prompt Driven Domain Alignment (PDDA) and Visual-Text Consistent Contrastive Learning (VTCCL)
It achieves state-of-the-art performance on the nuScenes and Oxford RobotCar datasets.
arXiv Detail & Related papers (2024-12-28T14:23:58Z)
- Homography Guided Temporal Fusion for Road Line and Marking Segmentation [73.47092021519245]
Road lines and markings are frequently occluded in the presence of moving vehicles, shadows, and glare.
We propose a Homography Guided Fusion (HomoFusion) module to exploit temporally-adjacent video frames for complementary cues.
We show that exploiting available camera intrinsic data and a ground plane assumption for cross-frame correspondence can lead to a lightweight network with significantly improved speed and accuracy.
arXiv Detail & Related papers (2024-04-11T10:26:40Z)
- Adaptive Face Recognition Using Adversarial Information Network [57.29464116557734]
Face recognition models often degrade when training data differ from testing data.
We propose a novel adversarial information network (AIN) to address it.
arXiv Detail & Related papers (2023-05-23T02:14:11Z)
- Transmission-Guided Bayesian Generative Model for Smoke Segmentation [29.74065829663554]
Deep neural networks tend to be overconfident in smoke segmentation because smoke has a non-rigid shape and transparent appearance.
This is caused by both knowledge level uncertainty due to limited training data for accurate smoke segmentation and labeling level uncertainty representing the difficulty in labeling ground-truth.
We introduce a Bayesian generative model to simultaneously estimate the posterior distribution of model parameters and its predictions.
We also contribute a high-quality smoke segmentation dataset, SMOKE5K, consisting of 1,400 real and 4,000 synthetic images with pixel-wise annotation.
arXiv Detail & Related papers (2023-03-02T01:48:05Z)
- QuadFormer: Quadruple Transformer for Unsupervised Domain Adaptation in Power Line Segmentation of Aerial Images [12.840195641761323]
We propose a novel framework designed for domain adaptive semantic segmentation.
The hierarchical quadruple transformer combines cross-attention and self-attention mechanisms to adapt transferable context.
We present two datasets - ARPLSyn and ARPLReal - to further advance research in unsupervised domain adaptive powerline segmentation.
arXiv Detail & Related papers (2022-11-29T03:15:27Z)
- Domain Adaptive Object Detection for Autonomous Driving under Foggy Weather [25.964194141706923]
This paper proposes a novel domain adaptive object detection framework for autonomous driving under foggy weather.
Our method leverages both image-level and object-level adaptation to diminish the domain discrepancy in image style and object appearance.
Experimental results on public benchmarks show the effectiveness and accuracy of the proposed method.
arXiv Detail & Related papers (2022-10-27T05:09:10Z)
- Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with a wide range of potential applications.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
- Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification [64.37745443119942]
This paper jointly enforces visual and temporal consistency through a combination of a local one-hot classification and a global multi-class classification.
Experimental results on three large-scale ReID datasets demonstrate the superiority of the proposed method in both purely unsupervised and unsupervised domain adaptive ReID tasks.
arXiv Detail & Related papers (2020-07-21T14:31:27Z)
- Synthetic-to-Real Domain Adaptation for Lane Detection [5.811502603310248]
We explore learning from abundant, randomly generated synthetic data, together with unlabeled or partially labeled target domain data.
This poses the challenge of adapting models learned on the unrealistic synthetic domain to real images.
We develop a novel autoencoder-based approach that uses synthetic labels unaligned with particular images for adapting to target domain data.
arXiv Detail & Related papers (2020-07-08T10:54:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.