Unsupervised Foggy Scene Understanding via Self Spatial-Temporal Label
Diffusion
- URL: http://arxiv.org/abs/2206.04879v1
- Date: Fri, 10 Jun 2022 05:16:50 GMT
- Authors: Liang Liao, Wenyi Chen, Jing Xiao, Zheng Wang, Chia-Wen Lin, Shin'ichi
Satoh
- Abstract summary: We exploit the characteristics of the foggy image sequence of driving scenes to densify the confident pseudo labels.
Based on the two discoveries of local spatial similarity and adjacent temporal correspondence of the sequential image data, we propose a novel Target-Domain driven pseudo label Diffusion scheme.
Our scheme helps the adaptive model achieve 51.92% and 53.84% mean intersection-over-union (mIoU) on two publicly available natural foggy datasets.
- Score: 51.11295961195151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding foggy image sequences in driving scenes is critical for
autonomous driving, but it remains a challenging task due to the difficulty in
collecting and annotating real-world images of adverse weather. Recently, the
self-training strategy has been considered a powerful solution for unsupervised
domain adaptation, which iteratively adapts the model from the source domain to
the target domain by generating target pseudo labels and re-training the model.
However, the selection of confident pseudo labels inevitably suffers from the
conflict between sparsity and accuracy, both of which will lead to suboptimal
models. To tackle this problem, we exploit the characteristics of the foggy
image sequence of driving scenes to densify the confident pseudo labels.
Specifically, based on the two discoveries of local spatial similarity and
adjacent temporal correspondence of the sequential image data, we propose a
novel Target-Domain driven pseudo label Diffusion (TDo-Dif) scheme. It employs
superpixels and optical flows to identify the spatial similarity and temporal
correspondence, respectively, and then diffuses the confident but sparse pseudo
labels within a superpixel or a temporal corresponding pair linked by the flow.
Moreover, to ensure the feature similarity of the diffused pixels, we introduce
local spatial similarity loss and temporal contrastive loss in the model
re-training stage. Experimental results show that our TDo-Dif scheme helps the
adaptive model achieve 51.92% and 53.84% mean intersection-over-union (mIoU) on
two publicly available natural foggy datasets (Foggy Zurich and Foggy Driving),
which exceeds the state-of-the-art unsupervised domain adaptive semantic
segmentation methods. Models and data can be found at
https://github.com/velor2012/TDo-Dif.
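The spatial half of the diffusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a precomputed superpixel map and a sparse pseudo-label map (with `255` marking unlabeled pixels, a common segmentation convention), and simply spreads the majority confident label within each superpixel to its unlabeled pixels.

```python
import numpy as np

IGNORE = 255  # conventional "unlabeled" value for pseudo-label maps

def diffuse_labels(sparse_labels, superpixels):
    """Simplified spatial label diffusion: within each superpixel,
    assign the majority confident label to all unlabeled pixels."""
    dense = sparse_labels.copy()
    for sp_id in np.unique(superpixels):
        mask = superpixels == sp_id
        confident = sparse_labels[mask]
        confident = confident[confident != IGNORE]
        if confident.size == 0:
            continue  # no confident seed pixel in this superpixel
        majority = np.bincount(confident).argmax()
        region = dense[mask]
        region[region == IGNORE] = majority
        dense[mask] = region
    return dense

# Toy example: a 4x4 image split into two superpixels (left/right halves),
# each holding a single confident seed pixel.
superpixels = np.array([[0, 0, 1, 1]] * 4)
labels = np.full((4, 4), IGNORE, dtype=np.int64)
labels[0, 0] = 3  # confident seed in superpixel 0
labels[0, 3] = 7  # confident seed in superpixel 1
dense = diffuse_labels(labels, superpixels)
```

In the paper the superpixels come from an actual oversegmentation and a second, temporal diffusion step propagates labels across frames via optical flow; this sketch only shows the within-superpixel step.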
Related papers
- Multi-Modality Driven LoRA for Adverse Condition Depth Estimation [61.525312117638116]
We propose Multi-Modality Driven LoRA (MMD-LoRA) for Adverse Condition Depth Estimation.
It consists of two core components: Prompt Driven Domain Alignment (PDDA) and Visual-Text Consistent Contrastive Learning (VTCCL)
It achieves state-of-the-art performance on the nuScenes and Oxford RobotCar datasets.
arXiv Detail & Related papers (2024-12-28T14:23:58Z)
- Homography Guided Temporal Fusion for Road Line and Marking Segmentation [73.47092021519245]
Road lines and markings are frequently occluded in the presence of moving vehicles, shadows, and glare.
We propose a Homography Guided Fusion (HomoFusion) module to exploit temporally-adjacent video frames for complementary cues.
We show that exploiting available camera intrinsic data and a ground plane assumption for cross-frame correspondence can lead to a lightweight network with significantly improved speed and accuracy.
arXiv Detail & Related papers (2024-04-11T10:26:40Z)
- Adaptive Face Recognition Using Adversarial Information Network [57.29464116557734]
Face recognition models often degrade when training data differ from testing data.
We propose a novel adversarial information network (AIN) to address it.
arXiv Detail & Related papers (2023-05-23T02:14:11Z)
- Transmission-Guided Bayesian Generative Model for Smoke Segmentation [29.74065829663554]
Deep neural networks tend to be overconfident in smoke segmentation because smoke has a non-rigid shape and transparent appearance.
This is caused by both knowledge level uncertainty due to limited training data for accurate smoke segmentation and labeling level uncertainty representing the difficulty in labeling ground-truth.
We introduce a Bayesian generative model to simultaneously estimate the posterior distribution of model parameters and its predictions.
We also contribute a high-quality smoke segmentation dataset, SMOKE5K, consisting of 1,400 real and 4,000 synthetic images with pixel-wise annotation.
arXiv Detail & Related papers (2023-03-02T01:48:05Z)
- QuadFormer: Quadruple Transformer for Unsupervised Domain Adaptation in Power Line Segmentation of Aerial Images [12.840195641761323]
We propose a novel framework designed for domain adaptive semantic segmentation.
The hierarchical quadruple transformer combines cross-attention and self-attention mechanisms to adapt transferable context.
We present two datasets - ARPLSyn and ARPLReal - to further advance research in unsupervised domain adaptive powerline segmentation.
arXiv Detail & Related papers (2022-11-29T03:15:27Z)
- Domain Adaptive Object Detection for Autonomous Driving under Foggy Weather [25.964194141706923]
This paper proposes a novel domain adaptive object detection framework for autonomous driving under foggy weather.
Our method leverages both image-level and object-level adaptation to diminish the domain discrepancy in image style and object appearance.
Experimental results on public benchmarks show the effectiveness and accuracy of the proposed method.
arXiv Detail & Related papers (2022-10-27T05:09:10Z)
- Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with a wide range of potential applications.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
- Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-Identification [64.37745443119942]
This paper jointly enforces visual and temporal consistency through a combination of a local one-hot classification and a global multi-class classification.
Experimental results on three large-scale ReID datasets demonstrate the superiority of the proposed method in both purely unsupervised and unsupervised domain adaptive ReID tasks.
arXiv Detail & Related papers (2020-07-21T14:31:27Z)
- Synthetic-to-Real Domain Adaptation for Lane Detection [5.811502603310248]
We explore learning from abundant, randomly generated synthetic data, together with unlabeled or partially labeled target domain data.
This poses the challenge of adapting models learned on the unrealistic synthetic domain to real images.
We develop a novel autoencoder-based approach that uses synthetic labels unaligned with particular images for adapting to target domain data.
arXiv Detail & Related papers (2020-07-08T10:54:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.