MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation
- URL: http://arxiv.org/abs/2408.16478v1
- Date: Thu, 29 Aug 2024 12:15:10 GMT
- Title: MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation
- Authors: Linyan Yang, Lukas Hoyer, Mark Weber, Tobias Fischer, Dengxin Dai, Laura Leal-Taixé, Marc Pollefeys, Daniel Cremers, Luc Van Gool
- Abstract summary: Unsupervised Domain Adaptation (UDA) is the task of bridging the domain gap between a labeled source domain and an unlabeled target domain.
We propose to leverage geometric information, i.e., depth predictions, as depth discontinuities often coincide with segmentation boundaries.
We show that our method can be plugged into various recent UDA methods and consistently improve results across standard UDA benchmarks.
- Score: 155.0797148367653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised Domain Adaptation (UDA) is the task of bridging the domain gap between a labeled source domain, e.g., synthetic data, and an unlabeled target domain. We observe that current UDA methods show inferior results on fine structures and tend to oversegment objects with ambiguous appearance. To address these shortcomings, we propose to leverage geometric information, i.e., depth predictions, as depth discontinuities often coincide with segmentation boundaries. We show that naively incorporating depth into current UDA methods does not fully exploit the potential of this complementary information. To this end, we present MICDrop, which learns a joint feature representation by masking image encoder features while inversely masking depth encoder features. With this simple yet effective complementary masking strategy, we enforce the use of both modalities when learning the joint feature representation. To aid this process, we propose a feature fusion module to improve both global as well as local information sharing while being robust to errors in the depth predictions. We show that our method can be plugged into various recent UDA methods and consistently improve results across standard UDA benchmarks, obtaining new state-of-the-art performances.
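The complementary masking strategy described in the abstract can be sketched as follows. This is a minimal illustration only: the patch size, mask ratio, and function names are assumptions for the sketch, not the paper's actual implementation. A random patch mask drops regions of the image encoder features, while the inverse mask drops the complementary regions of the depth encoder features, so every spatial location is covered by exactly one modality:

```python
import numpy as np

def complementary_masks(spatial_shape, mask_ratio=0.5, patch=4, seed=0):
    """Sample a random patch-wise binary mask M for the image features
    and its complement (1 - M) for the depth features (illustrative)."""
    rng = np.random.default_rng(seed)
    h, w = spatial_shape[0] // patch, spatial_shape[1] // patch
    # Coarse patch-level keep/drop decisions, upsampled to full resolution.
    coarse = (rng.random((h, w)) > mask_ratio).astype(np.float32)
    m = np.kron(coarse, np.ones((patch, patch), dtype=np.float32))
    return m, 1.0 - m

def apply_complementary_dropout(img_feat, depth_feat, mask_ratio=0.5, patch=4, seed=0):
    """Mask image features with M and depth features with 1 - M."""
    m_img, m_depth = complementary_masks(img_feat.shape[-2:], mask_ratio, patch, seed)
    return img_feat * m_img, depth_feat * m_depth

# Toy C,H,W feature maps from the image and depth encoders.
img = np.ones((8, 16, 16), dtype=np.float32)
dep = np.ones((8, 16, 16), dtype=np.float32)
mi, md = apply_complementary_dropout(img, dep)
# Because the masks are complements, the two masked maps jointly
# cover every spatial location exactly once.
assert np.allclose(mi + md, 1.0)
```

Since neither modality alone sees the full feature map, a joint representation trained on the fused (masked) features is forced to draw on both image appearance and depth geometry.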
Related papers
- Unified Domain Adaptive Semantic Segmentation [96.74199626935294]
Unsupervised Domain Adaptive Semantic Segmentation (UDA-SS) aims to transfer the supervision from a labeled source domain to an unlabeled target domain.
We propose a Quad-directional Mixup (QuadMix) method, characterized by tackling distinct point attributes and feature inconsistencies.
Our method outperforms the state-of-the-art works by large margins on four challenging UDA-SS benchmarks.
arXiv Detail & Related papers (2023-11-22T09:18:49Z)
- Transferring to Real-World Layouts: A Depth-aware Framework for Scene Adaptation
Scene segmentation via unsupervised domain adaptation (UDA) enables the transfer of knowledge acquired from source synthetic data to real-world target data.
We propose a depth-aware framework to explicitly leverage depth estimation to mix the categories and facilitate the two complementary tasks, i.e., segmentation and depth learning.
In particular, the framework contains a Depth-guided Contextual Filter (DCF) for data augmentation and a cross-task encoder for contextual learning.
arXiv Detail & Related papers (2023-11-21T15:39:21Z)
- I2F: A Unified Image-to-Feature Approach for Domain Adaptive Semantic Segmentation [55.633859439375044]
Unsupervised domain adaptation (UDA) for semantic segmentation is a promising task freeing people from heavy annotation work.
Key idea to tackle this problem is to perform both image-level and feature-level adaptation jointly.
This paper proposes a novel UDA pipeline for semantic segmentation that unifies image-level and feature-level adaptation.
arXiv Detail & Related papers (2023-01-03T15:19:48Z)
- MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation [104.40114562948428]
In unsupervised domain adaptation (UDA), a model trained on source data (e.g. synthetic) is adapted to target data (e.g. real-world) without access to target annotation.
We propose a Masked Image Consistency (MIC) module to enhance UDA by learning spatial context relations of the target domain.
MIC significantly improves the state-of-the-art performance across the different recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA.
arXiv Detail & Related papers (2022-12-02T17:29:32Z)
- CLUDA: Contrastive Learning in Unsupervised Domain Adaptation for Semantic Segmentation [3.4123736336071864]
CLUDA is a simple yet novel method for performing unsupervised domain adaptation (UDA) for semantic segmentation.
We extract a multi-level fused-feature map from the encoder, and apply contrastive loss across different classes and different domains.
We produce state-of-the-art results on the GTA $\rightarrow$ Cityscapes (74.4 mIoU, +0.6) and Synthia $\rightarrow$ Cityscapes (67.2 mIoU, +1.4) benchmarks.
arXiv Detail & Related papers (2022-08-27T05:13:14Z)
- Depth-Assisted ResiDualGAN for Cross-Domain Aerial Images Semantic Segmentation [15.29253551096484]
Unsupervised domain adaptation (UDA) is an approach to minimizing the domain gap.
A digital surface model (DSM) is usually available in both the source domain and the target domain.
Depth-assisted ResiDualGAN (DRDG) is proposed, where depth-supervised losses, including a depth cycle consistency loss (DCCL), are used to bring depth information into the generative model.
arXiv Detail & Related papers (2022-08-21T06:58:51Z)
- Learning Feature Decomposition for Domain Adaptive Monocular Depth Estimation [51.15061013818216]
Supervised approaches have led to great success with the advance of deep learning, but they rely on large quantities of ground-truth depth annotations.
Unsupervised domain adaptation (UDA) transfers knowledge from labeled source data to unlabeled target data, so as to relax the constraint of supervised learning.
We propose a novel UDA method for monocular depth estimation (MDE), referred to as Learning Feature Decomposition for Adaptation (LFDA), which learns to decompose the feature space into content and style components.
arXiv Detail & Related papers (2022-07-30T08:05:35Z)
- Plugging Self-Supervised Monocular Depth into Unsupervised Domain Adaptation for Semantic Segmentation [19.859764556851434]
We propose to exploit self-supervised monocular depth estimation to improve UDA for semantic segmentation.
Our whole proposal achieves state-of-the-art performance (58.8 mIoU) on the GTA5->CS benchmark.
arXiv Detail & Related papers (2021-10-13T12:48:51Z)
- Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation [84.34227665232281]
Domain adaptation for semantic segmentation aims to improve the model performance in the presence of a distribution shift between source and target domain.
We leverage the guidance from self-supervised depth estimation, which is available on both domains, to bridge the domain gap.
We demonstrate the effectiveness of our proposed approach on the benchmark tasks SYNTHIA-to-Cityscapes and GTA-to-Cityscapes.
arXiv Detail & Related papers (2021-04-28T07:47:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.