Condition-Invariant Semantic Segmentation
- URL: http://arxiv.org/abs/2305.17349v3
- Date: Mon, 22 Jul 2024 16:12:50 GMT
- Title: Condition-Invariant Semantic Segmentation
- Authors: Christos Sakaridis, David Bruggemann, Fisher Yu, Luc Van Gool,
- Abstract summary: We implement Condition-Invariant Semantic (CISS) on the current state-of-the-art domain adaptation architecture.
Our method achieves the second-best performance on the normal-to-adverse Cityscapes$to$ACDC benchmark.
CISS is shown to generalize well to domains unseen during training, such as BDD100K-night and ACDC-night.
- Score: 77.10045325743644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adaptation of semantic segmentation networks to different visual conditions is vital for robust perception in autonomous cars and robots. However, previous work has shown that most feature-level adaptation methods, which employ adversarial training and are validated on synthetic-to-real adaptation, provide marginal gains in condition-level adaptation, being outperformed by simple pixel-level adaptation via stylization. Motivated by these findings, we propose to leverage stylization in performing feature-level adaptation by aligning the internal network features extracted by the encoder of the network from the original and the stylized view of each input image with a novel feature invariance loss. In this way, we encourage the encoder to extract features that are already invariant to the style of the input, allowing the decoder to focus on parsing these features and not on further abstracting from the specific style of the input. We implement our method, named Condition-Invariant Semantic Segmentation (CISS), on the current state-of-the-art domain adaptation architecture and achieve outstanding results on condition-level adaptation. In particular, CISS sets the new state of the art in the popular daytime-to-nighttime Cityscapes$\to$Dark Zurich benchmark. Furthermore, our method achieves the second-best performance on the normal-to-adverse Cityscapes$\to$ACDC benchmark. CISS is shown to generalize well to domains unseen during training, such as BDD100K-night and ACDC-night. Code is publicly available at https://github.com/SysCV/CISS .
Related papers
- Physically Feasible Semantic Segmentation [58.17907376475596]
State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion.
Our method, Physically Feasible Semantic (PhyFea), extracts explicit physical constraints that govern spatial class relations.
PhyFea yields significant performance improvements in mIoU over each state-of-the-art network we use.
arXiv Detail & Related papers (2024-08-26T22:39:08Z) - Intra- & Extra-Source Exemplar-Based Style Synthesis for Improved Domain
Generalization [21.591831983223997]
We propose an exemplar-based style synthesis pipeline to improve domain generalization in semantic segmentation.
Our method is based on a novel masked noise encoder for StyleGAN2 inversion.
We achieve up to $12.4%$ mIoU improvements on driving-scene semantic segmentation under different types of data shifts.
arXiv Detail & Related papers (2023-07-02T19:56:43Z) - Contrastive Model Adaptation for Cross-Condition Robustness in Semantic
Segmentation [58.17907376475596]
We investigate normal-to-adverse condition model adaptation for semantic segmentation.
Our method -- CMA -- leverages such image pairs to learn condition-invariant features via contrastive learning.
We achieve state-of-the-art semantic segmentation performance for model adaptation on several normal-to-adverse adaptation benchmarks.
arXiv Detail & Related papers (2023-03-09T11:48:29Z) - P{\O}DA: Prompt-driven Zero-shot Domain Adaptation [27.524962843495366]
We adapt a model trained on a source domain using only a general description in natural language of the target domain, i.e., a prompt.
We show that these prompt-driven augmentations can be used to perform zero-shot domain adaptation for semantic segmentation.
arXiv Detail & Related papers (2022-12-06T18:59:58Z) - Refign: Align and Refine for Adaptation of Semantic Segmentation to
Adverse Conditions [78.71745819446176]
Refign is a generic extension to self-training-based UDA methods which leverages cross-domain correspondences.
Refign consists of two steps: (1) aligning the normal-condition image to the corresponding adverse-condition image using an uncertainty-aware dense matching network, and (2) refining the adverse prediction with the normal prediction using an adaptive label correction mechanism.
The approach introduces no extra training parameters, minimal computational overhead -- during training only -- and can be used as a drop-in extension to improve any given self-training-based UDA method.
arXiv Detail & Related papers (2022-07-14T11:30:38Z) - DGSS : Domain Generalized Semantic Segmentation using Iterative Style
Mining and Latent Representation Alignment [38.05196030226661]
Current state-of-the-art (SoTA) have proposed different mechanisms to bridge the domain gap, but they still perform poorly in low illumination conditions.
We propose a two-step framework wherein we first identify an adversarial style that maximizes the domain gap between stylized and source images.
We then propose a style mixing mechanism wherein the same objects from different styles are mixed to construct a new training image.
arXiv Detail & Related papers (2022-02-26T13:54:57Z) - HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning [74.76431541169342]
Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones.
We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains.
Experiments on four benchmark datasets demonstrate HSVA achieves superior performance on both conventional and generalized ZSL.
arXiv Detail & Related papers (2021-09-30T14:27:50Z) - Self-supervised Augmentation Consistency for Adapting Semantic
Segmentation [56.91850268635183]
We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate.
We employ standard data augmentation techniques $-$ photometric noise, flipping and scaling $-$ and ensure consistency of the semantic predictions.
We achieve significant improvements of the state-of-the-art segmentation accuracy after adaptation, consistent both across different choices of the backbone architecture and adaptation scenarios.
arXiv Detail & Related papers (2021-04-30T21:32:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.