Related papers: SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation

SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation

URL: http://arxiv.org/abs/2504.06389v1
Date: Tue, 08 Apr 2025 19:14:34 GMT
Title: SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation
Authors: Hritam Basak, Zhaozheng Yin,
Abstract summary: We propose a language-guided Semi-supervised Domain Adaptation (SSDA) setting for semantic segmentation.<n>We harness the semantic generalization capabilities inherent in vision-language models (VLMs) to establish a synergistic framework.<n>Our approach demonstrates substantial performance improvements over contemporary state-of-the-art (SoTA) methodologies.
Score: 9.311853182451289
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Domain Adaptation (DA) and Semi-supervised Learning (SSL) converge in Semi-supervised Domain Adaptation (SSDA), where the objective is to transfer knowledge from a source domain to a target domain using a combination of limited labeled target samples and abundant unlabeled target data. Although intuitive, a simple amalgamation of DA and SSL is suboptimal in semantic segmentation due to two major reasons: (1) previous methods, while able to learn good segmentation boundaries, are prone to confuse classes with similar visual appearance due to limited supervision; and (2) skewed and imbalanced training data distribution preferring source representation learning whereas impeding from exploring limited information about tailed classes. Language guidance can serve as a pivotal semantic bridge, facilitating robust class discrimination and mitigating visual ambiguities by leveraging the rich semantic relationships encoded in pre-trained language models to enhance feature representations across domains. Therefore, we propose the first language-guided SSDA setting for semantic segmentation in this work. Specifically, we harness the semantic generalization capabilities inherent in vision-language models (VLMs) to establish a synergistic framework within the SSDA paradigm. To address the inherent class-imbalance challenges in long-tailed distributions, we introduce class-balanced segmentation loss formulations that effectively regularize the learning process. Through extensive experimentation across diverse domain adaptation scenarios, our approach demonstrates substantial performance improvements over contemporary state-of-the-art (SoTA) methodologies. Code is available: \href{https://github.com/hritam-98/SemiDAViL}{GitHub}.

Related papers

IGL-DT: Iterative Global-Local Feature Learning with Dual-Teacher Semantic Segmentation Framework under Limited Annotation Scheme [3.440487702095727]
Semi-Supervised Semantic (SSSS) aims to improve segmentation accuracy by leveraging a small set of labeled images alongside a larger pool of unlabeled data. We propose a novel tri-branch semi-supervised segmentation framework incorporating a dual-teacher strategy, named IGL-DT. Our approach employs SwinUnet for high-level semantic guidance through Global Context Learning and ResUnet for detailed feature refinement via Local Regional Learning.
arXiv Detail & Related papers (2025-04-14T01:51:29Z)
VLMs meet UDA: Boosting Transferability of Open Vocabulary Segmentation with Unsupervised Domain Adaptation [3.776249047528669]
This paper proposes enhancing segmentation accuracy across diverse domains by integrating Vision-Language reasoning with key strategies for Unsupervised Domain Adaptation (UDA)<n>We improve the fine-grained segmentation capabilities of VLMs through multi-scale contextual data, robust text embeddings with prompt augmentation, and layer-wise fine-tuning in our proposed Foundational-Retaining Open Vocabulary (FROVSS) framework.<n>The resulting UDA-FROV framework is the first UDA approach to effectively adapt across domains without requiring shared categories.
arXiv Detail & Related papers (2024-12-12T12:49:42Z)
Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation [80.1412989006262]
Domain adaptive semantic segmentation aims to transfer knowledge from a labeled source domain to an unlabeled target domain. We propose T2S-DA, which we interpret as a form of pulling Target to Source for Domain Adaptation.
arXiv Detail & Related papers (2023-05-23T07:09:09Z)
Generalized Semantic Segmentation by Self-Supervised Source Domain Projection and Multi-Level Contrastive Learning [79.0660895390689]
Deep networks trained on the source domain show degraded performance when tested on unseen target domain data. We propose a Domain Projection and Contrastive Learning (DPCL) approach for generalized semantic segmentation.
arXiv Detail & Related papers (2023-03-03T13:07:14Z)
Transferrable Contrastive Learning for Visual Domain Adaptation [108.98041306507372]
Transferrable Contrastive Learning (TCL) is a self-supervised learning paradigm tailored for domain adaptation. TCL penalizes cross-domain intra-class domain discrepancy between source and target through a clean and novel contrastive loss. The free lunch is, thanks to the incorporation of contrastive learning, TCL relies on a moving-averaged key encoder that naturally achieves a temporally ensembled version of pseudo labels for target data.
arXiv Detail & Related papers (2021-12-14T16:23:01Z)
Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains. We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
Semi-supervised Domain Adaptation for Semantic Segmentation [3.946367634483361]
We propose a novel two-step semi-supervised dual-domain adaptation (SSDDA) approach to address both cross- and intra-domain gaps in semantic segmentation. We demonstrate that the proposed approach outperforms state-of-the-art methods on two common synthetic-to-real semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-20T16:13:00Z)
HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning [74.76431541169342]
Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones. We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains. Experiments on four benchmark datasets demonstrate HSVA achieves superior performance on both conventional and generalized ZSL.
arXiv Detail & Related papers (2021-09-30T14:27:50Z)
Unsupervised Domain Adaptation for Semantic Segmentation via Low-level Edge Information Transfer [27.64947077788111]
Unsupervised domain adaptation for semantic segmentation aims to make models trained on synthetic data adapt to real images. Previous feature-level adversarial learning methods only consider adapting models on the high-level semantic features. We present the first attempt at explicitly using low-level edge information, which has a small inter-domain gap, to guide the transfer of semantic information.
arXiv Detail & Related papers (2021-09-18T11:51:31Z)
A Curriculum-style Self-training Approach for Source-Free Semantic Segmentation [91.13472029666312]
We propose a curriculum-style self-training approach for source-free domain adaptive semantic segmentation. Our method yields state-of-the-art performance on source-free semantic segmentation tasks for both synthetic-to-real and adverse conditions.
arXiv Detail & Related papers (2021-06-22T10:21:39Z)
Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation [97.8552697905657]
A key challenge of this task is how to alleviate the data distribution discrepancy between the source and target domains. We propose Alleviating Semantic-level Shift (ASS), which can successfully promote the distribution consistency from both global and local views. We apply our ASS to two domain adaptation tasks, from GTA5 to Cityscapes and from Synthia to Cityscapes.
arXiv Detail & Related papers (2020-04-02T03:25:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.