LangDA: Building Context-Awareness via Language for Domain Adaptive Semantic Segmentation
- URL: http://arxiv.org/abs/2503.12780v1
- Date: Mon, 17 Mar 2025 03:33:28 GMT
- Title: LangDA: Building Context-Awareness via Language for Domain Adaptive Semantic Segmentation
- Authors: Chang Liu, Bavesh Balaji, Saad Hossain, C Thomas, Kwei-Herng Lai, Raviteja Vemulapalli, Alexander Wong, Sirisha Rambhatla
- Abstract summary: Unsupervised domain adaptation for semantic segmentation aims to transfer knowledge from a label-rich source domain to a target domain with no labels. LangDA addresses these challenges by learning contextual relationships between objects via VLM-generated scene descriptions. LangDA sets the new state-of-the-art across three DASS benchmarks, outperforming existing methods by 2.6%, 1.4% and 3.9%.
- Score: 69.13257545389781
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Unsupervised domain adaptation for semantic segmentation (DASS) aims to transfer knowledge from a label-rich source domain to a target domain with no labels. Two key approaches in DASS are (1) vision-only approaches using masking or multi-resolution crops, and (2) language-based approaches that use generic class-wise prompts informed by target domain (e.g. "a {snowy} photo of a {class}"). However, the former is susceptible to noisy pseudo-labels that are biased to the source domain. The latter does not fully capture the intricate spatial relationships of objects -- key for dense prediction tasks. To this end, we propose LangDA. LangDA addresses these challenges by, first, learning contextual relationships between objects via VLM-generated scene descriptions (e.g. "a pedestrian is on the sidewalk, and the street is lined with buildings."). Second, LangDA aligns the entire image features with text representation of this context-aware scene caption and learns generalized representations via text. With this, LangDA sets the new state-of-the-art across three DASS benchmarks, outperforming existing methods by 2.6%, 1.4% and 3.9%.
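To make the alignment step concrete, here is a minimal PyTorch sketch: pool the backbone's dense features, project them into the text-embedding space, and pull them toward the embedding of the VLM-generated scene caption. The pooling choice, projection head, and cosine loss are illustrative assumptions, not LangDA's exact implementation.

```python
# Minimal sketch of image-caption alignment (illustrative, not LangDA's
# exact recipe): pool dense segmentation features, project them into the
# text-embedding space, and pull them toward the embedding of the
# VLM-generated, context-aware scene caption.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CaptionAlignment(nn.Module):
    def __init__(self, feat_dim: int, text_dim: int):
        super().__init__()
        # Hypothetical projection head from backbone to text space.
        self.proj = nn.Linear(feat_dim, text_dim)

    def forward(self, image_feats: torch.Tensor,
                caption_emb: torch.Tensor) -> torch.Tensor:
        # image_feats: (B, C, H, W) dense features from the segmentation
        # backbone; caption_emb: (B, D) text embedding of a scene caption
        # such as "a pedestrian is on the sidewalk, ...".
        pooled = image_feats.mean(dim=(2, 3))         # (B, C) global pool
        img = F.normalize(self.proj(pooled), dim=-1)  # (B, D) unit norm
        txt = F.normalize(caption_emb, dim=-1)        # (B, D) unit norm
        # One minus cosine similarity, averaged over the batch.
        return (1.0 - (img * txt).sum(dim=-1)).mean()

# Dummy usage: a 512-d backbone and a 768-d text encoder (CLIP-sized).
align = CaptionAlignment(feat_dim=512, text_dim=768)
loss = align(torch.randn(2, 512, 32, 32), torch.randn(2, 768))
```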
Related papers
- MoDA: Leveraging Motion Priors from Videos for Advancing Unsupervised Domain Adaptation in Semantic Segmentation [61.4598392934287]
This study introduces a different UDA scenario in which the target domain contains unlabeled video frames.
We design a Motion-guided Domain Adaptive semantic segmentation framework (MoDA).
MoDA harnesses self-supervised object motion cues to facilitate cross-domain alignment for the segmentation task.
arXiv Detail & Related papers (2023-09-21T01:31:54Z) - Dual-level Interaction for Domain Adaptive Semantic Segmentation [0.0]
We propose a dual-level interaction for domain adaptation (DIDA) in semantic segmentation.
Explicitly, we encourage different augmented views of the same pixel to have similar class predictions (a minimal sketch of this idea appears after this list).
Our method outperforms the state-of-the-art by a notable margin, especially on confusing and long-tailed classes.
arXiv Detail & Related papers (2023-07-16T07:51:18Z) - Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation [80.1412989006262]
Domain adaptive semantic segmentation aims to transfer knowledge from a labeled source domain to an unlabeled target domain.
We propose T2S-DA, which we interpret as a form of pulling Target to Source for Domain Adaptation.
arXiv Detail & Related papers (2023-05-23T07:09:09Z) - A Two-Stage Framework with Self-Supervised Distillation For Cross-Domain Text Classification [46.47734465505251]
Cross-domain text classification aims to adapt models to a target domain that lacks labeled data.
We propose a two-stage framework for cross-domain text classification.
arXiv Detail & Related papers (2023-04-18T06:21:40Z) - I2F: A Unified Image-to-Feature Approach for Domain Adaptive Semantic Segmentation [55.633859439375044]
Unsupervised domain adaptation (UDA) for semantic segmentation is a promising task freeing people from heavy annotation work.
The key idea to tackle this problem is to perform image-level and feature-level adaptation jointly.
This paper proposes a novel UDA pipeline for semantic segmentation that unifies image-level and feature-level adaptation.
arXiv Detail & Related papers (2023-01-03T15:19:48Z) - HYLDA: End-to-end Hybrid Learning Domain Adaptation for LiDAR Semantic Segmentation [13.87939140266266]
This paper addresses the problem of training a LiDAR semantic segmentation network using a fully-labeled source dataset and a target dataset that only has a small number of labels.
We develop a novel image-to-image translation engine, and couple it with a LiDAR semantic segmentation network, resulting in an integrated domain adaptation architecture we call HYLDA.
arXiv Detail & Related papers (2022-01-14T18:13:09Z) - LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation [7.629717457706323]
The LoveDA dataset contains 5987 HSR images with 166 annotated objects from three different cities.
The dataset is suitable for both land-cover semantic segmentation and unsupervised domain adaptation (UDA) tasks.
arXiv Detail & Related papers (2021-10-17T06:12:48Z) - Discover, Hallucinate, and Adapt: Open Compound Domain Adaptation for Semantic Segmentation [91.30558794056056]
Unsupervised domain adaptation (UDA) for semantic segmentation has been attracting attention recently.
We present a novel framework based on three main design principles: discover, hallucinate, and adapt.
We evaluate our solution on the standard GTA-to-C-driving benchmark and achieve new state-of-the-art results.
arXiv Detail & Related papers (2021-10-08T13:20:09Z) - mDALU: Multi-Source Domain Adaptation and Label Unification with Partial Datasets [102.62639692656458]
This paper treats learning from multiple partially-annotated datasets as a multi-source domain adaptation and label unification problem.
Our method consists of a partially-supervised adaptation stage and a fully-supervised adaptation stage.
We verify the method on three different tasks: image classification, 2D semantic image segmentation, and joint 2D-3D semantic segmentation.
arXiv Detail & Related papers (2020-12-15T15:58:03Z)
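As referenced in the Dual-level Interaction (DIDA) entry above, a pixel-level consistency loss between augmented views can be sketched as below. This minimal version assumes photometric-only augmentations (so pixels stay in correspondence) and a KL objective; it is an illustrative stand-in, not DIDA's actual dual-level scheme.

```python
# Minimal sketch of pixel-wise prediction consistency between two augmented
# views of the same image (illustrative stand-in for DIDA's dual-level
# interaction). Assumes photometric-only augmentations so pixel (i, j) in
# view A corresponds to pixel (i, j) in view B.
import torch
import torch.nn.functional as F

def pixel_consistency_loss(logits_a: torch.Tensor,
                           logits_b: torch.Tensor) -> torch.Tensor:
    # logits_a, logits_b: (B, K, H, W) per-pixel class logits for the two
    # views; K is the number of semantic classes.
    log_p_a = F.log_softmax(logits_a, dim=1)
    p_b = F.softmax(logits_b, dim=1)
    # KL(view B || view A), summed over classes and pixels, batch-averaged.
    return F.kl_div(log_p_a, p_b, reduction="batchmean")

# Dummy usage with 19 Cityscapes-style classes.
loss = pixel_consistency_loss(torch.randn(2, 19, 64, 64),
                              torch.randn(2, 19, 64, 64))
```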