Related papers: Bridging Granularity Gaps: Hierarchical Semantic Learning for Cross-domain Few-shot Segmentation

Bridging Granularity Gaps: Hierarchical Semantic Learning for Cross-domain Few-shot Segmentation

URL: http://arxiv.org/abs/2511.12200v1
Date: Sat, 15 Nov 2025 13:13:26 GMT
Title: Bridging Granularity Gaps: Hierarchical Semantic Learning for Cross-domain Few-shot Segmentation
Authors: Sujun Sun, Haowen Gu, Cheng Xie, Yanxu Ren, Mingwu Ren, Haofeng Zhang,
Abstract summary: Cross-domain Few-shot (CD-FSS) aims to segment novel classes from target domains that are not involved in training.<n>We propose a Hierarchical Semantic Learning framework to tackle this problem.
Score: 10.442470344182993
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Cross-domain Few-shot Segmentation (CD-FSS) aims to segment novel classes from target domains that are not involved in training and have significantly different data distributions from the source domain, using only a few annotated samples, and recent years have witnessed significant progress on this task. However, existing CD-FSS methods primarily focus on style gaps between source and target domains while ignoring segmentation granularity gaps, resulting in insufficient semantic discriminability for novel classes in target domains. Therefore, we propose a Hierarchical Semantic Learning (HSL) framework to tackle this problem. Specifically, we introduce a Dual Style Randomization (DSR) module and a Hierarchical Semantic Mining (HSM) module to learn hierarchical semantic features, thereby enhancing the model's ability to recognize semantics at varying granularities. DSR simulates target domain data with diverse foreground-background style differences and overall style variations through foreground and global style randomization respectively, while HSM leverages multi-scale superpixels to guide the model to mine intra-class consistency and inter-class distinction at different granularities. Additionally, we also propose a Prototype Confidence-modulated Thresholding (PCMT) module to mitigate segmentation ambiguity when foreground and background are excessively similar. Extensive experiments are conducted on four popular target domain datasets, and the results demonstrate that our method achieves state-of-the-art performance.

Related papers

Exploring Semantic Consistency and Style Diversity for Domain Generalized Semantic Segmentation [4.850207292777464]
Domain Generalized Semantic aims to enhance the generalization of semantic segmentation across unknown target domains.<n>We introduce SCSD for Semantic Consistency prediction and Style Diversity generalization.<n>SCSD significantly outperforms existing state-of-theart methods.
arXiv Detail & Related papers (2024-12-16T18:20:06Z)
Multisource Collaborative Domain Generalization for Cross-Scene Remote Sensing Image Classification [57.945437355714155]
Cross-scene image classification aims to transfer prior knowledge of ground materials to annotate regions with different distributions.<n>Existing approaches focus on single-source domain generalization to unseen target domains.<n>We propose a novel multi-source collaborative domain generalization framework (MS-CDG) based on homogeneity and heterogeneity characteristics of multi-source remote sensing data.
arXiv Detail & Related papers (2024-12-05T06:15:08Z)
ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation [0.8213829427624407]
Domain adaptive semantic segmentation aims to generate accurate and dense predictions for an unlabeled target domain. We propose the ProtoGMM model, which incorporates the GMM into contrastive losses to perform guided contrastive learning. To achieve increased intra-class semantic similarity, decreased inter-class similarity, and domain alignment between the source and target domains, we employ multi-prototype contrastive learning.
arXiv Detail & Related papers (2024-06-27T14:50:50Z)
StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization [85.18995948334592]
Single domain generalization (single DG) aims at learning a robust model generalizable to unseen domains from only one training domain. State-of-the-art approaches have mostly relied on data augmentations, such as adversarial perturbation and style enhancement, to synthesize new data. We propose emphStyDeSty, which explicitly accounts for the alignment of the source and pseudo domains in the process of data augmentation.
arXiv Detail & Related papers (2024-06-01T02:41:34Z)
Generalized Semantic Segmentation by Self-Supervised Source Domain Projection and Multi-Level Contrastive Learning [79.0660895390689]
Deep networks trained on the source domain show degraded performance when tested on unseen target domain data. We propose a Domain Projection and Contrastive Learning (DPCL) approach for generalized semantic segmentation.
arXiv Detail & Related papers (2023-03-03T13:07:14Z)
High-level semantic feature matters few-shot unsupervised domain adaptation [15.12545632709954]
We propose a novel task-specific semantic feature learning method (TSECS) for FS-UDA. TSECS learns high-level semantic features for image-to-class similarity measurement. We show that the proposed method significantly outperforms SOTA methods in FS-UDA by a large margin.
arXiv Detail & Related papers (2023-01-05T08:39:52Z)
Distribution Regularized Self-Supervised Learning for Domain Adaptation of Semantic Segmentation [3.284878354988896]
This paper proposes a pixel-level distribution regularization scheme (DRSL) for self-supervised domain adaptation of semantic segmentation. In a typical setting, the classification loss forces the semantic segmentation model to greedily learn the representations that capture inter-class variations. We capture pixel-level intra-class variations through class-aware multi-modal distribution learning.
arXiv Detail & Related papers (2022-06-20T09:52:49Z)
Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion (CDMS) algorithm. Our model factorizes the source and target data into distinct multi-layer feature spaces. A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z)
Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation [102.42638795864178]
We propose a principled meta-learning based approach to OCDA for semantic segmentation. We cluster target domain into multiple sub-target domains by image styles, extracted in an unsupervised manner. A meta-learner is thereafter deployed to learn to fuse sub-target domain-specific predictions, conditioned upon the style code. We learn to online update the model by model-agnostic meta-learning (MAML) algorithm, thus to further improve generalization.
arXiv Detail & Related papers (2020-12-15T13:21:54Z)
MADAN: Multi-source Adversarial Domain Aggregation Network for Domain Adaptation [58.38749495295393]
Domain adaptation aims to learn a transferable model to bridge the domain shift between one labeled source domain and another sparsely labeled or unlabeled target domain. Recent multi-source domain adaptation (MDA) methods do not consider the pixel-level alignment between sources and target. We propose a novel MDA framework to address these challenges.
arXiv Detail & Related papers (2020-02-19T21:22:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.