Related papers: DGMamba: Domain Generalization via Generalized State Space Model

URL: http://arxiv.org/abs/2404.07794v2
Date: Thu, 9 May 2024 13:30:37 GMT
Title: DGMamba: Domain Generalization via Generalized State Space Model
Authors: Shaocong Long, Qianyu Zhou, Xiangtai Li, Xuequan Lu, Chenhao Ying, Yuan Luo, Lizhuang Ma, Shuicheng Yan
Abstract summary: Domain generalization (DG) aims at solving distribution shift problems in various scenes. Mamba, as an emerging state space model (SSM), possesses superior linear complexity and global receptive fields. We propose a novel framework for DG, named DGMamba, that excels in strong generalizability toward unseen domains.
Score: 80.82253601531164
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Domain generalization (DG) aims at solving distribution shift problems in various scenes. Existing approaches are based on Convolutional Neural Networks (CNNs) or Vision Transformers (ViTs), which suffer from limited receptive fields or quadratic complexity. Mamba, as an emerging state space model (SSM), possesses superior linear complexity and global receptive fields. Despite this, it can hardly be applied to DG to address distribution shifts, due to the hidden state issues and inappropriate scan mechanisms. In this paper, we propose a novel framework for DG, named DGMamba, that excels in strong generalizability toward unseen domains while retaining the advantages of global receptive fields and efficient linear complexity. Our DGMamba comprises two core components: Hidden State Suppressing (HSS) and Semantic-aware Patch Refining (SPR). In particular, HSS is introduced to mitigate the influence of hidden states associated with domain-specific features during output prediction. SPR strives to encourage the model to concentrate more on objects rather than context, and consists of two designs: Prior-Free Scanning (PFS) and Domain Context Interchange (DCI). Concretely, PFS aims to shuffle the non-semantic patches within images, creating more flexible and effective sequences from images, and DCI is designed to regularize Mamba with the combination of mismatched non-semantic and semantic information by fusing patches among domains. Extensive experiments on four commonly used DG benchmarks demonstrate that the proposed DGMamba achieves remarkably superior results to state-of-the-art models. The code will be made publicly available.
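
The two patch-level ideas in the abstract (shuffling non-semantic patches within an image, and mixing context patches across domains) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `shuffle_non_semantic` and `interchange_context` are hypothetical, patches are treated as opaque list items, and the semantic mask is assumed to be given, since the abstract does not say how semantic patches are identified or how patches are fused.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def shuffle_non_semantic(
    patches: Sequence[T],
    is_semantic: Sequence[bool],
    seed: Optional[int] = None,
) -> List[T]:
    """PFS-style sketch: permute context patches, keep object patches in place.

    `is_semantic[i]` is an assumed, externally supplied flag marking patch i
    as belonging to the object (True) or to the background/context (False).
    """
    if len(patches) != len(is_semantic):
        raise ValueError("patches and is_semantic must have the same length")
    rng = random.Random(seed)
    # Positions of the patches flagged as context.
    context_idx = [i for i, keep in enumerate(is_semantic) if not keep]
    # A random permutation of those same positions.
    shuffled_idx = context_idx[:]
    rng.shuffle(shuffled_idx)
    out = list(patches)
    for dst, src in zip(context_idx, shuffled_idx):
        out[dst] = patches[src]
    return out


def interchange_context(
    patches_a: Sequence[T],
    patches_b: Sequence[T],
    is_semantic_a: Sequence[bool],
) -> List[T]:
    """DCI-style sketch: keep A's object patches, take context patches from B.

    Assumes both images are split into the same number of patches and that
    image B comes from a different domain than image A. Plain replacement is
    used here as a stand-in for whatever fusion the paper applies.
    """
    if not (len(patches_a) == len(patches_b) == len(is_semantic_a)):
        raise ValueError("all inputs must have the same length")
    return [
        a if keep else b
        for a, b, keep in zip(patches_a, patches_b, is_semantic_a)
    ]


if __name__ == "__main__":
    patches = ["bg0", "obj1", "bg2", "obj3", "bg4"]
    mask = [False, True, False, True, False]
    print(shuffle_non_semantic(patches, mask, seed=0))
    other = ["x0", "x1", "x2", "x3", "x4"]
    print(interchange_context(patches, other, mask))
```

In both functions the object patches stay at their original positions, so only the context surrounding them changes. That matches the stated goal of making the model rely on objects rather than context.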

Related papers

Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety. Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z)
START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation [27.301312891532277]
Domain Generalization (DG) aims to enable models to generalize to unseen target domains by learning from multiple source domains. We propose START, which achieves state-of-the-art (SOTA) performance and offers a competitive alternative to CNNs and ViTs. Our START can selectively perturb and suppress domain-specific features in salient tokens within the input-dependent matrices of SSMs, thus effectively reducing the discrepancy between different domains.
arXiv Detail & Related papers (2024-10-21T13:50:32Z)
PointDGMamba: Domain Generalization of Point Cloud Classification via Generalized State Space Model [77.00221501105788]
Domain Generalization (DG) has been recently explored to improve the generalizability of point cloud classification (PCC) models toward unseen domains. We present the first work that studies the generalizability of state space models (SSMs) in DG PCC. We propose a novel framework, PointDGMamba, that excels in strong generalizability toward unseen domains.
arXiv Detail & Related papers (2024-08-24T12:53:48Z)
Disentangling Masked Autoencoders for Unsupervised Domain Generalization [57.56744870106124]
Unsupervised domain generalization is fast gaining attention but is still far from well-studied. Disentangled Masked Autoencoder (DisMAE) aims to discover the disentangled representations that faithfully reveal intrinsic features. DisMAE co-trains the asymmetric dual-branch architecture with semantic and lightweight variation encoders.
arXiv Detail & Related papers (2024-07-10T11:11:36Z)
Semantic-Aware Domain Generalized Segmentation [67.49163582961877]
Deep models trained on a source domain lack generalization when evaluated on unseen target domains with different data distributions. We propose a framework including two novel modules: Semantic-Aware Normalization (SAN) and Semantic-Aware Whitening (SAW). Our approach shows significant improvements over existing state-of-the-art on various backbone networks.
arXiv Detail & Related papers (2022-04-02T09:09:59Z)
Compound Domain Generalization via Meta-Knowledge Encoding [55.22920476224671]
We introduce Style-induced Domain-specific Normalization (SDNorm) to re-normalize the multi-modal underlying distributions. We harness the prototype representations, the centroids of classes, to perform relational modeling in the embedding space. Experiments on four standard Domain Generalization benchmarks reveal that COMEN exceeds the state-of-the-art performance without the need for domain supervision.
arXiv Detail & Related papers (2022-03-24T11:54:59Z)
SAND-mask: An Enhanced Gradient Masking Strategy for the Discovery of Invariances in Domain Generalization [7.253255826783766]
We propose a masking strategy that determines a continuous weight based on the agreement of the gradients that flow along each edge of the network. SAND-mask is validated on the DomainBed benchmark for domain generalization.
arXiv Detail & Related papers (2021-06-04T05:20:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
