Target-Oriented Single Domain Generalization
- URL: http://arxiv.org/abs/2509.00351v1
- Date: Sat, 30 Aug 2025 04:21:48 GMT
- Title: Target-Oriented Single Domain Generalization
- Authors: Marzi Heidari, Yuhong Guo
- Abstract summary: Deep models trained on a single source domain often fail catastrophically under distribution shifts. We propose Target-Oriented Single Domain Generalization, a novel problem setup that leverages the textual description of the target domain. We introduce Spectral TARget Alignment (STAR), a module that injects target semantics into source features.
- Score: 27.182037614828968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep models trained on a single source domain often fail catastrophically under distribution shifts, a critical challenge in Single Domain Generalization (SDG). While existing methods focus on augmenting source data or learning invariant features, they neglect a readily available resource: textual descriptions of the target deployment environment. We propose Target-Oriented Single Domain Generalization (TO-SDG), a novel problem setup that leverages the textual description of the target domain, without requiring any target data, to guide model generalization. To address TO-SDG, we introduce Spectral TARget Alignment (STAR), a lightweight module that injects target semantics into source features by exploiting visual-language models (VLMs) such as CLIP. STAR uses a target-anchored subspace derived from the text embedding of the target description to recenter image features toward the deployment domain, then utilizes spectral projection to retain directions aligned with target cues while discarding source-specific noise. Moreover, we use a vision-language distillation to align backbone features with VLM's semantic geometry. STAR further employs feature-space Mixup to ensure smooth transitions between source and target-oriented representations. Experiments across various image classification and object detection benchmarks demonstrate STAR's superiority. This work establishes that minimal textual metadata, which is a practical and often overlooked resource, significantly enhances generalization under severe data constraints, opening new avenues for deploying robust models in target environments with unseen data.
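The three feature-level steps named in the abstract (recentering toward a text anchor, spectral projection, and feature-space Mixup) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Random vectors stand in for CLIP image and text embeddings, and the helper names (`recenter`, `spectral_project`, `feature_mixup`), the shift strength `alpha`, the rank `k`, the Beta parameter `beta`, and the particular subspace construction (text-anchor direction plus the top singular directions of the remaining feature variation) are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def recenter(feats, text_anchor, alpha=0.5):
    """Move the feature mean a fraction alpha of the way to the text anchor.
    The linear shift is an assumed form, not taken from the paper."""
    shift = text_anchor - feats.mean(axis=0)
    return feats + alpha * shift

def spectral_project(feats, text_anchor, k=8):
    """Project features onto a k-dimensional subspace spanned by the anchor
    direction and the top (k - 1) singular directions of the variation
    orthogonal to it (hypothetical construction)."""
    a = l2_normalize(text_anchor)
    centered = feats - feats.mean(axis=0)
    # Remove the anchor component so the SVD finds complementary directions.
    residual = centered - np.outer(centered @ a, a)
    _, _, vt = np.linalg.svd(residual, full_matrices=False)
    basis = np.vstack([a, vt[: k - 1]]).T   # shape (d, k)
    q, _ = np.linalg.qr(basis)              # re-orthonormalize for safety
    return feats @ q @ q.T, q

def feature_mixup(src, tgt, rng, beta=0.4):
    """Convex combination of source and target-oriented features."""
    lam = rng.beta(beta, beta)
    return lam * src + (1.0 - lam) * tgt, lam

rng = np.random.default_rng(0)
src_feats = l2_normalize(rng.normal(size=(32, 64)))  # stand-in image features
text_anchor = l2_normalize(rng.normal(size=64))      # stand-in text embedding

shifted = recenter(src_feats, text_anchor, alpha=0.5)
projected, basis = spectral_project(shifted, text_anchor, k=8)
mixed, lam = feature_mixup(src_feats, projected, rng)
```

In a real setting the stand-in arrays would be replaced by embeddings from a vision-language model, and the mixed features would feed the downstream classifier or detector head; how the paper combines these steps with its distillation loss is not specified in the abstract and is not modeled here.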
Related papers
- A Turn Toward Better Alignment: Few-Shot Generative Adaptation with Equivariant Feature Rotation [67.2019317630466]
Few-shot image generation aims to effectively adapt a source generative model to a target domain using very few training images. We propose Equivariant Feature Rotation (EFR), a novel adaptation strategy that aligns source and target domains at two complementary levels. Our method significantly enhances the generative performance within the targeted domain.
arXiv Detail & Related papers (2025-12-24T13:48:22Z)
- Unbiased Semantic Decoding with Vision Foundation Models for Few-shot Segmentation [36.731980769369834]
We propose an Unbiased Semantic Decoding (USD) strategy integrated with the Segment Anything Model (SAM). The USD strategy extracts target information from both the support and query sets simultaneously to perform consistent predictions. To generate target-focused prompt embeddings, a learnable visual-text target prompt generator is proposed.
arXiv Detail & Related papers (2025-11-19T04:41:43Z)
- Object Style Diffusion for Generalized Object Detection in Urban Scene [69.04189353993907]
We introduce a novel single-domain object detection generalization method, named GoDiff. By integrating pseudo-target domain data with source domain data, we diversify the training dataset. Experimental results demonstrate that our method not only enhances the generalization ability of existing detectors but also functions as a plug-and-play enhancement for other single-domain generalization methods.
arXiv Detail & Related papers (2024-12-18T13:03:00Z)
- GlocalCLIP: Object-agnostic Global-Local Prompt Learning for Zero-shot Anomaly Detection [5.530212768657544]
We introduce glocal contrastive learning to improve the complementary learning of global and local prompts. The generalization performance of GlocalCLIP in ZSAD was demonstrated on 15 real-world datasets.
arXiv Detail & Related papers (2024-11-09T05:22:13Z)
- Boosting Weakly-Supervised Referring Image Segmentation via Progressive Comprehension [40.21084218601082]
This paper focuses on a challenging setup where target localization is learned directly from image-text pairs. We propose a novel Progressive Network (PCNet) to leverage target-related textual cues for progressively localizing the target object. Our method outperforms SOTA methods on three common benchmarks.
arXiv Detail & Related papers (2024-10-02T13:30:32Z)
- Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection [109.58348694132091]
Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains.
This is a practical yet challenging task as it requires the model to address domain shift without incorporating target domain data into training.
We propose a novel phrase grounding-based style transfer approach for the task.
arXiv Detail & Related papers (2024-02-02T10:48:43Z)
- Adaptive Semantic Consistency for Cross-domain Few-shot Classification [27.176106714652327]
Cross-domain few-shot classification (CD-FSC) aims to identify novel target classes with a few samples.
We propose a simple plug-and-play Adaptive Semantic Consistency framework, which improves cross-domain robustness.
The proposed ASC enables explicit transfer of source domain knowledge to prevent the model from overfitting the target domain.
arXiv Detail & Related papers (2023-08-01T15:37:19Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- CLIP the Gap: A Single Domain Generalization Approach for Object Detection [60.20931827772482]
Single Domain Generalization tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain.
We propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts.
We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss.
arXiv Detail & Related papers (2023-01-13T12:01:18Z)
- Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection [79.89082006155135]
Unsupervised Domain Adaptation (UDA) is an effective approach to tackle the issue of domain shift.
UDA methods try to align the source and target representations to improve the generalization on the target domain.
The Source-Free Domain Adaptation (SFDA) setting aims to alleviate these concerns by adapting a source-trained model for the target domain without requiring access to the source data.
arXiv Detail & Related papers (2022-03-29T17:50:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.