Related papers: Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts

Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts

URL: http://arxiv.org/abs/2510.19487v1
Date: Wed, 22 Oct 2025 11:24:52 GMT
Title: Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts
Authors: Chen Li, Huiying Xu, Changxin Gao, Zeyu Wang, Yun Liu, Xinzhong Zhu,
Abstract summary: Single-source Domain Generalized Object Detection is a cutting-edge research topic in computer vision.<n>Causal Visual Prompts method mitigates bias from spurious features by integrating visual prompts with cross-attention.<n>Causal achieves state-of-the-art performance with 15.9-31.4% gains over existing domain generalization methods.
Score: 37.886574666175065
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Single-source Domain Generalized Object Detection (SDGOD), as a cutting-edge research topic in computer vision, aims to enhance model generalization capability in unseen target domains through single-source domain training. Current mainstream approaches attempt to mitigate domain discrepancies via data augmentation techniques. However, due to domain shift and limited domain-specific knowledge, models tend to fall into the pitfall of spurious correlations. This manifests as the model's over-reliance on simplistic classification features (e.g., color) rather than essential domain-invariant representations like object contours. To address this critical challenge, we propose the Cauvis (Causal Visual Prompts) method. First, we introduce a Cross-Attention Prompts module that mitigates bias from spurious features by integrating visual prompts with cross-attention. To address the inadequate domain knowledge coverage and spurious feature entanglement in visual prompts for single-domain generalization, we propose a dual-branch adapter that disentangles causal-spurious features while achieving domain adaptation via high-frequency feature extraction. Cauvis achieves state-of-the-art performance with 15.9-31.4% gains over existing domain generalization methods on SDGOD datasets, while exhibiting significant robustness advantages in complex interference environments.

Related papers

Boosting Domain Generalized and Adaptive Detection with Diffusion Models: Fitness, Generalization, and Transferability [0.0]
Detectors often suffer from performance drop due to domain gap between training and testing data.<n>Recent methods explore diffusion models applied to domain generalization (DG) and adaptation (DA) tasks.<n>We propose to tackle these problems by extracting intermediate features from a single-step diffusion process.
arXiv Detail & Related papers (2025-06-26T06:42:23Z)
Object Style Diffusion for Generalized Object Detection in Urban Scene [69.04189353993907]
We introduce a novel single-domain object detection generalization method, named GoDiff.<n>By integrating pseudo-target domain data with source domain data, we diversify the training dataset.<n> Experimental results demonstrate that our method not only enhances the generalization ability of existing detectors but also functions as a plug-and-play enhancement for other single-domain generalization methods.
arXiv Detail & Related papers (2024-12-18T13:03:00Z)
Boundless Across Domains: A New Paradigm of Adaptive Feature and Cross-Attention for Domain Generalization in Medical Image Segmentation [1.93061220186624]
Domain-invariant representation learning is a powerful method for domain generalization. Previous approaches face challenges such as high computational demands, training instability, and limited effectiveness with high-dimensional data. We propose an Adaptive Feature Blending (AFB) method that generates out-of-distribution samples while exploring the in-distribution space.
arXiv Detail & Related papers (2024-11-22T12:06:24Z)
Unified Domain Adaptive Semantic Segmentation [105.05235403072021]
Unsupervised Adaptive Domain Semantic (UDA-SS) aims to transfer the supervision from a labeled source domain to an unlabeled target domain.<n>We propose a Quad-directional Mixup (QuadMix) method, characterized by tackling distinct point attributes and feature inconsistencies.<n>Our method outperforms the state-of-the-art works by large margins on four challenging UDA-SS benchmarks.
arXiv Detail & Related papers (2023-11-22T09:18:49Z)
CLIP the Gap: A Single Domain Generalization Approach for Object Detection [60.20931827772482]
Single Domain Generalization tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain. We propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts. We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss.
arXiv Detail & Related papers (2023-01-13T12:01:18Z)
Domain Generalisation for Object Detection under Covariate and Concept Shift [10.32461766065764]
Domain generalisation aims to promote the learning of domain-invariant features while suppressing domain-specific features. An approach to domain generalisation for object detection is proposed, the first such approach applicable to any object detection architecture.
arXiv Detail & Related papers (2022-03-10T11:14:18Z)
Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement [79.2994130944482]
We design a Domain Disentanglement Faster-RCNN (DDF) to eliminate the source-specific information in the features for detection task learning. Our DDF method facilitates the feature disentanglement at the global and local stages, with a Global Triplet Disentanglement (GTD) module and an Instance Similarity Disentanglement (ISD) module. By outperforming state-of-the-art methods on four benchmark UDA object detection tasks, our DDF method is demonstrated to be effective with wide applicability.
arXiv Detail & Related papers (2022-01-06T05:43:01Z)
Calibrated Feature Decomposition for Generalizable Person Re-Identification [82.64133819313186]
Calibrated Feature Decomposition (CFD) module focuses on improving the generalization capacity for person re-identification. A calibrated-and-standardized Batch normalization (CSBN) is designed to learn calibrated person representation.
arXiv Detail & Related papers (2021-11-27T17:12:43Z)
AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications. We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training. Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.