Related papers: PCaM: A Progressive Focus Attention-Based Information Fusion Method for Improving Vision Transformer Domain Adaptation

PCaM: A Progressive Focus Attention-Based Information Fusion Method for Improving Vision Transformer Domain Adaptation

URL: http://arxiv.org/abs/2506.17232v1
Date: Tue, 27 May 2025 09:48:29 GMT
Title: PCaM: A Progressive Focus Attention-Based Information Fusion Method for Improving Vision Transformer Domain Adaptation
Authors: Zelin Zang, Fei Wang, Liangyu Li, Jinlin Wu, Chunshui Zhao, Zhen Lei, Baigui Sun,
Abstract summary: Unsupervised Domain Adaptation (UDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain.<n>We propose the Progressive Focus Cross-Attention Mechanism (PCaM)<n>PCaM progressively filters out background information during cross-attention, allowing the model to focus on and fuse discriminative foreground semantics across domains.
Score: 18.973817257766793
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Unsupervised Domain Adaptation (UDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Recent UDA methods based on Vision Transformers (ViTs) have achieved strong performance through attention-based feature alignment. However, we identify a key limitation: foreground object mismatch, where the discrepancy in foreground object size and spatial distribution across domains weakens attention consistency and hampers effective domain alignment. To address this issue, we propose the Progressive Focus Cross-Attention Mechanism (PCaM), which progressively filters out background information during cross-attention, allowing the model to focus on and fuse discriminative foreground semantics across domains. We further introduce an attentional guidance loss that explicitly directs attention toward task-relevant regions, enhancing cross-domain attention consistency. PCaM is lightweight, architecture-agnostic, and easy to integrate into existing ViT-based UDA pipelines. Extensive experiments on Office-Home, DomainNet, VisDA-2017, and remote sensing datasets demonstrate that PCaM significantly improves adaptation performance and achieves new state-of-the-art results, validating the effectiveness of attention-guided foreground fusion for domain adaptation.

Related papers

GrabDAE: An Innovative Framework for Unsupervised Domain Adaptation Utilizing Grab-Mask and Denoise Auto-Encoder [16.244871317281614]
Unsupervised Domain Adaptation (UDA) aims to adapt a model trained on a labeled source domain to an unlabeled target domain by addressing the domain shift. We introduce GrabDAE, an innovative UDA framework designed to tackle domain shift in visual classification tasks.
arXiv Detail & Related papers (2024-10-10T15:19:57Z)
Overcoming Negative Transfer by Online Selection: Distant Domain Adaptation for Fault Diagnosis [42.85741244467877]
The term distant domain adaptation problem' describes the challenge of adapting from a labeled source domain to a significantly disparate unlabeled target domain. This problem exhibits the risk of negative transfer, where extraneous knowledge from the source domain adversely affects the target domain performance. In response to this challenge, we propose a novel Online Selective Adversarial Alignment (OSAA) approach.
arXiv Detail & Related papers (2024-05-25T07:17:47Z)
IIDM: Inter and Intra-domain Mixing for Semi-supervised Domain Adaptation in Semantic Segmentation [46.6002506426648]
Unsupervised domain adaptation (UDA) is the dominant approach to solve this problem. We propose semi-supervised domain adaptation (SSDA) to overcome this limitation. We propose a novel framework that incorporates both Inter and Intra Domain Mixing (IIDM), where inter-domain mixing mitigates the source-target domain gap and intra-domain mixing enriches the available target domain information.
arXiv Detail & Related papers (2023-08-30T08:44:21Z)
AVATAR: Adversarial self-superVised domain Adaptation network for TARget domain [11.764601181046496]
This paper presents an unsupervised domain adaptation (UDA) method for predicting unlabeled target domain data. We propose the Adversarial self-superVised domain Adaptation network for the TARget domain (AVATAR) algorithm. Our proposed model significantly outperforms state-of-the-art methods on three UDA benchmarks.
arXiv Detail & Related papers (2023-04-28T20:31:56Z)
Exploring Consistency in Cross-Domain Transformer for Domain Adaptive Semantic Segmentation [51.10389829070684]
Domain gap can cause discrepancies in self-attention. Due to this gap, the transformer attends to spurious regions or pixels, which deteriorates accuracy on the target domain. We propose adaptation on attention maps with cross-domain attention layers.
arXiv Detail & Related papers (2022-11-27T02:40:33Z)
Joint Attention-Driven Domain Fusion and Noise-Tolerant Learning for Multi-Source Domain Adaptation [2.734665397040629]
Multi-source Unsupervised Domain Adaptation transfers knowledge from multiple source domains with labeled data to an unlabeled target domain. The distribution discrepancy between different domains and the noisy pseudo-labels in the target domain both lead to performance bottlenecks. We propose an approach that integrates Attention-driven Domain fusion and Noise-Tolerant learning (ADNT) to address the two issues mentioned above.
arXiv Detail & Related papers (2022-08-05T01:08:41Z)
Domain-Agnostic Prior for Transfer Semantic Segmentation [197.9378107222422]
Unsupervised domain adaptation (UDA) is an important topic in the computer vision community. We present a mechanism that regularizes cross-domain representation learning with a domain-agnostic prior (DAP) Our research reveals that UDA benefits much from better proxies, possibly from other data modalities.
arXiv Detail & Related papers (2022-04-06T09:13:25Z)
Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement [79.2994130944482]
We design a Domain Disentanglement Faster-RCNN (DDF) to eliminate the source-specific information in the features for detection task learning. Our DDF method facilitates the feature disentanglement at the global and local stages, with a Global Triplet Disentanglement (GTD) module and an Instance Similarity Disentanglement (ISD) module. By outperforming state-of-the-art methods on four benchmark UDA object detection tasks, our DDF method is demonstrated to be effective with wide applicability.
arXiv Detail & Related papers (2022-01-06T05:43:01Z)
Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with wide ranges of application potentials. We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field. Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z)
AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications. We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training. Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z)
Domain Conditioned Adaptation Network [90.63261870610211]
We propose a Domain Conditioned Adaptation Network (DCAN) to excite distinct convolutional channels with a domain conditioned channel attention mechanism. This is the first work to explore the domain-wise convolutional channel activation for deep DA networks.
arXiv Detail & Related papers (2020-05-14T04:23:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.