Partial Domain Adaptation via Importance Sampling-based Shift Correction
- URL: http://arxiv.org/abs/2507.20191v1
- Date: Sun, 27 Jul 2025 09:19:07 GMT
- Title: Partial Domain Adaptation via Importance Sampling-based Shift Correction
- Authors: Cheng-Jun Guo, Chuan-Xian Ren, You-Wei Luo, Xiao-Lin Xu, Hong Yan
- Abstract summary: Partial domain adaptation (PDA) is a challenging task in real-world machine learning scenarios. We propose a novel importance sampling-based shift correction (IS$^2$C) method, where new labeled data are sampled from a built sampling domain. We provide theoretical guarantees for IS$^2$C by proving that the generalization error can be sufficiently dominated by IS$^2$C.
- Score: 22.133232771742527
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Partial domain adaptation (PDA) is a challenging task in real-world machine learning scenarios. It aims to transfer knowledge from a labeled source domain to a related unlabeled target domain, where the support set of the source label distribution subsumes the target one. Previous PDA works corrected the label distribution shift by weighting samples in the source domain. However, simple reweighting cannot exploit the latent structure or make sufficient use of the labeled data, so models are prone to overfitting on the source domain. In this work, we propose a novel importance sampling-based shift correction (IS$^2$C) method, where new labeled data are sampled from a constructed sampling domain, whose label distribution is designed to match that of the target domain, to characterize the latent structure and enhance the generalization ability of the model. We provide theoretical guarantees for IS$^2$C by proving that the generalization error can be sufficiently dominated under IS$^2$C. In particular, by implementing sampling with a mixture distribution, the extent of shift between the source and sampling domains can be connected to the generalization error, which provides an interpretable way to build IS$^2$C. To improve knowledge transfer, an optimal transport-based independence criterion is proposed for conditional distribution alignment, where the computation of the criterion can be adjusted to reduce the complexity from $\mathcal{O}(n^3)$ to $\mathcal{O}(n^2)$ in realistic PDA scenarios. Extensive experiments on PDA benchmarks validate the theoretical results and demonstrate the effectiveness of our IS$^2$C over existing methods.
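The abstract describes the sampling domain only at the level of its label distribution. As a minimal sketch of the importance-sampling idea, assuming the target label distribution has already been estimated (e.g., by averaging classifier soft predictions on unlabeled target data), one can resample the labeled source set so that the resulting "sampling domain" follows the target's class proportions. All function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def estimate_target_label_dist(target_probs):
    """Estimate the target label distribution by averaging the classifier's
    soft predictions over unlabeled target samples (illustrative estimator;
    the paper's construction may differ)."""
    return target_probs.mean(axis=0)

def build_sampling_domain(X_src, y_src, target_label_dist, n_samples, rng):
    """Importance-sample labeled source data so that the label distribution
    of the resulting 'sampling domain' matches the (estimated) target one."""
    n_classes = len(target_label_dist)
    counts = np.bincount(y_src, minlength=n_classes).astype(float)
    src_label_dist = counts / counts.sum()
    # Per-class importance weight w(y) = p_t(y) / p_s(y); classes absent
    # from the target (the PDA setting) receive weight ~0.
    class_w = target_label_dist / np.maximum(src_label_dist, 1e-12)
    sample_w = class_w[y_src]
    sample_w /= sample_w.sum()
    idx = rng.choice(len(y_src), size=n_samples, replace=True, p=sample_w)
    return X_src[idx], y_src[idx]

rng = np.random.default_rng(0)
X_src = rng.normal(size=(1000, 16))
y_src = rng.integers(0, 10, size=1000)        # 10 source classes
t_dist = np.zeros(10); t_dist[:5] = 0.2       # target covers only 5 of them
X_new, y_new = build_sampling_domain(X_src, y_src, t_dist, 500, rng)
print(np.bincount(y_new, minlength=10) / 500) # ~[0.2]*5 + [0]*5
```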
Related papers
- Soft-Masked Semi-Dual Optimal Transport for Partial Domain Adaptation [16.213569477689916]
Partial domain adaptation (PDA) is a general and practical scenario in which the target label space is a subset of the source one. The challenges of PDA arise not only from domain shift but also from the non-identical label spaces of the domains. In this paper, a Soft-masked Semi-dual Optimal Transport (SSOT) method is proposed to deal with the PDA problem.
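SSOT's semi-dual solver and soft-masking scheme are not reproduced here. As a loose illustration of optimal transport under non-identical label spaces, the sketch below runs standard entropic (Sinkhorn) OT after down-weighting the source marginal by per-class weights, so that source-only classes carry little mass; the weighting scheme and all names are assumptions for illustration, not SSOT's formulation.

```python
import numpy as np

def sinkhorn_plan(C, a, b, eps=0.1, n_iter=500):
    """Standard entropic-OT (Sinkhorn) transport plan for cost matrix C
    with source marginal a and target marginal b."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u + 1e-16)
        u = a / (K @ v + 1e-16)
    return u[:, None] * K * v[None, :]

def masked_source_marginal(y_src, class_weights):
    """Down-weight source samples from classes believed absent in the
    target (a soft mask on the source marginal; illustrative only)."""
    a = class_weights[y_src].astype(float)
    return a / a.sum()

rng = np.random.default_rng(0)
Xs, ys = rng.normal(size=(60, 8)), rng.integers(0, 6, size=60)
Xt = rng.normal(size=(40, 8))
C = ((Xs[:, None, :] - Xt[None, :, :])**2).sum(-1)  # squared-distance cost
w = np.array([1, 1, 1, 0.05, 0.05, 0.05])           # classes 3-5 softly masked
P = sinkhorn_plan(C, masked_source_marginal(ys, w), np.full(40, 1 / 40))
```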
arXiv Detail & Related papers (2025-05-03T03:20:17Z)
- Out-Of-Domain Unlabeled Data Improves Generalization [0.7589678255312519]
We propose a novel framework for incorporating unlabeled data into semi-supervised classification problems.
We show that unlabeled samples can be harnessed to narrow the generalization gap.
We validate our claims through experiments conducted on a variety of synthetic and real-world datasets.
arXiv Detail & Related papers (2023-09-29T02:00:03Z)
- CAusal and collaborative proxy-tasKs lEarning for Semi-Supervised Domain Adaptation [20.589323508870592]
Semi-supervised domain adaptation (SSDA) adapts a learner to a new domain by effectively utilizing source domain data and a few labeled target samples.
We show that the proposed model significantly outperforms SOTA methods in terms of effectiveness and generalisability on SSDA datasets.
arXiv Detail & Related papers (2023-03-30T16:48:28Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where each group is handled with tailored learning objectives.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
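DaC's memory bank and training pipeline are not reproduced here; the sketch below only shows a standard RBF-kernel estimate of the squared Maximum Mean Discrepancy between two feature batches, the quantity such an alignment loss penalizes. The feature arrays are placeholders.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=1.0):
    """Biased estimate of the squared MMD between samples X and Y."""
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())

rng = np.random.default_rng(0)
f_sourcelike = rng.normal(0.0, 1.0, size=(128, 64))  # e.g. memory-bank features
f_targetspec = rng.normal(0.5, 1.0, size=(128, 64))
print(mmd2(f_sourcelike, f_targetspec))              # > 0 under mismatch
```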
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- Mitigating Both Covariate and Conditional Shift for Domain Generalization [14.91361835243516]
Domain generalization (DG) aims to learn a model on several source domains, hoping that the model can generalize well to unseen target domains.
In this paper, a novel DG method is proposed to deal with the distribution shift via Visual Alignment and Uncertainty-guided belief Ensemble (VAUE).
arXiv Detail & Related papers (2022-09-17T05:13:56Z)
- Source-Free Domain Adaptation via Distribution Estimation [106.48277721860036]
Domain adaptation aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain whose data distribution is different.
Recently, Source-Free Domain Adaptation (SFDA) has drawn much attention; it tries to tackle the domain adaptation problem without using source data.
In this work, we propose a novel framework called SFDA-DE to address SFDA task via source Distribution Estimation.
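The summary does not detail SFDA-DE's estimator. As a hedged sketch of the general distribution-estimation idea, one can fit a per-class Gaussian to pseudo-labeled target features and draw surrogate "source-like" samples from it for alignment; the functions below are illustrative assumptions, not SFDA-DE's actual procedure.

```python
import numpy as np

def estimate_class_gaussians(feats, pseudo_labels, n_classes, reg=1e-3):
    """Fit one Gaussian N(mu_c, Sigma_c) per pseudo-labeled class
    (simplified stand-in; the paper's estimator is more involved)."""
    d = feats.shape[1]
    params = {}
    for c in range(n_classes):
        Fc = feats[pseudo_labels == c]
        if len(Fc) < 2:   # too few samples to estimate a covariance
            continue
        mu = Fc.mean(axis=0)
        Sigma = np.cov(Fc, rowvar=False) + reg * np.eye(d)
        params[c] = (mu, Sigma)
    return params

def sample_surrogate_features(params, n_per_class, rng):
    """Draw surrogate 'source-like' features from the estimated Gaussians,
    to align against in place of the unavailable source data."""
    Xs, ys = [], []
    for c, (mu, Sigma) in params.items():
        Xs.append(rng.multivariate_normal(mu, Sigma, size=n_per_class))
        ys.append(np.full(n_per_class, c))
    return np.vstack(Xs), np.concatenate(ys)
```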
arXiv Detail & Related papers (2022-04-24T12:22:19Z)
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distribution over the source and the target domains.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
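A minimal sketch of such a minibatch KL estimate, assuming the probabilistic encoder outputs per-sample Gaussian means and log-variances and approximating each marginal by the minibatch mixture of those Gaussians; the details are assumptions and may differ from the paper's estimator.

```python
import numpy as np

def logsumexp_rows(A):
    """Numerically stable log-sum-exp along each row."""
    m = A.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(A - m).sum(axis=1, keepdims=True))).ravel()

def log_gauss(z, mu, log_var):
    """Pairwise diagonal-Gaussian log-densities: points z (n, d) under
    components (mu, log_var) of shape (m, d) -> matrix (n, m)."""
    diff2 = (z[:, None, :] - mu[None, :, :]) ** 2
    return -0.5 * (diff2 / np.exp(log_var)[None] + log_var[None]
                   + np.log(2 * np.pi)).sum(-1)

def minibatch_kl(mu_s, lv_s, mu_t, lv_t, rng):
    """Monte Carlo estimate of KL(p_s(z) || p_t(z)), with each marginal
    approximated by the minibatch mixture of per-sample Gaussians from a
    probabilistic encoder (sketch only)."""
    z = mu_s + np.exp(0.5 * lv_s) * rng.standard_normal(mu_s.shape)
    log_ps = logsumexp_rows(log_gauss(z, mu_s, lv_s)) - np.log(len(mu_s))
    log_pt = logsumexp_rows(log_gauss(z, mu_t, lv_t)) - np.log(len(mu_t))
    return float((log_ps - log_pt).mean())
```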
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
- On Universal Black-Box Domain Adaptation [53.7611757926922]
We study an arguably least restrictive setting of domain adaptation, in the sense of practical deployment.
Only the interface of the source model is available to the target domain, and the label-space relations between the two domains are allowed to be different and unknown.
We propose to unify existing techniques into a self-training framework, regularized by the consistency of predictions in local neighborhoods of target samples.
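As an illustrative sketch of neighborhood-consistency regularization (not necessarily the paper's exact objective), the function below penalizes disagreement between each target sample's predicted distribution and those of its nearest feature-space neighbors.

```python
import numpy as np

def neighborhood_consistency(features, probs, k=4):
    """Cross-entropy between each target sample's predicted distribution
    and those of its k nearest neighbors in feature space (illustrative
    consistency regularizer)."""
    sq = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(sq, np.inf)        # a sample is not its own neighbor
    nn = np.argsort(sq, axis=1)[:, :k]  # indices of the k nearest neighbors
    ce = -(probs[nn] * np.log(probs[:, None, :] + 1e-12)).sum(-1)
    return float(ce.mean())
```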
arXiv Detail & Related papers (2021-04-10T02:21:09Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
- Sparsely-Labeled Source Assisted Domain Adaptation [64.75698236688729]
This paper proposes a novel Sparsely-Labeled Source Assisted Domain Adaptation (SLSA-DA) algorithm.
Due to the label scarcity problem, the projected clustering is conducted on both the source and target domains.
arXiv Detail & Related papers (2020-05-08T15:37:35Z)