Beyond $\mathcal{H}$-Divergence: Domain Adaptation Theory With
Jensen-Shannon Divergence
- URL: http://arxiv.org/abs/2007.15567v1
- Date: Thu, 30 Jul 2020 16:19:59 GMT
- Title: Beyond $\mathcal{H}$-Divergence: Domain Adaptation Theory With
Jensen-Shannon Divergence
- Authors: Changjian Shui, Qi Chen, Jun Wen, Fan Zhou, Christian Gagn\'e, Boyu
Wang
- Abstract summary: We reveal the incoherence between the widely-adopted empirical domain adversarial training and its generally-assumed theoretical counterpart based on $\mathcal{H}$-divergence.
We establish a new theoretical framework by directly proving the upper and lower target risk bounds based on joint distributional Jensen-Shannon divergence.
- Score: 21.295136514836788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We reveal the incoherence between the widely-adopted empirical domain
adversarial training and its generally-assumed theoretical counterpart based on
$\mathcal{H}$-divergence. Concretely, we find that $\mathcal{H}$-divergence is
not equivalent to Jensen-Shannon divergence, the optimization objective in
domain adversarial training. To this end, we establish a new theoretical
framework by directly proving the upper and lower target risk bounds based on
joint distributional Jensen-Shannon divergence. We further derive
bi-directional upper bounds for marginal and conditional shifts. Our framework
exhibits inherent flexibility for different transfer learning problems and is
usable in various scenarios where $\mathcal{H}$-divergence-based theory
fails to adapt. From an algorithmic perspective, our theory enables a generic
guideline unifying principles of semantic conditional matching, feature
marginal matching, and label marginal shift correction. We employ algorithms
for each principle and empirically validate the benefits of our framework on
real datasets.
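
The key observation above is that the objective actually minimized in domain adversarial training corresponds to the Jensen-Shannon divergence rather than the $\mathcal{H}$-divergence. A standard identity behind this correspondence is that the Bayes-optimal GAN-style domain discriminator attains an objective value of $2\,\mathrm{JSD}(P, Q) - 2\log 2$. The snippet below is a minimal NumPy sketch on toy discrete distributions (names and values are illustrative, not from the paper) that checks this identity numerically.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (in nats) between two discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))  # assumes strictly positive entries
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy source/target distributions over three feature bins (illustrative only).
p = np.array([0.5, 0.3, 0.2])  # source
q = np.array([0.2, 0.3, 0.5])  # target

jsd = js_divergence(p, q)

# The Bayes-optimal domain discriminator is d(x) = p(x) / (p(x) + q(x)).
# Its value of the adversarial objective E_p[log d] + E_q[log(1 - d)]
# equals 2 * JSD(p, q) - 2 * log(2), which is what adversarial training probes.
d = p / (p + q)
disc_value = float(np.sum(p * np.log(d)) + np.sum(q * np.log(1.0 - d)))

print(f"JSD(p, q)           = {jsd:.6f}")
print(f"discriminator value = {disc_value:.6f}")
print(f"2*JSD - 2*log(2)    = {2 * jsd - 2 * np.log(2):.6f}")  # matches disc_value
```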
Related papers
- A Fresh Look at Generalized Category Discovery through Non-negative Matrix Factorization [83.12938977698988]
Generalized Category Discovery (GCD) aims to classify both base and novel images using labeled base data.
Current approaches inadequately address the intrinsic optimization of the co-occurrence matrix $\bar{A}$ based on cosine similarity.
We propose a Non-Negative Generalized Category Discovery (NN-GCD) framework to address these deficiencies.
arXiv Detail & Related papers (2024-10-29T07:24:11Z) - Domain Agnostic Conditional Invariant Predictions for Domain Generalization [20.964740750976667]
We propose a Discriminant Risk Minimization (DRM) theory and the corresponding algorithm to capture the invariant features without domain labels.
In DRM theory, we prove that reducing the discrepancy of the prediction distribution between the overall source domain and any of its subsets can contribute to obtaining invariant features.
We evaluate our algorithm against various domain generalization methods on multiple real-world datasets.
arXiv Detail & Related papers (2024-06-09T02:38:52Z) - Domain Adaptation with Cauchy-Schwarz Divergence [39.36943882475589]
We introduce the Cauchy-Schwarz (CS) divergence to the problem of unsupervised domain adaptation (UDA).
The CS divergence offers a theoretically tighter generalization error bound than the popular Kullback-Leibler divergence.
We show how the CS divergence can be conveniently used in both distance metric- and adversarial training-based UDA frameworks; a toy numerical sketch of the CS divergence itself appears after this list.
arXiv Detail & Related papers (2024-05-30T12:01:12Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Theoretical Guarantees for Domain Adaptation with Hierarchical Optimal
Transport [0.0]
Domain adaptation arises as an important problem in statistical learning theory.
Recent advances show that the success of domain adaptation algorithms heavily relies on their ability to minimize the divergence between the probability distributions of the source and target domains.
We propose a new theoretical framework for domain adaptation through hierarchical optimal transport.
arXiv Detail & Related papers (2022-10-24T15:34:09Z) - Domain-Specific Risk Minimization for Out-of-Distribution Generalization [104.17683265084757]
We first establish a generalization bound that explicitly considers the adaptivity gap.
We propose effective gap estimation methods for guiding the selection of a better hypothesis for the target.
Another method minimizes the gap directly by adapting model parameters using online target samples.
arXiv Detail & Related papers (2022-08-18T06:42:49Z) - A Theory of Label Propagation for Subpopulation Shift [61.408438422417326]
We propose a provably effective framework for domain adaptation based on label propagation.
We obtain end-to-end finite-sample guarantees on the entire algorithm.
We extend our theoretical framework to a more general setting of source-to-target transfer based on a third unlabeled dataset.
arXiv Detail & Related papers (2021-02-22T17:27:47Z) - A Convenient Infinite Dimensional Framework for Generative Adversarial
Learning [4.396860522241306]
We propose an infinite dimensional theoretical framework for generative adversarial learning.
In our framework the Jensen-Shannon divergence between the distribution induced by the generator from the adversarial learning procedure and the data generating distribution converges to zero.
arXiv Detail & Related papers (2020-11-24T13:45:17Z) - Learning Invariant Representations and Risks for Semi-supervised Domain
Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z) - Joint Contrastive Learning for Unsupervised Domain Adaptation [20.799729748233343]
We propose an alternative upper bound on the target error that explicitly considers the joint error to render it more manageable.
We introduce Joint Contrastive Learning (JCL) to find class-level discriminative features, which is essential for minimizing the joint error.
Experiments on two real-world datasets demonstrate that JCL outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2020-06-18T06:25:34Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
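
For the Cauchy-Schwarz divergence entry above, here is a minimal sketch of one common discrete form, $D_{CS}(p, q) = -\log\frac{(\sum_i p_i q_i)^2}{(\sum_i p_i^2)(\sum_i q_i^2)}$. The names and toy distributions are illustrative, and the estimator used in that paper may differ (e.g., kernel-based estimates for continuous features).

```python
import numpy as np

def cs_divergence(p, q):
    """Cauchy-Schwarz divergence between two discrete distributions.

    D_CS(p, q) = -log( (sum_i p_i q_i)^2 / (sum_i p_i^2 * sum_i q_i^2) ).
    Non-negative by the Cauchy-Schwarz inequality; zero iff p == q.
    """
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(-np.log(np.dot(p, q) ** 2 / (np.dot(p, p) * np.dot(q, q))))

# Toy source/target distributions (illustrative only).
source = np.array([0.5, 0.3, 0.2])
target = np.array([0.2, 0.3, 0.5])

print(cs_divergence(source, target))  # > 0 since the distributions differ
print(cs_divergence(source, source))  # 0 (up to floating point) when they coincide
```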
This list is automatically generated from the titles and abstracts of the papers on this site.