Domain Adaptation Principal Component Analysis: base linear method for
learning with out-of-distribution data
- URL: http://arxiv.org/abs/2208.13290v1
- Date: Sun, 28 Aug 2022 21:10:56 GMT
- Title: Domain Adaptation Principal Component Analysis: base linear method for
learning with out-of-distribution data
- Authors: Evgeny M Mirkes, Jonathan Bac, Aziz Fouché, Sergey V. Stasenko,
Andrei Zinovyev and Alexander N. Gorban
- Abstract summary: Domain adaptation is a popular paradigm in modern machine learning.
We present a method called Domain Adaptation Principal Component Analysis (DAPCA).
DAPCA finds a linear reduced data representation useful for solving the domain adaptation task.
- Score: 55.41644538483948
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Domain adaptation is a popular paradigm in modern machine learning that
tackles the divergence between a labeled training or validation dataset used to
learn and evaluate a classifier (the source domain) and a potentially large
unlabeled dataset on which the model is deployed (the target domain). The task
is to find a common representation of the source and target datasets in which
the source data remain informative for training while the divergence between
source and target is minimized. The most popular solutions for domain
adaptation are currently based on training neural networks that combine
classification and adversarial learning modules, which are data hungry and
usually difficult to train. We present a method called Domain Adaptation
Principal Component Analysis (DAPCA) that finds a linear reduced data
representation useful for solving the domain adaptation task. DAPCA is based on
introducing positive and negative weights between pairs of data points and
generalizes the supervised extension of principal component analysis. DAPCA is
an iterative algorithm in which each iteration solves a simple quadratic
optimization problem. Convergence of the algorithm is guaranteed, and the
number of iterations is small in practice. We validate the algorithm on
previously proposed domain adaptation benchmarks and also show the benefit of
using DAPCA in the analysis of single-cell omics datasets in biomedical
applications. Overall, DAPCA can serve as a useful preprocessing step in many
machine learning applications, producing reduced dataset representations that
account for possible divergence between the source and target domains.
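The abstract pins down the shape of the algorithm (pairwise positive and negative weights, one quadratic problem per iteration, iterative re-weighting), so a minimal sketch helps make it concrete. The sketch below is our reading of that description, not the authors' implementation: the weight scheme (±1 between source pairs, k-nearest-neighbor attraction between projected target and source points), the parameters `k` and `beta`, and the fixed iteration count are all assumptions.

```python
import numpy as np

def laplacian_form(X, W):
    # Identity: sum_ij W_ij (x_i - x_j)(x_i - x_j)^T == 2 * X^T (D - W) X,
    # where D = diag(row sums of W).
    L = np.diag(W.sum(axis=1)) - W
    return 2.0 * X.T @ L @ X

def dapca_sketch(Xs, ys, Xt, n_components=2, k=5, beta=1.0, n_iter=10):
    """Project onto top eigenvectors of Q = sum_ij W_ij (x_i - x_j)(x_i - x_j)^T.
    Under this maximization convention, W_ij > 0 repels a pair of points in the
    projection and W_ij < 0 attracts it."""
    X = np.vstack([Xs, Xt])
    X = X - X.mean(axis=0)
    ns, nt = len(Xs), len(Xt)
    W = np.zeros((ns + nt, ns + nt))
    same = ys[:, None] == ys[None, :]
    # Attract same-label source pairs, repel different-label source pairs.
    W[:ns, :ns] = np.where(same, -1.0, 1.0)
    for _ in range(n_iter):  # the paper proves convergence; we fix the count for brevity
        Q = laplacian_form(X, W)
        _, vecs = np.linalg.eigh(Q)
        V = vecs[:, -n_components:]  # top eigenvectors solve the quadratic problem
        Z = X @ V
        # Re-estimate target-source weights: attract each projected target point
        # to its k nearest projected source points (our assumption).
        D2 = ((Z[ns:, None, :] - Z[None, :ns, :]) ** 2).sum(-1)
        W[ns:, :ns] = 0.0
        rows = np.repeat(np.arange(nt), k)
        cols = np.argsort(D2, axis=1)[:, :k].ravel()
        W[ns + rows, cols] = -beta
        W[:ns, ns:] = W[ns:, :ns].T  # keep the weight matrix symmetric
    return V
```

A typical call would be `V = dapca_sketch(Xs, ys, Xt)` followed by projecting both domains with `Xs @ V` and `Xt @ V` before training any downstream classifier.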
Related papers
- Unsupervised domain adaptation with non-stochastic missing data [0.6608945629704323]
We consider unsupervised domain adaptation (UDA) for classification problems in the presence of missing data in the unlabeled target domain.
Imputation is performed in a domain-invariant latent space and leverages indirect supervision from a complete source domain.
We show the benefits of jointly performing adaptation, classification, and imputation.
(arXiv 2021-09-16)
- Exploring Data Aggregation and Transformations to Generalize across Visual Domains [0.0]
This thesis contributes to research on Domain Generalization (DG), Domain Adaptation (DA) and their variations.
We propose new frameworks for Domain Generalization and Domain Adaptation which make use of feature aggregation strategies and visual transformations.
We show how our proposed solutions outperform competitive state-of-the-art approaches in established DG and DA benchmarks.
(arXiv 2021-08-20)
- A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
(arXiv 2020-09-01)
- Adaptive Risk Minimization: Learning to Adapt to Domain Shift [109.87561509436016]
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
In this work, we consider the problem setting of domain generalization, where the training data are structured into domains and there may be multiple test-time shifts.
We introduce the framework of adaptive risk minimization (ARM), in which models are directly optimized for effective adaptation to shift by learning to adapt on the training domains.
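The ARM loop described here, optimizing the risk measured after the model adapts to an unlabeled batch, can be sketched compactly. The context-network form below is one instantiation from our reading of the summary; the module sizes and the use of a batch-mean context vector are assumptions, not necessarily the paper's architecture.

```python
import torch
import torch.nn as nn

class ARMModel(nn.Module):
    def __init__(self, d_in, d_ctx, n_classes):
        super().__init__()
        self.context = nn.Sequential(nn.Linear(d_in, d_ctx), nn.ReLU())
        self.classifier = nn.Linear(d_in + d_ctx, n_classes)

    def forward(self, x):
        # Adaptation step: summarize the (unlabeled) batch into one context
        # vector and condition every prediction on it.
        ctx = self.context(x).mean(dim=0, keepdim=True).expand(len(x), -1)
        return self.classifier(torch.cat([x, ctx], dim=1))

def train_arm(model, domain_batches, epochs=10, lr=1e-3):
    """domain_batches: list of (X, y) tensor pairs, each drawn from one domain."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for X, y in domain_batches:      # every batch comes from a single domain
            loss = loss_fn(model(X), y)  # risk measured *after* batch-level adaptation
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```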
(arXiv 2020-07-06)
- Sequential Domain Adaptation through Elastic Weight Consolidation for Sentiment Analysis [3.1473798197405944]
We propose a model-independent framework, Sequential Domain Adaptation (SDA).
Our experiments show that the proposed framework enables simple architectures such as CNNs to outperform complex state-of-the-art models in domain adaptation for sentiment analysis (SA).
In addition, we observe that an anti-curriculum ordering of source domains (hardest first) yields the best performance.
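Elastic weight consolidation (EWC), named in this entry's title, penalizes moving parameters that earlier domains marked as important. The sequential loop below uses the standard diagonal-Fisher EWC penalty, but applying it per source domain this way is our illustration, not necessarily the paper's exact SDA procedure; the helper names and the value of `lam` are ours.

```python
import torch
import torch.nn as nn

def fisher_diagonal(model, X, y, loss_fn):
    """Diagonal Fisher estimate: squared loss gradients at the current optimum."""
    model.zero_grad()
    loss_fn(model(X), y).backward()
    return {n: p.grad.detach() ** 2 for n, p in model.named_parameters()}

def ewc_penalty(model, anchors, lam=100.0):
    """anchors: list of (fisher_dict, params_dict) saved after earlier domains."""
    penalty = 0.0
    for fisher, old in anchors:
        for n, p in model.named_parameters():
            penalty = penalty + (fisher[n] * (p - old[n]) ** 2).sum()
    return lam * penalty

def train_sequentially(model, domains, epochs=50, lr=1e-2):
    """domains: (X, y) pairs in training order (e.g., hardest first, per the entry)."""
    loss_fn, anchors = nn.CrossEntropyLoss(), []
    for X, y in domains:
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            (loss_fn(model(X), y) + ewc_penalty(model, anchors)).backward()
            opt.step()
        # Anchor the parameters learned on this domain before moving on.
        anchors.append((fisher_diagonal(model, X, y, loss_fn),
                        {n: p.detach().clone() for n, p in model.named_parameters()}))
    return model
```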
(arXiv 2020-07-02)
- Supervised Domain Adaptation using Graph Embedding [86.3361797111839]
Domain adaptation methods assume that the distributions of the two domains are shifted and attempt to realign them.
We propose a generic framework based on graph embedding.
We show that the proposed approach leads to a powerful Domain Adaptation framework.
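The entry names graph embedding as its framework. One standard way to instantiate graph-embedding dimensionality reduction is sketched below under our own assumptions (an intrinsic graph connecting same-class pairs, a penalty graph connecting different-class pairs, and a generalized eigenproblem between the two Laplacian forms); the paper's specific graph construction may differ.

```python
import numpy as np
from scipy.linalg import eigh

def graph_embedding_da(X, y, n_components=2):
    """X: stacked labeled source and target features; y: class labels."""
    same = (y[:, None] == y[None, :]).astype(float)
    W, Wp = same, 1.0 - same
    L = np.diag(W.sum(axis=1)) - W     # intrinsic Laplacian: within-class compactness
    Lp = np.diag(Wp.sum(axis=1)) - Wp  # penalty Laplacian: between-class separation
    A, B = X.T @ L @ X, X.T @ Lp @ X
    # Minimize within-class scatter relative to between-class scatter:
    # smallest generalized eigenvectors of A v = lambda B v (B regularized).
    _, vecs = eigh(A, B + 1e-6 * np.eye(B.shape[1]))
    return vecs[:, :n_components]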
(arXiv 2020-03-09)
- Towards Fair Cross-Domain Adaptation via Generative Learning [50.76694500782927]
Domain Adaptation (DA) aims to adapt a model trained on a well-labeled source domain to an unlabeled target domain drawn from a different distribution.
We develop a novel Generative Few-shot Cross-domain Adaptation (GFCA) algorithm for fair cross-domain classification.
(arXiv 2020-03-04)
- Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation [102.67010690592011]
Unsupervised domain adaptation (UDA) aims to leverage the knowledge learned from a labeled source dataset to solve similar tasks in a new unlabeled domain.
Prior UDA methods typically require access to the source data when learning to adapt the model.
This work tackles a practical setting where only a trained source model is available, and asks how such a model can be used effectively, without the source data, to solve UDA problems.
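The summary fixes the setting (a trained source model, no source data) but not the adaptation rule. The sketch below uses one common instantiation as an assumption, not necessarily this paper's method: freeze the source classifier head and fine-tune the feature extractor so that target predictions become confident (entropy minimization).

```python
import torch
import torch.nn as nn

def adapt_source_free(feat, head, Xt, steps=100, lr=1e-3):
    """feat/head: a trained source model split into feature extractor + classifier."""
    for p in head.parameters():  # keep the source hypothesis (the head) fixed
        p.requires_grad_(False)
    opt = torch.optim.Adam(feat.parameters(), lr=lr)
    for _ in range(steps):
        probs = torch.softmax(head(feat(Xt)), dim=1)
        # Push target predictions toward confident (low-entropy) outputs.
        entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
        opt.zero_grad()
        entropy.backward()
        opt.step()
    return feat
```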
(arXiv 2020-02-20)