Robust Classification under Class-Dependent Domain Shift
- URL: http://arxiv.org/abs/2007.05335v1
- Date: Fri, 10 Jul 2020 12:26:57 GMT
- Title: Robust Classification under Class-Dependent Domain Shift
- Authors: Tigran Galstyan, Hrant Khachatrian, Greg Ver Steeg, Aram Galstyan
- Abstract summary: In this paper we explore a special type of dataset shift which we call class-dependent domain shift.
It is characterized by the following features: the input data causally depends on the label; the shift in the data is fully explained by a known variable; the variable which controls the shift can depend on the label; and there is no shift in the label distribution.
- Score: 29.54336432319199
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Investigation of machine learning algorithms robust to changes between the
training and test distributions is an active area of research. In this paper we
explore a special type of dataset shift which we call class-dependent domain
shift. It is characterized by the following features: the input data causally
depends on the label; the shift in the data is fully explained by a known
variable; the variable which controls the shift can depend on the label; and
there is no shift in the label distribution. We define a simple optimization problem
with an information theoretic constraint and attempt to solve it with neural
networks. Experiments on a toy dataset demonstrate the proposed method is able
to learn robust classifiers which generalize well to unseen domains.
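The abstract does not spell out the objective, but a common way to implement such an information-theoretic constraint is to penalize how much the learned representation reveals about the shift-controlling variable once the label is known. The sketch below is only an illustration of that idea, with assumed names (`Encoder`, `dom_head`, `lambda_pen`) and an adversarial formulation that is not necessarily the paper's actual method: an auxiliary domain head predicts the domain from the representation and the label, and the encoder is trained to make that prediction hard while still fitting the label.

```python
# Hypothetical sketch: classifier with an adversarial penalty that discourages
# the representation z from carrying domain information beyond the label.
# This is NOT the paper's exact objective, just one common instantiation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, x_dim, z_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
    def forward(self, x):
        return self.net(x)

x_dim, n_classes, n_domains = 10, 2, 3
enc = Encoder(x_dim)
clf_head = nn.Linear(32, n_classes)                 # predicts the label from z
dom_head = nn.Linear(32 + n_classes, n_domains)     # predicts the domain from (z, y)

opt_main = torch.optim.Adam(list(enc.parameters()) + list(clf_head.parameters()), lr=1e-3)
opt_dom = torch.optim.Adam(dom_head.parameters(), lr=1e-3)
lambda_pen = 1.0  # strength of the information-theoretic penalty (assumed value)

def train_step(x, y, d):
    y_onehot = F.one_hot(y, n_classes).float()

    # 1) Train the domain head to predict d from (z, y): its loss reflects how
    #    much domain information remains in z once the label is given.
    z = enc(x).detach()
    dom_loss = F.cross_entropy(dom_head(torch.cat([z, y_onehot], dim=1)), d)
    opt_dom.zero_grad(); dom_loss.backward(); opt_dom.step()

    # 2) Train encoder + classifier: fit the label while making the domain
    #    head's job hard, approximating a constraint like I(z; d | y) <= eps.
    z = enc(x)
    cls_loss = F.cross_entropy(clf_head(z), y)
    dom_logits = dom_head(torch.cat([z, y_onehot], dim=1))
    penalty = -F.cross_entropy(dom_logits, d)  # encoder maximizes the domain-head loss
    loss = cls_loss + lambda_pen * penalty
    opt_main.zero_grad(); loss.backward(); opt_main.step()
    return cls_loss.item(), dom_loss.item()
```

In this adversarial form, driving the domain head toward chance-level accuracy serves as a practical proxy for keeping the conditional mutual information between the representation and the domain, given the label, small.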
Related papers
- Automatic dataset shift identification to support root cause analysis of AI performance drift [13.996602963045387]
Shifts in data distribution can substantially harm the performance of clinical AI models.
We propose the first unsupervised dataset shift identification framework.
We report promising results for the proposed framework on five types of real-world dataset shifts.
arXiv Detail & Related papers (2024-11-12T17:09:20Z)
- Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling [6.771578432805963]
Active learning aims to reduce the amount of labor involved in data labeling by automating the selection of unlabeled samples.
We introduce novel techniques that significantly improve the use of abundant unlabeled data during training.
We demonstrate the superior performance of our approach over the state of the art on various image classification and segmentation benchmark datasets.
arXiv Detail & Related papers (2024-08-23T00:35:07Z)
- Adversarial Learning for Feature Shift Detection and Correction [45.65548560695731]
Feature shifts can occur in many datasets, including in multi-sensor data, where some sensors are malfunctioning, or in structured data, where faulty standardization and data processing pipelines can lead to erroneous features.
In this work, we explore using the principles of adversarial learning, where the information from several discriminators trained to distinguish between two distributions is used to both detect the corrupted features and fix them in order to remove the distribution shift between datasets.
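To make the discriminator idea concrete, here is a deliberately simplified sketch, not the paper's actual algorithm: one discriminator per feature is trained to tell a reference split from a query split, and features whose discriminator scores well above chance are flagged as shifted. The per-feature logistic regression and the 0.6 AUC threshold are illustrative assumptions.

```python
# Illustrative sketch only: flag potentially shifted features by training one
# discriminator per feature to distinguish the reference set from the query set.
# Model choice and threshold are assumptions, not taken from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def detect_shifted_features(X_ref, X_query, auc_threshold=0.6):
    flagged = []
    for j in range(X_ref.shape[1]):
        # Binary task: does this single feature come from reference (0) or query (1)?
        xj = np.concatenate([X_ref[:, j], X_query[:, j]]).reshape(-1, 1)
        origin = np.concatenate([np.zeros(len(X_ref)), np.ones(len(X_query))])
        x_tr, x_te, o_tr, o_te = train_test_split(
            xj, origin, test_size=0.3, random_state=0, stratify=origin)
        disc = LogisticRegression().fit(x_tr, o_tr)
        auc = roc_auc_score(o_te, disc.predict_proba(x_te)[:, 1])
        if auc > auc_threshold:  # well above chance => this feature's distribution differs
            flagged.append(j)
    return flagged
```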
arXiv Detail & Related papers (2023-12-07T18:58:40Z)
- Learning Invariant Molecular Representation in Latent Discrete Space [52.13724532622099]
We propose a new framework for learning molecular representations that exhibit invariance and robustness against distribution shifts.
Our model achieves stronger generalization against state-of-the-art baselines in the presence of various distribution shifts.
arXiv Detail & Related papers (2023-10-22T04:06:44Z)
- Binary Quantification and Dataset Shift: An Experimental Investigation [54.14283123210872]
Quantification is the supervised learning task that consists of training predictors of the class prevalence values of sets of unlabelled data.
The relationship between quantification and other types of dataset shift remains, by and large, unexplored.
We propose a fine-grained taxonomy of types of dataset shift, by establishing protocols for the generation of datasets affected by these types of shift.
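For readers unfamiliar with the task, the snippet below shows two classic quantification baselines, classify-and-count and its adjusted variant. They are included only to make the task definition concrete and are not the protocols evaluated in the paper.

```python
# Minimal sketch of the quantification task: estimate class prevalence in an
# unlabelled set. "Classify and count" and its adjusted variant are classic
# baselines, shown here purely for illustration.
import numpy as np

def classify_and_count(clf, X_unlabelled):
    # Prevalence estimate = fraction of samples the classifier labels positive.
    return clf.predict(X_unlabelled).mean()

def adjusted_classify_and_count(clf, X_unlabelled, tpr, fpr):
    # Correct the raw count with the classifier's true/false positive rates,
    # measured on held-out labelled data: p = (cc - fpr) / (tpr - fpr).
    cc = classify_and_count(clf, X_unlabelled)
    return np.clip((cc - fpr) / (tpr - fpr), 0.0, 1.0)
```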
arXiv Detail & Related papers (2023-10-06T20:11:27Z)
- Adapting to Latent Subgroup Shifts via Concepts and Proxies [82.01141290360562]
We show that the optimal target predictor can be non-parametrically identified with the help of concept and proxy variables available only in the source domain.
For continuous observations, we propose a latent variable model specific to the data generation process at hand.
arXiv Detail & Related papers (2022-12-21T18:30:22Z)
- Identifiable Latent Causal Content for Domain Adaptation under Latent Covariate Shift [82.14087963690561]
Multi-source domain adaptation (MSDA) addresses the challenge of learning a label prediction function for an unlabeled target domain.
We present an intricate causal generative model by introducing latent noises across domains, along with a latent content variable and a latent style variable.
The proposed approach showcases exceptional performance and efficacy on both simulated and real-world datasets.
arXiv Detail & Related papers (2022-08-30T11:25:15Z)
- A unified framework for dataset shift diagnostics [2.449909275410288]
Supervised learning techniques typically assume training data originates from the target population.
Yet, dataset shift frequently arises and, if not adequately taken into account, may decrease the performance of the resulting predictors.
We propose a novel and flexible framework called DetectShift that quantifies and tests for multiple dataset shifts.
arXiv Detail & Related papers (2022-05-17T13:34:45Z)
- Self-training Avoids Using Spurious Features Under Domain Shift [54.794607791641745]
In unsupervised domain adaptation, conditional entropy minimization and pseudo-labeling work even when the domain shifts are much larger than those analyzed by existing theory.
We identify and analyze one particular setting where the domain shift can be large: certain spurious features correlate with the label in the source domain but are independent of the label in the target.
arXiv Detail & Related papers (2020-06-17T17:51:42Z)
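As a rough, self-contained illustration of the setting described in the last entry (my own toy construction, not the paper's experiments), the snippet below generates a source domain in which a spurious feature tracks the label and a target domain in which the same feature is independent of it.

```python
# Toy illustration of the spurious-feature setting above (assumed construction).
import numpy as np

rng = np.random.default_rng(0)

def make_domain(n, spurious_correlated):
    y = rng.integers(0, 2, size=n)
    core = y + rng.normal(0.0, 1.0, size=n)                  # truly predictive feature
    if spurious_correlated:
        spurious = 2.0 * y + rng.normal(0.0, 0.5, size=n)    # tracks the label (source)
    else:
        spurious = rng.normal(0.0, 0.5, size=n)               # ignores the label (target)
    return np.stack([core, spurious], axis=1), y

X_source, y_source = make_domain(1000, spurious_correlated=True)
X_target, y_target = make_domain(1000, spurious_correlated=False)
# A classifier that leans on the spurious feature does well on X_source but
# degrades on X_target; self-training on target data is one way to recover.
```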