Generalized Label Shift Correction via Minimum Uncertainty Principle:
Theory and Algorithm
- URL: http://arxiv.org/abs/2202.13043v1
- Date: Sat, 26 Feb 2022 02:39:47 GMT
- Title: Generalized Label Shift Correction via Minimum Uncertainty Principle:
Theory and Algorithm
- Authors: You-Wei Luo and Chuan-Xian Ren
- Abstract summary: Generalized Label Shift provides an insight into the learning and transfer of desirable knowledge.
We propose a conditional adaptation framework to deal with these challenges.
The results of extensive experiments demonstrate that the proposed model achieves competitive performance.
- Score: 20.361516866096007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a fundamental problem in machine learning, dataset shift induces a
paradigm for learning and transferring knowledge under changing environments.
Previous methods assume the changes are induced by covariate shift, which is less
practical for complex real-world data. We consider the Generalized Label Shift (GLS),
which provides an interpretable insight into the learning and transfer of
desirable knowledge. Current GLS methods: 1) are not well-connected with
statistical learning theory; 2) usually assume the shifting conditional
distributions will be matched by an implicit transformation, but its explicit
modeling is unexplored. In this paper, we propose a conditional adaptation
framework to deal with these challenges. From the perspective of learning
theory, we prove that the generalization error of conditional adaptation is
lower than that of previous covariate adaptation. Following the theoretical results, we
propose the minimum uncertainty principle to learn conditional invariant
transformation via discrepancy optimization. Specifically, we propose the
\textit{conditional metric operator} on Hilbert space to characterize the
distinctness of conditional distributions. For finite observations, we prove
that the empirical estimation is always well-defined and will converge to the
underlying truth as sample size increases. The results of extensive experiments
demonstrate that the proposed model achieves competitive performance under
different GLS scenarios.
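The conditional metric operator itself is specific to the paper, but its role, measuring how far apart class-conditional distributions are via kernel embeddings, can be illustrated with a simple per-class MMD estimate. The sketch below is an illustration under assumptions, not the authors' estimator; the function names and the RBF bandwidth are invented for the example:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF (Gaussian) kernel matrix between the rows of X and Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=1.0):
    """Biased empirical squared MMD between two samples."""
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())

def conditional_discrepancy(Xs, ys, Xt, yt, gamma=1.0):
    """Average per-class MMD^2 between source and target features:
    a simple proxy for the distinctness of conditional distributions."""
    classes = np.intersect1d(np.unique(ys), np.unique(yt))
    vals = [mmd2(Xs[ys == c], Xt[yt == c], gamma) for c in classes]
    return float(np.mean(vals))
```

A conditionally invariant transformation would drive this quantity toward zero on held-out features, which is the intuition behind learning the transformation via discrepancy optimization.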
Related papers
- Transformation-Invariant Learning and Theoretical Guarantees for OOD Generalization [34.036655200677664]
This paper focuses on a distribution shift setting where train and test distributions can be related by classes of (data) transformation maps.
We establish learning rules and algorithmic reductions to Empirical Risk Minimization (ERM).
We highlight that the learning rules we derive offer a game-theoretic viewpoint on distribution shift.
arXiv Detail & Related papers (2024-10-30T20:59:57Z)
- Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis [82.51626700527837]
Chain-of-Thought (CoT) is an efficient method that enables the reasoning ability of large language models by augmenting the query with examples containing multiple intermediate steps.
We show that direct prediction without intermediate steps can fail to provide accurate generalization even in cases where CoT succeeds.
arXiv Detail & Related papers (2024-10-03T03:12:51Z)
- When Invariant Representation Learning Meets Label Shift: Insufficiency and Theoretical Insights [16.72787996847537]
Generalized label shift (GLS) is a recently developed framework that shows great potential for dealing with the complex factors behind the shift.
Main results show the insufficiency of invariant representation learning, and prove the sufficiency and necessity of GLS correction for generalization.
We propose a kernel embedding-based correction algorithm (KECA) to minimize the generalization error and achieve successful knowledge transfer.
arXiv Detail & Related papers (2024-06-24T12:47:21Z) - Robust Distributed Learning: Tight Error Bounds and Breakdown Point
under Data Heterogeneity [11.2120847961379]
We consider in this paper a more realistic heterogeneity model, namely (G,B)-gradient dissimilarity, and show that it covers a larger class of learning problems than existing theory.
We also prove a new lower bound on the learning error of any distributed learning algorithm.
arXiv Detail & Related papers (2023-09-24T09:29:28Z) - Effect-Invariant Mechanisms for Policy Generalization [3.701112941066256]
It has been suggested to exploit invariant conditional distributions to learn models that generalize better to unseen environments.
We introduce a relaxation of full invariance called effect-invariance and prove that it is sufficient, under suitable assumptions, for zero-shot policy generalization.
We present empirical results using simulated data and a mobile health intervention dataset to demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-19T14:50:24Z) - Unleashing the Power of Graph Data Augmentation on Covariate
Distribution Shift [50.98086766507025]
We propose a simple-yet-effective data augmentation strategy, Adversarial Invariant Augmentation (AIA).
AIA aims to extrapolate and generate new environments, while concurrently preserving the original stable features during the augmentation process.
arXiv Detail & Related papers (2022-11-05T07:55:55Z) - Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z) - MaxMatch: Semi-Supervised Learning with Worst-Case Consistency [149.03760479533855]
We propose a worst-case consistency regularization technique for semi-supervised learning (SSL).
We present a generalization bound for SSL consisting of the empirical loss terms observed on labeled and unlabeled training data separately.
Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants.
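The worst-case objective can be sketched numerically: for each unlabeled sample, compare the prediction on the original view with the predictions on its K augmented variants and keep only the largest inconsistency. A minimal illustration, not the authors' implementation; using KL divergence as the inconsistency measure is an assumption borrowed from common SSL practice:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-12):
    # KL(p || q), computed row-wise over the class axis
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def worst_case_consistency(logits_orig, logits_augs):
    """Mean over samples of the LARGEST divergence between the original
    prediction and any of its K augmented variants.

    logits_orig: (N, C) logits for the original unlabeled samples.
    logits_augs: (K, N, C) logits for K augmentations of each sample.
    """
    p = softmax(logits_orig)                                   # (N, C)
    divs = np.stack([kl(p, softmax(q)) for q in logits_augs])  # (K, N)
    return float(divs.max(axis=0).mean())
```

Taking the maximum over augmentations, rather than the average, is what makes the regularizer "worst-case": training drives down the largest inconsistency per sample.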
arXiv Detail & Related papers (2022-09-26T12:04:49Z)
- Causal Discovery in Heterogeneous Environments Under the Sparse Mechanism Shift Hypothesis [7.895866278697778]
Machine learning approaches commonly rely on the assumption of independent and identically distributed (i.i.d.) data.
In reality, this assumption is almost always violated due to distribution shifts between environments.
We propose the Mechanism Shift Score (MSS), a score-based approach amenable to various empirical estimators.
arXiv Detail & Related papers (2022-06-04T15:39:30Z)
- Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap [64.60460828425502]
We propose a new guarantee on the downstream performance of contrastive learning.
Our new theory hinges on the insight that the support of different intra-class samples will become more overlapped under aggressive data augmentations.
We propose an unsupervised model selection metric ARC that aligns well with downstream accuracy.
arXiv Detail & Related papers (2022-03-25T05:36:26Z)
- A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
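For context, the classical two-step recipe that this one-step approach folds into a single optimization is: first estimate importance weights w(x) = p_test(x)/p_train(x) from the covariates, then minimize the weighted training loss. A minimal 1-D sketch of that baseline, not the paper's method; the Gaussian density-ratio model and function names are illustrative assumptions:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def importance_weights(x_train, x_test):
    """Estimate w(x) = p_test(x) / p_train(x), modeling each covariate
    distribution as a 1-D Gaussian (a deliberately crude estimator)."""
    w = gaussian_pdf(x_train, x_test.mean(), x_test.std()) \
        / gaussian_pdf(x_train, x_train.mean(), x_train.std())
    return w / w.mean()  # normalize so the weights average to 1

def weighted_linear_fit(x, y, w):
    """Weighted least squares for y ~ a*x + b; the weights reweight
    training samples toward the test covariate distribution."""
    A = np.stack([x, np.ones_like(x)], axis=1)
    a, b = np.linalg.solve(A.T @ (A * w[:, None]), A.T @ (w * y))
    return a, b
```

The two-step pipeline can compound errors from the weight-estimation stage; the paper's contribution is to learn the predictive model and the weights jointly in one optimization.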
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.