Stable Learning via Sparse Variable Independence
- URL: http://arxiv.org/abs/2212.00992v1
- Date: Fri, 2 Dec 2022 05:59:30 GMT
- Title: Stable Learning via Sparse Variable Independence
- Authors: Han Yu, Peng Cui, Yue He, Zheyan Shen, Yong Lin, Renzhe Xu, Xingxuan
Zhang
- Abstract summary: We propose SVI (Sparse Variable Independence) for the covariate-shift generalization problem.
We introduce a sparsity constraint to compensate for the imperfection of sample reweighting in the finite-sample setting.
Experiments on both synthetic and real-world datasets demonstrate the improvement brought by SVI.
- Score: 41.632242102167844
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of covariate-shift generalization has attracted intensive
research attention. Previous stable learning algorithms employ sample
reweighting schemes to decorrelate the covariates when there is no explicit
domain information about the training data. However, with finite samples, it
is difficult to achieve weights that ensure perfect independence and thereby
eliminate the unstable variables. Moreover, decorrelating within the stable
variables may induce high variance in the learned models because the effective
sample size is over-reduced, so these algorithms require very large sample
sizes to work. In this paper, with theoretical justification, we propose SVI
(Sparse Variable Independence) for the covariate-shift generalization problem.
We introduce a sparsity constraint to compensate for the imperfection of
sample reweighting under the finite-sample setting in previous methods.
Furthermore, we combine independence-based sample reweighting and
sparsity-based variable selection in an iterative way to avoid decorrelating
within the stable variables, which increases the effective sample size and
alleviates variance inflation. Experiments on both synthetic and real-world
datasets demonstrate the improvement in covariate-shift generalization
performance brought by SVI.
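To make the iterative scheme described in the abstract concrete, below is a minimal, hypothetical Python sketch of how independence-based sample reweighting and sparsity-based variable selection could alternate. A Frobenius-norm penalty on the off-diagonal of the weighted covariance matrix stands in for the paper's independence criterion, and a weighted Lasso stands in for its sparsity constraint; the names `decorrelation_weights` and `svi_sketch` and all hyperparameters are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def decorrelation_weights(X, n_iters=200, lr=1e-3):
    # Learn nonnegative sample weights that shrink the off-diagonal
    # entries of the weighted covariance matrix; this simple penalty
    # is a stand-in for independence-based reweighting.
    n, _ = X.shape
    w = np.ones(n) / n
    for _ in range(n_iters):
        Xc = X - w @ X                      # weighted mean-centering
        cov = Xc.T @ (w[:, None] * Xc)      # weighted covariance
        off = cov - np.diag(np.diag(cov))   # off-diagonal part
        # Approximate gradient of ||off||_F^2 w.r.t. w (treating the
        # weighted mean as fixed for simplicity).
        grad = np.einsum('ij,ni,nj->n', off, Xc, Xc)
        w = np.clip(w - lr * grad, 1e-8, None)
        w /= w.sum()                        # stay on the simplex
    return w * n                            # rescale to mean 1

def svi_sketch(X, y, n_rounds=5, alpha=0.1):
    # Alternate (i) reweighting to decorrelate the active covariates and
    # (ii) weighted Lasso selection, shrinking the active set each round
    # so that later rounds no longer decorrelate within the kept
    # (presumably stable) variables.
    active = np.arange(X.shape[1])
    model = Lasso(alpha=alpha)
    for _ in range(n_rounds):
        Xa = X[:, active]
        w = decorrelation_weights(Xa)
        model.fit(Xa, y, sample_weight=w)
        keep = np.flatnonzero(np.abs(model.coef_) > 1e-6)
        if keep.size == 0 or keep.size == active.size:
            break                           # selection has stabilized
        active = active[keep]
    return active, model

# Toy check: 10 covariates, only the first 3 carry a stable signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
selected, model = svi_sketch(X, y)
print(selected)  # ideally a subset containing 0, 1, 2
```

The point of the iteration is visible in `svi_sketch`: once unstable variables drop out of `active`, the reweighting step no longer spends effective sample size decorrelating within the surviving variables, which is the variance-inflation issue the abstract describes.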
Related papers
- High-dimensional logistic regression with missing data: Imputation, regularization, and universality [7.167672851569787]
We study high-dimensional, ridge-regularized logistic regression.
We provide exact characterizations of both the prediction error and the estimation error.
arXiv Detail & Related papers (2024-10-01T21:41:21Z)
- ROTI-GCV: Generalized Cross-Validation for right-ROTationally Invariant Data [1.194799054956877]
Two key tasks in high-dimensional regularized regression are tuning the regularization strength for accurate predictions and estimating the out-of-sample risk.
We introduce a new framework, ROTI-GCV, for reliably performing cross-validation under challenging conditions.
arXiv Detail & Related papers (2024-06-17T15:50:00Z)
- Invariant Anomaly Detection under Distribution Shifts: A Causal Perspective [6.845698872290768]
Anomaly detection (AD) is the machine learning task of identifying highly discrepant abnormal samples.
Under distribution shift, the assumption that training and test samples are drawn from the same distribution breaks down.
We attempt to increase the resilience of anomaly detection models to different kinds of distribution shifts.
arXiv Detail & Related papers (2023-12-21T23:20:47Z)
- Anomaly Detection with Variance Stabilized Density Estimation [49.46356430493534]
We formulate a variance-stabilized density estimation problem that maximizes the likelihood of the observed samples.
To obtain a reliable anomaly detector, we introduce a spectral ensemble of autoregressive models for learning the variance-stabilized distribution.
We have conducted an extensive benchmark with 52 datasets, demonstrating that our method leads to state-of-the-art results.
arXiv Detail & Related papers (2023-06-01T11:52:58Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Breaking the Spurious Causality of Conditional Generation via Fairness Intervention with Corrective Sampling [77.15766509677348]
Conditional generative models often inherit spurious correlations from the training dataset.
This can result in label-conditional distributions that are imbalanced with respect to another latent attribute.
We propose a general two-step strategy to mitigate this issue.
arXiv Detail & Related papers (2022-12-05T08:09:33Z)
- Unleashing the Power of Graph Data Augmentation on Covariate Distribution Shift [50.98086766507025]
We propose a simple-yet-effective data augmentation strategy, Adversarial Invariant Augmentation (AIA).
AIA aims to extrapolate and generate new environments, while concurrently preserving the original stable features during the augmentation process.
arXiv Detail & Related papers (2022-11-05T07:55:55Z)
- Selectively increasing the diversity of GAN-generated samples [8.980453507536017]
We propose a novel method to selectively increase the diversity of GAN-generated samples.
We show the superiority of our method on a synthetic benchmark as well as in a real-life scenario simulating data from the Zero Degree Calorimeter of the ALICE experiment at CERN.
arXiv Detail & Related papers (2022-07-04T16:27:06Z)
- Stable Prediction via Leveraging Seed Variable [73.9770220107874]
Previous machine learning methods may exploit subtle spurious correlations in training data induced by non-causal variables for prediction.
We propose a conditional-independence-test-based algorithm that separates causal variables using a seed variable as a prior, and adopts them for stable prediction.
Our algorithm outperforms state-of-the-art methods for stable prediction.
arXiv Detail & Related papers (2020-06-09T06:56:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.