I-SPEC: An End-to-End Framework for Learning Transportable, Shift-Stable
Models
- URL: http://arxiv.org/abs/2002.08948v1
- Date: Thu, 20 Feb 2020 18:56:04 GMT
- Title: I-SPEC: An End-to-End Framework for Learning Transportable, Shift-Stable
Models
- Authors: Adarsh Subbaswamy, Suchi Saria
- Abstract summary: Shifts in environment between development and deployment cause classical supervised learning to produce models that fail to generalize to new target distributions.
We propose I-SPEC, an end-to-end framework that addresses this shortcoming by using data to learn a partial ancestral graph (PAG).
We apply I-SPEC to a mortality prediction problem to show it can learn a model that is robust to shifts without needing upfront knowledge of the full causal DAG.
- Score: 6.802401545890963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Shifts in environment between development and deployment cause classical
supervised learning to produce models that fail to generalize well to new
target distributions. Recently, many solutions which find invariant predictive
distributions have been developed. Among these, graph-based approaches do not
require data from the target environment and can capture more stable
information than alternative methods which find stable feature sets. However,
these approaches assume that the data generating process is known in the form
of a full causal graph, which is generally not the case. In this paper, we
propose I-SPEC, an end-to-end framework that addresses this shortcoming by
using data to learn a partial ancestral graph (PAG). Using the PAG we develop
an algorithm that determines an interventional distribution that is stable to
the declared shifts; this subsumes existing approaches which find stable
feature sets that are less accurate. We apply I-SPEC to a mortality prediction
problem to show it can learn a model that is robust to shifts without needing
upfront knowledge of the full causal DAG.
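The failure mode the abstract describes can be illustrated with a small synthetic sketch. This is not I-SPEC itself: no PAG is learned and no interventional distribution is estimated. It only contrasts a model fit on all features with one restricted to a stable feature set. The data-generating process, the coefficient `a`, and all function names are assumptions made for illustration: `x1` causes `y` through a fixed mechanism, while `x2` is generated from `y` through a mechanism whose sign flips between the source and target environments.

```python
import random

def simulate(n, a, rng):
    """Draw n samples; `a` sets the (shifting) mechanism that generates x2 from y."""
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)                # stable cause of y
        y = x1 + rng.gauss(0, 0.5)          # mechanism of y is the same everywhere
        x2 = a * y + rng.gauss(0, 0.5)      # mechanism of x2 differs across environments
        rows.append((x1, x2, y))
    return rows

def fit_stable(rows):
    """Least squares through the origin using only the stable feature x1."""
    sxx = sum(x1 * x1 for x1, _, _ in rows)
    sxy = sum(x1 * y for x1, _, y in rows)
    return (sxy / sxx, 0.0)

def fit_full(rows):
    """Least squares through the origin on (x1, x2) via the 2x2 normal equations."""
    s11 = sum(x1 * x1 for x1, _, _ in rows)
    s12 = sum(x1 * x2 for x1, x2, _ in rows)
    s22 = sum(x2 * x2 for _, x2, _ in rows)
    s1y = sum(x1 * y for x1, _, y in rows)
    s2y = sum(x2 * y for _, x2, y in rows)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return (b1, b2)

def mse(coef, rows):
    """Mean squared prediction error of the linear model `coef` on `rows`."""
    b1, b2 = coef
    return sum((y - b1 * x1 - b2 * x2) ** 2 for x1, x2, y in rows) / len(rows)

rng = random.Random(0)
source = simulate(5000, 1.0, rng)    # development environment
target = simulate(5000, -1.0, rng)   # deployment environment with a shifted x2 mechanism

full = fit_full(source)
stable = fit_stable(source)

for name, coef in (("full", full), ("stable", stable)):
    print(name, "source MSE:", round(mse(coef, source), 3),
          "target MSE:", round(mse(coef, target), 3))
```

In this toy setting the full model is more accurate in the source environment but degrades sharply under the shift, while the model restricted to `x1` performs about the same in both. The abstract's point is that graph-based methods can retain more stable information than such feature-set restrictions; the sketch does not attempt to show that part.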
Related papers
- Theoretically Guaranteed Distribution Adaptable Learning [23.121014921407898]
We propose a novel framework called Distribution Adaptable Learning (DAL).
DAL enables the model to effectively track the evolving data distributions.
This enhances the reusability and evolvability of DAL when accommodating evolving distributions.
arXiv Detail & Related papers (2024-11-05T09:10:39Z)
- Optimal Classification under Performative Distribution Shift [13.508249764979075]
We propose a novel view in which performative effects are modelled as push-forward measures.
We prove the convexity of the performative risk under a new set of assumptions.
We also establish a connection with adversarially robust classification by reformulating the minimization of the performative risk as a min-max variational problem.
arXiv Detail & Related papers (2024-11-04T12:20:13Z)
- DeCaf: A Causal Decoupling Framework for OOD Generalization on Node Classification [14.96980804513399]
Graph Neural Networks (GNNs) are susceptible to distribution shifts, creating vulnerability and security issues in critical domains.
Existing methods that target learning an invariant (feature, structure)-label mapping often depend on oversimplified assumptions about the data generation process.
We introduce a more realistic graph data generation model using Structural Causal Models (SCMs).
We propose a causal decoupling framework, DeCaf, that independently learns unbiased feature-label and structure-label mappings.
arXiv Detail & Related papers (2024-10-27T00:22:18Z)
- Empowering Graph Invariance Learning with Deep Spurious Infomax [27.53568333416706]
We introduce a novel graph invariance learning paradigm, which induces a robust and general inductive bias.
EQuAD shows stable and enhanced performance across different degrees of bias in synthetic datasets and challenging real-world datasets, with improvements of up to 31.76%.
arXiv Detail & Related papers (2024-07-13T14:18:47Z)
- SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning [49.94607673097326]
We propose a highly adaptable framework, designated as SimPro, which does not rely on any predefined assumptions about the distribution of unlabeled data.
Our framework, grounded in a probabilistic model, innovatively refines the expectation-maximization algorithm.
Our method showcases consistent state-of-the-art performance across diverse benchmarks and data distribution scenarios.
arXiv Detail & Related papers (2024-02-21T03:39:04Z)
- Unleashing the Power of Graph Data Augmentation on Covariate Distribution Shift [50.98086766507025]
We propose a simple-yet-effective data augmentation strategy, Adversarial Invariant Augmentation (AIA).
AIA aims to extrapolate and generate new environments, while concurrently preserving the original stable features during the augmentation process.
arXiv Detail & Related papers (2022-11-05T07:55:55Z)
- Uncertainty-guided Source-free Domain Adaptation [77.3844160723014]
Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabelled target data set by only using a pre-trained source model.
We propose quantifying the uncertainty in the source model predictions and utilizing it to guide the target adaptation.
arXiv Detail & Related papers (2022-08-16T08:03:30Z)
- Handling Distribution Shifts on Graphs: An Invariance Perspective [78.31180235269035]
We formulate the OOD problem on graphs and develop a new invariant learning approach, Explore-to-Extrapolate Risk Minimization (EERM).
EERM resorts to multiple context explorers that are adversarially trained to maximize the variance of risks from multiple virtual environments.
We prove the validity of our method by theoretically showing its guarantee of a valid OOD solution.
arXiv Detail & Related papers (2022-02-05T02:31:01Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Identification of Latent Variables From Graphical Model Residuals [0.0]
We present a novel method to control for the latent space when estimating a DAG by iteratively deriving proxies for the latent space from the residuals of the inferred model.
We show that any improvement of prediction of an outcome is intrinsically capped and cannot rise beyond a certain limit as compared to the confounded model.
arXiv Detail & Related papers (2021-01-07T02:28:49Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
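The Average Thresholded Confidence (ATC) entry in the list above describes a procedure concrete enough to sketch. The following is a minimal illustration, not the authors' code: the function names are hypothetical, the confidence score is assumed to be a single number per example (for instance the maximum predicted probability), and the threshold is taken as the midpoint between two ranked source confidences so that the share of source examples above it matches the source accuracy. Ties in confidence can make that match inexact.

```python
def fit_atc_threshold(source_conf, source_correct):
    """Choose t so the fraction of labeled source examples with confidence > t
    equals the model's accuracy on those examples."""
    n = len(source_conf)
    n_correct = sum(1 for c in source_correct if c)
    if n_correct == 0:
        return float("inf")    # nothing should count as confident
    if n_correct == n:
        return float("-inf")   # everything should count as confident
    ranked = sorted(source_conf, reverse=True)
    # Midpoint between the n_correct-th and (n_correct + 1)-th highest confidence.
    return (ranked[n_correct - 1] + ranked[n_correct]) / 2.0

def estimate_target_accuracy(target_conf, threshold):
    """Predicted accuracy: fraction of unlabeled target examples above the threshold."""
    return sum(1 for c in target_conf if c > threshold) / len(target_conf)

# Labeled source validation data: confidences and whether each prediction was correct.
source_conf = [0.9, 0.8, 0.7, 0.6, 0.55]
source_correct = [True, True, True, False, False]

# Unlabeled target data: only confidences are available.
target_conf = [0.95, 0.5, 0.7, 0.6]

threshold = fit_atc_threshold(source_conf, source_correct)
estimate = estimate_target_accuracy(target_conf, threshold)
print("threshold:", threshold, "estimated target accuracy:", estimate)
```

Here three of five source predictions are correct, so the threshold falls between the third and fourth highest source confidences, and two of the four target confidences exceed it.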
This list is automatically generated from the titles and abstracts of the papers in this site.