Which Invariance Should We Transfer? A Causal Minimax Learning Approach
- URL: http://arxiv.org/abs/2107.01876v5
- Date: Tue, 30 May 2023 13:37:27 GMT
- Title: Which Invariance Should We Transfer? A Causal Minimax Learning Approach
- Authors: Mingzhou Liu, Xiangyu Zheng, Xinwei Sun, Fang Fang, Yizhou Wang
- Abstract summary: We present a comprehensive minimax analysis from a causal perspective.
We propose an efficient algorithm to search for the subset with minimal worst-case risk.
The effectiveness and efficiency of our methods are demonstrated on synthetic data and the diagnosis of Alzheimer's disease.
- Score: 18.71316951734806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A major barrier to deploying current machine learning models lies in their
unreliability under dataset shifts. To resolve this problem, most existing
studies attempt to transfer stable information to unseen environments.
In particular, methods based on independent causal mechanisms propose to remove
mutable causal mechanisms via the do-operator. Compared to previous methods,
the obtained stable predictors are more effective in identifying stable
information. However, a key question remains: which subset of this whole stable
information should the model transfer, in order to achieve optimal
generalization ability? To answer this question, we present a comprehensive
minimax analysis from a causal perspective. Specifically, we first provide a
graphical condition for the whole stable set to be optimal. When this condition
fails, we surprisingly find with an example that this whole stable set,
although it fully exploits the stable information, is not the optimal one to
transfer. To identify the optimal subset in this case, we propose to
estimate the worst-case risk with a novel optimization scheme over the
intervention functions on mutable causal mechanisms. We then propose an
efficient algorithm to search for the subset with minimal worst-case risk,
based on a newly defined equivalence relation between stable subsets. Compared
to the exponential cost of exhaustively searching over all subsets, our
search strategy enjoys polynomial complexity. The effectiveness and
efficiency of our methods are demonstrated on synthetic data and the diagnosis
of Alzheimer's disease.
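The search procedure in the abstract, which is to scan candidate subsets of the stable set and keep the one with minimal worst-case risk, can be sketched as follows. This is a minimal illustration of the exhaustive baseline the paper improves upon, with a hypothetical `risk_fn` and intervention set standing in for the paper's actual worst-case risk estimator:

```python
import itertools

def worst_case_risk(subset, risk_fn, interventions):
    """Estimate the worst-case risk of a predictor built on `subset` by
    maximizing risk over candidate interventions on mutable mechanisms."""
    return max(risk_fn(subset, g) for g in interventions)

def select_stable_subset(stable_vars, risk_fn, interventions):
    """Exhaustive baseline: scan all non-empty subsets of the stable set
    and return the one with minimal worst-case risk. This costs O(2^n);
    the paper's algorithm prunes the scan to polynomial complexity via an
    equivalence relation between stable subsets."""
    best, best_risk = None, float("inf")
    for r in range(1, len(stable_vars) + 1):
        for subset in itertools.combinations(stable_vars, r):
            risk = worst_case_risk(subset, risk_fn, interventions)
            if risk < best_risk:
                best, best_risk = subset, risk
    return best, best_risk
```

The pruning idea is that two subsets in the same equivalence class share the same worst-case risk, so only one representative per class needs to be evaluated.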
Related papers
- Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations [80.86128012438834]
We show for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete.
We propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees.
arXiv Detail & Related papers (2024-07-10T09:13:11Z)
- Non-Convex Robust Hypothesis Testing using Sinkhorn Uncertainty Sets [18.46110328123008]
We present a new framework to address the non-robust hypothesis testing problem.
The goal is to seek the optimal detector that minimizes the maximum numerical risk.
arXiv Detail & Related papers (2024-03-21T20:29:43Z)
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
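The minimax objective in multi-distribution learning, minimizing the worst-case risk over k distributions, can be illustrated with a toy one-parameter example (hypothetical data, not the paper's algorithm):

```python
import numpy as np

# Each of the k = 3 hypothetical distributions is summarized by the mean
# of its target variable; the shared model is a single scalar predictor.
dist_means = np.array([0.0, 1.0, 4.0])

def worst_case_mse(theta, means):
    """Worst-case (maximum) mean squared error across the distributions."""
    return np.max((theta - means) ** 2)

# Grid search for the minimax predictor.
thetas = np.linspace(-1.0, 5.0, 601)
risks = [worst_case_mse(t, dist_means) for t in thetas]
theta_star = thetas[int(np.argmin(risks))]
# The minimax solution sits midway between the extreme means: (0 + 4) / 2 = 2.
```

Note how the minimax predictor ignores the middle distribution entirely; only the extremes determine the worst case, which is what makes the problem harder than averaging over the k distributions.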
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
- Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift [34.414261291690856]
A key challenge for modern machine learning systems is achieving Out-of-Distribution (OOD) generalization.
We show that classical Maximum Likelihood Estimation (MLE) purely using source data achieves the minimax optimality.
We illustrate the wide applicability of our framework by instantiating it to three concrete examples.
arXiv Detail & Related papers (2023-11-27T16:06:48Z)
- Domain Adaptation under Missingness Shift [38.650099178537864]
We introduce the problem of Domain Adaptation under Missingness Shift (DAMS).
Rates of missing data often depend on record-keeping policies and thus may change across times and locations.
In experiments on synthetic and semi-synthetic data, we demonstrate the promise of our methods when assumptions hold.
arXiv Detail & Related papers (2022-11-03T18:49:38Z)
- Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets [101.5329678997916]
We study episodic two-player zero-sum Markov games (MGs) in the offline setting.
The goal is to find an approximate Nash equilibrium (NE) policy pair based on a dataset collected a priori.
arXiv Detail & Related papers (2022-02-15T15:39:30Z)
- Complexity-Free Generalization via Distributionally Robust Optimization [4.313143197674466]
We present an alternate route to obtaining generalization bounds on the solution of distributionally robust optimization (DRO).
Our DRO bounds depend on the ambiguity set geometry and its compatibility with the true loss function.
Notably, when using maximum mean discrepancy as a DRO distance metric, our analysis implies, to the best of our knowledge, the first generalization bound in the literature that depends solely on the true loss function.
arXiv Detail & Related papers (2021-06-21T15:19:52Z)
- Navigating to the Best Policy in Markov Decision Processes [68.8204255655161]
We investigate the active pure exploration problem in Markov Decision Processes.
The agent sequentially selects actions and, from the resulting system trajectory, aims to identify the best policy as fast as possible.
arXiv Detail & Related papers (2021-06-05T09:16:28Z)
- Evaluating Model Robustness and Stability to Dataset Shift [7.369475193451259]
We propose a framework for analyzing stability of machine learning models.
We use the original evaluation data to determine distributions under which the algorithm performs poorly.
We estimate the algorithm's performance on the "worst-case" distribution.
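Worst-case evaluation of this kind can be sketched with a simple subgroup-based proxy (an assumed setup for illustration, not the paper's exact procedure): among all subgroups covering at least a given fraction of the evaluation data, the highest average loss is attained by the hardest examples.

```python
import numpy as np

def worst_subgroup_loss(losses, min_frac=0.2):
    """Report the highest average loss over any subgroup containing at
    least `min_frac` of the evaluation examples; that worst subgroup is
    exactly the `min_frac`-sized set of hardest examples."""
    losses = np.sort(losses)[::-1]           # hardest examples first
    k = max(1, int(np.ceil(min_frac * len(losses))))
    return losses[:k].mean()

# Per-example losses on a toy evaluation set.
losses = np.array([0.1, 0.2, 0.9, 0.05, 0.8])
```

A model can look fine on average while this worst-subgroup loss is large, which is the gap such stability analyses are designed to expose.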
arXiv Detail & Related papers (2020-10-28T17:35:39Z)
- The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization (IRM) is an objective for learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective, as well as its recently proposed alternatives, under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution, which is precisely the issue it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z)
- Distributionally Robust Bayesian Optimization [121.71766171427433]
We present a novel distributionally robust Bayesian optimization algorithm (DRBO) for zeroth-order, noisy optimization.
Our algorithm provably obtains sub-linear robust regret in various settings.
We demonstrate the robust performance of our method on both synthetic and real-world benchmarks.
arXiv Detail & Related papers (2020-02-20T22:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.