Which Invariance Should We Transfer? A Causal Minimax Learning Approach
- URL: http://arxiv.org/abs/2107.01876v5
- Date: Tue, 30 May 2023 13:37:27 GMT
- Title: Which Invariance Should We Transfer? A Causal Minimax Learning Approach
- Authors: Mingzhou Liu, Xiangyu Zheng, Xinwei Sun, Fang Fang, Yizhou Wang
- Abstract summary: We present a comprehensive minimax analysis from a causal perspective.
We propose an efficient algorithm to search for the subset with minimal worst-case risk.
The effectiveness and efficiency of our methods are demonstrated on synthetic data and the diagnosis of Alzheimer's disease.
- Score: 18.71316951734806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A major barrier to deploying current machine learning models lies in their
unreliability under dataset shifts. To resolve this problem, most existing
studies attempt to transfer stable information to unseen environments.
In particular, methods based on independent causal mechanisms propose to remove
mutable causal mechanisms via the do-operator. Compared to previous methods,
the obtained stable predictors are more effective in identifying stable
information. However, a key question remains: which subset of this whole stable
information should the model transfer, in order to achieve optimal
generalization ability? To answer this question, we present a comprehensive
minimax analysis from a causal perspective. Specifically, we first provide a
graphical condition for the whole stable set to be optimal. When this condition
fails, we surprisingly find with an example that this whole stable set,
although it fully exploits the stable information, is not the optimal one to
transfer. To identify the optimal subset in this case, we propose to
estimate the worst-case risk with a novel optimization scheme over the
intervention functions on mutable causal mechanisms. We then propose an
efficient algorithm to search for the subset with minimal worst-case risk,
based on a newly defined equivalence relation between stable subsets. Compared
to the exponential cost of exhaustively searching over all subsets, our
search strategy enjoys polynomial complexity. The effectiveness and
efficiency of our methods are demonstrated on synthetic data and the diagnosis
of Alzheimer's disease.
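The search procedure in the abstract, which is to scan candidate subsets of the stable set and keep the one with minimal worst-case risk, can be sketched as follows. This is a minimal illustration of the exhaustive baseline the paper improves upon, with a hypothetical `risk_fn` and intervention set standing in for the paper's actual worst-case risk estimator:

```python
import itertools

def worst_case_risk(subset, risk_fn, interventions):
    """Estimate the worst-case risk of a predictor built on `subset` by
    maximizing risk over candidate interventions on mutable mechanisms."""
    return max(risk_fn(subset, g) for g in interventions)

def select_stable_subset(stable_vars, risk_fn, interventions):
    """Exhaustive baseline: scan all non-empty subsets of the stable set
    and return the one with minimal worst-case risk. This costs O(2^n);
    the paper's algorithm prunes the scan to polynomial complexity via an
    equivalence relation between stable subsets."""
    best, best_risk = None, float("inf")
    for r in range(1, len(stable_vars) + 1):
        for subset in itertools.combinations(stable_vars, r):
            risk = worst_case_risk(subset, risk_fn, interventions)
            if risk < best_risk:
                best, best_risk = subset, risk
    return best, best_risk
```

The pruning idea is that two subsets in the same equivalence class share the same worst-case risk, so only one representative per class needs to be evaluated.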
Related papers
- Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations [80.86128012438834]
We show for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete.
We propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees.
arXiv Detail & Related papers (2024-07-10T09:13:11Z)
- Non-Convex Robust Hypothesis Testing using Sinkhorn Uncertainty Sets [18.46110328123008]
We present a new framework to address the non-robust hypothesis testing problem.
The goal is to seek the optimal detector that minimizes the maximum numerical risk.
arXiv Detail & Related papers (2024-03-21T20:29:43Z)
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
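The minimax objective in multi-distribution learning, minimizing the worst-case risk over k distributions, can be illustrated with a toy one-parameter example (hypothetical data, not the paper's algorithm):

```python
import numpy as np

# Each of the k = 3 hypothetical distributions is summarized by the mean
# of its target variable; the shared model is a single scalar predictor.
dist_means = np.array([0.0, 1.0, 4.0])

def worst_case_mse(theta, means):
    """Worst-case (maximum) mean squared error across the distributions."""
    return np.max((theta - means) ** 2)

# Grid search for the minimax predictor.
thetas = np.linspace(-1.0, 5.0, 601)
risks = [worst_case_mse(t, dist_means) for t in thetas]
theta_star = thetas[int(np.argmin(risks))]
# The minimax solution sits midway between the extreme means: (0 + 4) / 2 = 2.
```

Note how the minimax predictor ignores the middle distribution entirely; only the extremes determine the worst case, which is what makes the problem harder than averaging over the k distributions.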
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
- Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift [34.414261291690856]
A key challenge for modern machine learning systems is achieving Out-of-Distribution (OOD) generalization.
We show that classical Maximum Likelihood Estimation (MLE) purely using source data achieves the minimax optimality.
We illustrate the wide applicability of our framework by instantiating it to three concrete examples.
arXiv Detail & Related papers (2023-11-27T16:06:48Z)
- Domain Adaptation under Missingness Shift [38.650099178537864]
We introduce the problem of Domain Adaptation under Missingness Shift (DAMS).
Rates of missing data often depend on record-keeping policies and thus may change across times and locations.
In experiments on synthetic and semi-synthetic data, we demonstrate the promise of our methods when assumptions hold.
arXiv Detail & Related papers (2022-11-03T18:49:38Z)
- Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets [101.5329678997916]
We study episodic two-player zero-sum Markov games (MGs) in the offline setting.
The goal is to find an approximate Nash equilibrium (NE) policy pair based on a dataset collected a priori.
arXiv Detail & Related papers (2022-02-15T15:39:30Z)
- Complexity-Free Generalization via Distributionally Robust Optimization [4.313143197674466]
We present an alternate route to obtaining generalization bounds on the solution of distributionally robust optimization (DRO).
Our DRO bounds depend on the ambiguity set geometry and its compatibility with the true loss function.
Notably, when using maximum mean discrepancy as a DRO distance metric, our analysis implies, to the best of our knowledge, the first generalization bound in the literature that depends solely on the true loss function.
arXiv Detail & Related papers (2021-06-21T15:19:52Z)
- Navigating to the Best Policy in Markov Decision Processes [68.8204255655161]
We investigate the active pure exploration problem in Markov Decision Processes.
The agent sequentially selects actions and, from the resulting system trajectory, aims to identify the best policy as fast as possible.
arXiv Detail & Related papers (2021-06-05T09:16:28Z)
- Evaluating Model Robustness and Stability to Dataset Shift [7.369475193451259]
We propose a framework for analyzing stability of machine learning models.
We use the original evaluation data to determine distributions under which the algorithm performs poorly.
We estimate the algorithm's performance on the "worst-case" distribution.
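Worst-case evaluation of this kind can be sketched with a simple subgroup-based proxy (an assumed setup for illustration, not the paper's exact procedure): among all subgroups covering at least a given fraction of the evaluation data, the highest average loss is attained by the hardest examples.

```python
import numpy as np

def worst_subgroup_loss(losses, min_frac=0.2):
    """Report the highest average loss over any subgroup containing at
    least `min_frac` of the evaluation examples; that worst subgroup is
    exactly the `min_frac`-sized set of hardest examples."""
    losses = np.sort(losses)[::-1]           # hardest examples first
    k = max(1, int(np.ceil(min_frac * len(losses))))
    return losses[:k].mean()

# Per-example losses on a toy evaluation set.
losses = np.array([0.1, 0.2, 0.9, 0.05, 0.8])
```

A model can look fine on average while this worst-subgroup loss is large, which is the gap such stability analyses are designed to expose.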
arXiv Detail & Related papers (2020-10-28T17:35:39Z)
- The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization (IRM) is an objective for learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective, as well as its recently proposed alternatives, under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution, which is precisely the issue it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z)
- Distributionally Robust Bayesian Optimization [121.71766171427433]
We present a novel distributionally robust Bayesian optimization algorithm (DRBO) for zeroth-order, noisy optimization.
Our algorithm provably obtains sub-linear robust regret in various settings.
We demonstrate the robust performance of our method on both synthetic and real-world benchmarks.
arXiv Detail & Related papers (2020-02-20T22:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.