Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis
- URL: http://arxiv.org/abs/2405.09516v1
- Date: Wed, 15 May 2024 17:17:27 GMT
- Title: Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis
- Authors: Daniel Csillag, Claudio José Struchiner, Guilherme Tegoni Goedert
- Abstract summary: We propose a theory based on generalization bounds that provides finite-sample guarantees for causal machine learning.
By introducing a novel change-of-measure inequality, we are able to tightly bound the model loss.
We demonstrate our bounds on semi-synthetic and real data, showcasing their remarkable tightness and practical utility.
- Score: 0.66567375919026
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many algorithms have been recently proposed for causal machine learning. Yet, there is little to no theory on their quality, especially considering finite samples. In this work, we propose a theory based on generalization bounds that provides such guarantees. By introducing a novel change-of-measure inequality, we are able to tightly bound the model loss in terms of the deviation of the treatment propensities over the population, which we show can be empirically limited. Our theory is fully rigorous and holds even in the face of hidden confounding and violations of positivity. We demonstrate our bounds on semi-synthetic and real data, showcasing their remarkable tightness and practical utility.
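For intuition, the kind of step the abstract alludes to can be illustrated with a textbook change-of-measure bound (shown for orientation only; the paper's novel inequality is stated to be tighter and to survive positivity violations):

```latex
% A standard change-of-measure step via Hölder's inequality:
% shift an expectation from the observed distribution P to a
% target distribution Q at the price of the density ratio dQ/dP.
\[
  \mathbb{E}_{Q}\bigl[\ell(f(X),Y)\bigr]
  = \mathbb{E}_{P}\!\left[\frac{dQ}{dP}(X,Y)\,\ell(f(X),Y)\right]
  \;\le\; \Bigl\|\frac{dQ}{dP}\Bigr\|_{\infty}\,
          \mathbb{E}_{P}\bigl[\ell(f(X),Y)\bigr].
\]
```

In the causal setting the density ratio is driven by the treatment propensities e(x) = P(T=1 | X=x), so this naive bound degenerates as e(x) approaches 0. Per the abstract, the paper's inequality instead depends on the deviation of the propensities over the population, which is why it can remain informative under hidden confounding and violations of positivity.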
Related papers
- Uncertainty Regularized Evidential Regression [5.874234972285304]
The Evidential Regression Network (ERN) is a novel approach that integrates deep learning with Dempster-Shafer theory.
Specific activation functions must be employed to enforce non-negative values, a constraint that compromises model performance.
This paper provides a theoretical analysis of this limitation and introduces an improvement to overcome it.
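To make the constraint concrete, here is a minimal sketch of an evidential regression head (an illustration, not the paper's implementation; the class and parameter names are assumptions). In the common Normal-Inverse-Gamma parameterization, softplus activations force the evidential parameters to be non-negative:

```python
import torch.nn as nn
import torch.nn.functional as F

class EvidentialHead(nn.Module):
    """Minimal evidential regression head (illustrative sketch).

    Emits the Normal-Inverse-Gamma parameters (gamma, nu, alpha, beta).
    Softplus enforces the non-negativity constraints that, per the
    summary above, can compromise model performance.
    """
    def __init__(self, in_features: int):
        super().__init__()
        self.fc = nn.Linear(in_features, 4)

    def forward(self, x):
        gamma, nu_raw, alpha_raw, beta_raw = self.fc(x).unbind(dim=-1)
        nu = F.softplus(nu_raw)            # nu > 0
        alpha = F.softplus(alpha_raw) + 1  # alpha > 1
        beta = F.softplus(beta_raw)        # beta > 0
        return gamma, nu, alpha, beta
```

The sketch only shows where the hard non-negativity activation enters; the cited paper analyzes how exactly such activations limit learning and how to overcome that.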
arXiv Detail & Related papers (2024-01-03T01:18:18Z)
- Robust Distributed Learning: Tight Error Bounds and Breakdown Point under Data Heterogeneity [11.2120847961379]
We consider in this paper a more realistic heterogeneity model, namely (G,B)-gradient dissimilarity, and show that it covers a larger class of learning problems than existing theory.
We also prove a new lower bound on the learning error of any distributed learning algorithm.
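For reference, (G,B)-gradient dissimilarity is commonly defined as follows (a standard formulation, assumed here to match the paper's usage), where L_i is the local loss of worker i and L is their average:

```latex
\[
  \frac{1}{n}\sum_{i=1}^{n}
    \bigl\|\nabla L_i(\theta) - \nabla L(\theta)\bigr\|^{2}
  \;\le\; G^{2} + B^{2}\,\bigl\|\nabla L(\theta)\bigr\|^{2}
  \qquad \text{for all } \theta.
\]
```

Taking B = 0 recovers the uniformly bounded heterogeneity model, which is one way to see that the (G,B) class is strictly larger.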
arXiv Detail & Related papers (2023-09-24T09:29:28Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
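A rough way to probe the quantity such bounds depend on is to estimate the local Lipschitz constant of the learned predictor empirically (a Monte Carlo sketch under assumed names and toy inputs; the cited bounds are derived analytically, not via sampling):

```python
import numpy as np

def local_lipschitz_estimate(f, x, radius=0.1, n_samples=256, seed=0):
    """Crude Monte Carlo lower estimate of the Lipschitz constant of f
    on a neighborhood of x: the largest difference quotient
    ||f(u) - f(v)|| / ||u - v|| over sampled perturbation pairs.
    """
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_samples):
        u = x + radius * rng.standard_normal(x.shape)
        v = x + radius * rng.standard_normal(x.shape)
        dist = np.linalg.norm(u - v)
        if dist > 1e-12:
            best = max(best, np.linalg.norm(f(u) - f(v)) / dist)
    return best

# Toy predictor f(z) = sum(sin(z)); its true local Lipschitz constant
# around the origin is sqrt(dim), here sqrt(5) ~ 2.24.
f = lambda z: np.atleast_1d(np.sin(z).sum())
print(local_lipschitz_estimate(f, np.zeros(5)))
```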
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- On the Importance of Gradient Norm in PAC-Bayesian Bounds [92.82627080794491]
We propose a new generalization bound that exploits the contractivity of log-Sobolev inequalities.
We empirically analyze the effect of this new loss-gradient norm term on different neural architectures.
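For context, a classical PAC-Bayesian bound has the following shape (one standard McAllester-style form for losses in [0, 1]; the cited paper's bound instead features a loss-gradient-norm term obtained from log-Sobolev inequalities):

```latex
% With probability at least 1 - delta over an i.i.d. sample of size n,
% simultaneously for all posteriors Q over hypotheses, given a prior P:
\[
  \mathbb{E}_{h\sim Q}\bigl[R(h)\bigr]
  \;\le\; \mathbb{E}_{h\sim Q}\bigl[\widehat{R}_n(h)\bigr]
  + \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln(n/\delta)}{2(n-1)}}.
\]
```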
arXiv Detail & Related papers (2022-10-12T12:49:20Z)
- Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions [19.39005034948997]
We propose a new estimation method based on generalized empirical likelihood (GEL).
GEL provides a more general framework and has been shown to enjoy favorable small-sample properties compared to GMM-based estimators.
We provide kernel- and neural network-based implementations of the estimator, which achieve state-of-the-art empirical performance on two conditional moment restriction problems.
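For reference, the generic (unconditional) GEL program reads as follows (one standard formulation; the paper's functional version generalizes it to conditional moment restrictions):

```latex
\[
  \hat{\theta}_{\mathrm{GEL}}
  = \arg\min_{\theta}\;\sup_{\lambda}\;
    \frac{1}{n}\sum_{i=1}^{n}\rho\bigl(\lambda^{\top} g(X_i;\theta)\bigr),
\]
```

where g encodes the moment restrictions E[g(X; θ*)] = 0 and ρ is a concave carrier function; ρ(v) = log(1 - v) recovers empirical likelihood and ρ(v) = -exp(v) exponential tilting.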
arXiv Detail & Related papers (2022-07-11T11:02:52Z)
- The Causal Marginal Polytope for Bounding Treatment Effects [9.196779204457059]
We propose a novel way to identify causal effects without constructing a global causal model: we enforce compatibility between the marginals of a causal model and the data.
We call this collection of locally consistent marginals the causal marginal polytope.
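To make the geometry concrete, here is a toy optimization over a polytope of distributions constrained only to be consistent with observed marginals (a Fréchet-style bound with made-up numbers; the paper's causal marginal polytope is a far richer construction):

```python
import numpy as np
from scipy.optimize import linprog

# Variables: a joint q(t, y) over binary T, Y,
# flattened as [q00, q01, q10, q11].
p_t1 = 0.4  # observed P(T = 1)  (made-up)
p_y1 = 0.7  # observed P(Y = 1)  (made-up)

A_eq = np.array([
    [1, 1, 1, 1],   # probabilities sum to 1
    [0, 0, 1, 1],   # marginal P(T = 1)
    [0, 1, 0, 1],   # marginal P(Y = 1)
])
b_eq = np.array([1.0, p_t1, p_y1])

# Bound the unobserved joint cell q(T=1, Y=1) over the polytope of
# all joints consistent with the two marginals.
c = np.array([0, 0, 0, 1.0])
lo = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 4).fun
hi = -linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 4).fun
print(f"P(T=1, Y=1) in [{lo:.2f}, {hi:.2f}]")  # [0.10, 0.40]
```

The same pattern, linear programming over locally consistent marginals, is what makes optimizing over such a polytope tractable without ever writing down a global model.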
arXiv Detail & Related papers (2022-02-28T15:08:22Z)
- Can convolutional ResNets approximately preserve input distances? A frequency analysis perspective [31.897568775099558]
We show that the theoretical link between the regularisation scheme used and bi-Lipschitzness is only valid under conditions which do not hold in practice.
We present a simple constructive algorithm to search for counterexamples to the distance-preservation condition.
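In the same spirit, a minimal random-search version of such a counterexample hunt might look as follows (a sketch with assumed names; the paper's algorithm is constructive and frequency-based rather than random):

```python
import numpy as np

def search_bilipschitz_violation(f, dim, lo, hi, n_trials=10_000, seed=0):
    """Random search for an input pair violating the bi-Lipschitz
    condition lo * ||x - y|| <= ||f(x) - f(y)|| <= hi * ||x - y||.
    Returns a violating (x, y, ratio) triple, or None if none is found.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_trials):
        x, y = rng.standard_normal((2, dim))
        ratio = np.linalg.norm(f(x) - f(y)) / np.linalg.norm(x - y)
        if ratio < lo or ratio > hi:
            return x, y, ratio
    return None

# Example: a pure contraction immediately violates the lower bound 0.9.
hit = search_bilipschitz_violation(lambda z: 0.5 * z, dim=8, lo=0.9, hi=1.1)
print(hit is not None)  # True
```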
arXiv Detail & Related papers (2021-06-04T13:12:42Z)
- Constrained Learning with Non-Convex Losses [119.8736858597118]
Though learning has become a core technology of modern information processing, there is now ample evidence that it can lead to biased, unsafe, and prejudiced solutions.
arXiv Detail & Related papers (2021-03-08T23:10:33Z)
- Fundamental Limits and Tradeoffs in Invariant Representation Learning [99.2368462915979]
Many machine learning applications involve learning representations that achieve two competing goals.
A minimax game-theoretic formulation exposes a fundamental tradeoff between accuracy and invariance.
We provide an information-theoretic analysis of this general and important problem under both classification and regression settings.
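The formulation in question, in one common form (an assumption, since the summary does not spell it out): an encoder g and predictor h minimize task risk while an adversary f_A tries to recover the nuisance attribute A from the representation,

```latex
\[
  \min_{g,\,h}\;\max_{f_A}\;
  \mathbb{E}\bigl[\ell\bigl(h(g(X)),\,Y\bigr)\bigr]
  \;-\;\lambda\,\mathbb{E}\bigl[\ell_A\bigl(f_A(g(X)),\,A\bigr)\bigr],
\]
```

so larger λ buys more invariance at the cost of accuracy, which is the tradeoff the information-theoretic analysis quantifies.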
arXiv Detail & Related papers (2020-12-19T15:24:04Z)
- The Variational Method of Moments [65.91730154730905]
The conditional moment problem is a powerful formulation for describing structural causal parameters in terms of observables.
Motivated by a variational minimax reformulation of optimally weighted GMM (OWGMM), we define a very general class of estimators for the conditional moment problem.
We provide algorithms for valid statistical inference based on the same kind of variational reformulations.
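Schematically, the formulation rests on the following equivalence (a standard reduction; the paper's precise OWGMM-motivated criterion is omitted):

```latex
\[
  \mathbb{E}\bigl[g(Z;\theta^{*}) \mid X\bigr] = 0 \ \text{a.s.}
  \iff
  \mathbb{E}\bigl[f(X)^{\top} g(Z;\theta^{*})\bigr] = 0
  \quad \text{for all test functions } f,
\]
```

which motivates estimators that minimize over θ a supremum of an empirical criterion over a class of test functions f.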
arXiv Detail & Related papers (2020-12-17T07:21:06Z)
- Relative Deviation Margin Bounds [55.22251993239944]
We give two types of learning bounds, both distribution-dependent and valid for general families, in terms of the Rademacher complexity.
We derive distribution-dependent generalization bounds for unbounded loss functions under the assumption of a finite moment.
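For orientation, one classical bounded-loss margin bound in terms of Rademacher complexity reads as follows (the cited paper's relative-deviation versions sharpen this and extend it to unbounded losses under finite-moment assumptions):

```latex
% With probability at least 1 - delta, for every h in H, where
% \widehat{R}_{n,\rho} is the empirical margin loss at margin rho > 0:
\[
  R(h) \;\le\; \widehat{R}_{n,\rho}(h)
  + \frac{2}{\rho}\,\mathfrak{R}_{n}(\mathcal{H})
  + \sqrt{\frac{\ln(1/\delta)}{2n}}.
\]
```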
arXiv Detail & Related papers (2020-06-26T12:37:17Z)