Returning The Favour: When Regression Benefits From Probabilistic Causal
Knowledge
- URL: http://arxiv.org/abs/2301.11214v2
- Date: Wed, 21 Jun 2023 09:56:21 GMT
- Title: Returning The Favour: When Regression Benefits From Probabilistic Causal
Knowledge
- Authors: Shahine Bouabid, Jake Fawkes, Dino Sejdinovic
- Score: 9.106412307976067
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A directed acyclic graph (DAG) provides valuable prior knowledge that is
often discarded in regression tasks in machine learning. We show that the
independences arising from the presence of collider structures in DAGs provide
meaningful inductive biases, which constrain the regression hypothesis space
and improve predictive performance. We introduce collider regression, a
framework to incorporate probabilistic causal knowledge from a collider in a
regression problem. When the hypothesis space is a reproducing kernel Hilbert
space, we prove a strictly positive generalisation benefit under mild
assumptions and provide closed-form estimators of the empirical risk minimiser.
Experiments on synthetic and climate model data demonstrate performance gains
of the proposed methodology.
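The kind of independence a collider induces can be checked numerically. The sketch below is only an illustration, not the paper's collider regression estimator. It assumes a hypothetical collider X1 -> X2 <- Y with a linear-Gaussian mechanism; the variable names, noise scale, and slice width are all assumptions chosen for the demonstration. It shows that X1 and Y are uncorrelated marginally but become strongly dependent once the collider X2 is (approximately) conditioned on.

```python
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def simulate_collider(n=20000, seed=0):
    """Draw samples from an assumed collider X1 -> X2 <- Y."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]  # independent cause
    y = [rng.gauss(0, 1) for _ in range(n)]   # independent cause
    # The collider depends on both causes plus noise (hypothetical mechanism).
    x2 = [a + b + 0.5 * rng.gauss(0, 1) for a, b in zip(x1, y)]
    return x1, x2, y


x1, x2, y = simulate_collider()

# Marginally, X1 and Y are independent, so their correlation is near zero.
marginal = corr(x1, y)

# Approximate conditioning on the collider by keeping a narrow slice of X2.
kept = [(a, b) for a, c, b in zip(x1, x2, y) if abs(c) < 0.25]
conditional = corr([a for a, _ in kept], [b for _, b in kept])

print(f"corr(X1, Y)             = {marginal:.3f}")
print(f"corr(X1, Y | X2 near 0) = {conditional:.3f}")
```

The marginal independence of X1 and Y is the sort of structural knowledge that, according to the abstract, can constrain the regression hypothesis space; how the paper encodes that constraint in an RKHS is not reproduced here.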
Related papers
- Causal Lifting of Neural Representations: Zero-Shot Generalization for Causal Inferences [56.23412698865433]
We focus on causal inferences on a target experiment with unlabeled factual outcomes, retrieved by a predictive model fine-tuned on a labeled similar experiment.
First, we show that factual outcome estimation via Empirical Risk Minimization (ERM) may fail to yield valid causal inferences on the target population.
We propose Deconfounded Empirical Risk Minimization (DERM), a new simple learning procedure minimizing the risk over a fictitious target population.
arXiv Detail & Related papers (2025-02-10T10:52:17Z)
- RieszBoost: Gradient Boosting for Riesz Regression [49.737777802061984]
We propose a novel gradient boosting algorithm to directly estimate the Riesz representer without requiring its explicit analytical form.
We show that our algorithm performs on par with or better than indirect estimation techniques across a range of functionals.
arXiv Detail & Related papers (2025-01-08T23:04:32Z)
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide sharp asymptotics for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We demonstrate that in this setting, the generalized cross validation estimator (GCV) fails to correctly predict the out-of-sample risk.
We further extend our analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting.
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z)
- Errors-in-variables Fréchet Regression with Low-rank Covariate Approximation [2.1756081703276]
Fréchet regression has emerged as a promising approach for regression analysis involving non-Euclidean response variables.
Our proposed framework combines the concepts of global Fréchet regression and principal component regression, aiming to improve the efficiency and accuracy of the regression estimator.
arXiv Detail & Related papers (2023-05-16T08:37:54Z)
- Causal Graph Discovery from Self and Mutually Exciting Time Series [10.410454851418548]
We develop a non-asymptotic recovery guarantee and quantifiable uncertainty by solving a linear program.
We demonstrate the effectiveness of our approach in recovering highly interpretable causal DAGs over Sepsis Associated Derangements (SADs).
arXiv Detail & Related papers (2023-01-26T16:15:27Z)
- Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is.
Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions.
How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z)
- New Insights into Learning with Correntropy Based Regression [3.066157114715031]
We show that correntropy based regression regresses robustly towards the conditional mode function or the conditional mean function under certain conditions.
We also present some new results when it is utilized to learn the conditional mean function.
arXiv Detail & Related papers (2020-06-19T21:14:34Z)
- Learning from Non-Random Data in Hilbert Spaces: An Optimal Recovery Perspective [12.674428374982547]
We consider the regression problem from an Optimal Recovery perspective.
We first develop a semidefinite program for calculating the worst-case error of any recovery map in finite-dimensional Hilbert spaces.
We show that Optimal Recovery provides a formula which is user-friendly from an algorithmic point-of-view.
arXiv Detail & Related papers (2020-06-05T21:49:07Z)