Related papers: Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge

URL: http://arxiv.org/abs/2301.11214v2
Date: Wed, 21 Jun 2023 09:56:21 GMT
Title: Returning The Favour: When Regression Benefits From Probabilistic Causal Knowledge
Authors: Shahine Bouabid, Jake Fawkes, Dino Sejdinovic
Abstract summary: A directed acyclic graph (DAG) provides valuable prior knowledge that is often discarded in regression tasks in machine learning. We show that the independences arising from the presence of collider structures in DAGs provide meaningful inductive biases, which constrain the regression hypothesis space and improve predictive performance.
Score: 9.106412307976067
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A directed acyclic graph (DAG) provides valuable prior knowledge that is often discarded in regression tasks in machine learning. We show that the independences arising from the presence of collider structures in DAGs provide meaningful inductive biases, which constrain the regression hypothesis space and improve predictive performance. We introduce collider regression, a framework to incorporate probabilistic causal knowledge from a collider in a regression problem. When the hypothesis space is a reproducing kernel Hilbert space, we prove a strictly positive generalisation benefit under mild assumptions and provide closed-form estimators of the empirical risk minimiser. Experiments on synthetic and climate model data demonstrate performance gains of the proposed methodology.

Related papers

Causal Lifting of Neural Representations: Zero-Shot Generalization for Causal Inferences [56.23412698865433]
We focus on causal inferences on a target experiment with unlabeled factual outcomes, retrieved by a predictive model fine-tuned on a labeled similar experiment. First, we show that factual outcome estimation via Empirical Risk Minimization (ERM) may fail to yield valid causal inferences on the target population. We propose Deconfounded Empirical Risk Minimization (DERM), a new simple learning procedure minimizing the risk over a fictitious target population.
arXiv Detail & Related papers (2025-02-10T10:52:17Z)
RieszBoost: Gradient Boosting for Riesz Regression [49.737777802061984]
We propose a novel gradient boosting algorithm to directly estimate the Riesz representer without requiring its explicit analytical form. We show that our algorithm performs on par with or better than indirect estimation techniques across a range of functionals.
arXiv Detail & Related papers (2025-01-08T23:04:32Z)
Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide training examples for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations. We further extend our analysis to the case where the test point has non-trivial correlations with the training set, setting often encountered in time series forecasting. We validate our theory across a variety of high dimensional data.
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks. The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data. Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator. This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z)
Errors-in-variables Fr\'echet Regression with Low-rank Covariate Approximation [2.1756081703276]
Fr'echet regression has emerged as a promising approach for regression analysis involving non-Euclidean response variables. Our proposed framework combines the concepts of global Fr'echet regression and principal component regression, aiming to improve the efficiency and accuracy of the regression estimator.
arXiv Detail & Related papers (2023-05-16T08:37:54Z)
Causal Graph Discovery from Self and Mutually Exciting Time Series [10.410454851418548]
We develop a non-asymptotic recovery guarantee and quantifiable uncertainty by solving a linear program. We demonstrate the effectiveness of our approach in recovering highly interpretable causal DAGs over Sepsis Associated Derangements (SADs)
arXiv Detail & Related papers (2023-01-26T16:15:27Z)
Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA [81.4991350761909]
Independent component analysis (ICA) refers to an ensemble of methods which formalize this goal and provide estimation procedure for practical application. We show that the latent variables can be recovered up to a permutation if one regularizes the latent mechanisms to be sparse.
arXiv Detail & Related papers (2021-07-21T14:22:14Z)
Causal Graph Discovery from Self and Mutually Exciting Time Series [12.802653884445132]
We develop a non-asymptotic recovery guarantee and quantifiable uncertainty by solving a linear program. We demonstrate the effectiveness of our approach in recovering highly interpretable causal DAGs over Sepsis Associated Derangements (SADs)
arXiv Detail & Related papers (2021-06-04T16:59:24Z)
Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data. We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class. For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is. Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions. How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z)
New Insights into Learning with Correntropy Based Regression [3.066157114715031]
We show that correntropy based regression regresses towards conditional mode function or the conditional mean function robustly under certain conditions. We also present some new results when it is utilized to learn the conditional mean function.
arXiv Detail & Related papers (2020-06-19T21:14:34Z)
Learning from Non-Random Data in Hilbert Spaces: An Optimal Recovery Perspective [12.674428374982547]
We consider the regression problem from an Optimal Recovery perspective. We first develop a semidefinite program for calculating the worst-case error of any recovery map in finite-dimensional Hilbert spaces. We show that Optimal Recovery provides a formula which is user-friendly from an algorithmic point-of-view.
arXiv Detail & Related papers (2020-06-05T21:49:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.