Related papers: Boosting Causal Additive Models

Boosting Causal Additive Models

URL: http://arxiv.org/abs/2401.06523v1
Date: Fri, 12 Jan 2024 11:43:11 GMT
Title: Boosting Causal Additive Models
Authors: Maximilian Kertel and Nadja Klein
Abstract summary: We present a boosting-based method to learn additive Structural Equation Models (SEMs) from observational data. We introduce a family of score functions based on arbitrary regression techniques, for which we establish necessary conditions to consistently favor the true causal ordering. To address the challenges posed by high-dimensional data sets, we adapt our approach through a component-wise gradient descent in the space of additive SEMs.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We present a boosting-based method to learn additive Structural Equation Models (SEMs) from observational data, with a focus on the theoretical aspects of determining the causal order among variables. We introduce a family of score functions based on arbitrary regression techniques, for which we establish necessary conditions to consistently favor the true causal ordering. Our analysis reveals that boosting with early stopping meets these criteria and thus offers a consistent score function for causal orderings. To address the challenges posed by high-dimensional data sets, we adapt our approach through a component-wise gradient descent in the space of additive SEMs. Our simulation study underlines our theoretical results for lower dimensions and demonstrates that our high-dimensional adaptation is competitive with state-of-the-art methods. In addition, it exhibits robustness with respect to the choice of the hyperparameters making the procedure easy to tune.

Related papers

Preference Learning for AI Alignment: a Causal Perspective [55.2480439325792]
We frame this problem in a causal paradigm, providing the rich toolbox of causality to identify persistent challenges.<n>Inheriting from the literature of causal inference, we identify key assumptions necessary for reliable generalisation.<n>We illustrate failure modes of naive reward models and demonstrate how causally-inspired approaches can improve model robustness.
arXiv Detail & Related papers (2025-06-06T10:45:42Z)
Q-function Decomposition with Intervention Semantics with Factored Action Spaces [51.01244229483353]
We consider Q-functions defined over a lower dimensional projected subspace of the original action space, and study the condition for the unbiasedness of decomposed Q-functions. This leads to a general scheme which we call action decomposed reinforcement learning that uses the projected Q-functions to approximate the Q-function in standard model-free reinforcement learning algorithms.
arXiv Detail & Related papers (2025-04-30T05:26:51Z)
Learning Structural Causal Models from Ordering: Identifiable Flow Models [19.99352354910655]
We introduce a set of flow models that can recover component-wise, invertible transformation of variables. We propose design improvements that enable simultaneous learning of all causal mechanisms. Our method achieves a significant reduction in computational time compared to existing diffusion-based techniques.
arXiv Detail & Related papers (2024-12-13T04:25:56Z)
Optimizing Sequential Recommendation Models with Scaling Laws and Approximate Entropy [104.48511402784763]
Performance Law for SR models aims to theoretically investigate and model the relationship between model performance and data quality. We propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach compared to traditional data quantity metrics.
arXiv Detail & Related papers (2024-11-30T10:56:30Z)
Smart Predict-then-Optimize Method with Dependent Data: Risk Bounds and Calibration of Autoregression [7.369846475695131]
We present an autoregressive SPO method directly targeting the optimization problem at the decision stage. We conduct experiments to demonstrate the effectiveness of the SPO+ surrogate compared to the absolute loss and the least squares loss.
arXiv Detail & Related papers (2024-11-19T17:02:04Z)
Kernel-Based Differentiable Learning of Non-Parametric Directed Acyclic Graphical Models [17.52142371968811]
Causal discovery amounts to learning a directed acyclic graph (DAG) that encodes a causal model. Recent research has sought to bypass the search by reformulating causal discovery as a continuous optimization problem.
arXiv Detail & Related papers (2024-08-20T16:09:40Z)
Statistical ranking with dynamic covariates [6.729750785106628]
We introduce an efficient alternating algorithm to compute the likelihood estimator (MLE) A comprehensive numerical study is conducted to corroborate our theoretical findings and demonstrate the application of the proposed model to real-world datasets, including horse racing and tennis competitions.
arXiv Detail & Related papers (2024-06-24T10:26:05Z)
Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion [2.8948274245812335]
We investigate the implicit regularization of matrix factorization for solving matrix completion problems. We empirically discover that the connectivity of observed data plays a crucial role in the implicit bias. Our work reveals the intricate interplay between data connectivity, training dynamics, and implicit regularization in matrix factorization models.
arXiv Detail & Related papers (2024-05-22T15:12:14Z)
Towards Learning Stochastic Population Models by Gradient Descent [0.0]
We show that simultaneous estimation of parameters and structure poses major challenges for optimization procedures. We demonstrate accurate estimation of models but find that enforcing the inference of parsimonious, interpretable models drastically increases the difficulty.
arXiv Detail & Related papers (2024-04-10T14:38:58Z)
A Fixed-Point Approach for Causal Generative Modeling [20.88890689294816]
We propose a novel formalism for describing Structural Causal Models (SCMs) as fixed-point problems on causally ordered variables. We establish the weakest known conditions for their unique recovery given the topological ordering (TO)
arXiv Detail & Related papers (2024-04-10T12:29:05Z)
Physics-Informed Diffusion Models [0.0]
We present a framework that unifies generative modeling and partial differential equation fulfillment. Our approach reduces the residual error by up to two orders of magnitude compared to previous work in a fluid flow case study.
arXiv Detail & Related papers (2024-03-21T13:52:55Z)
A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime. We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL) Our follow-up derived bounds reveal the relationship between model shifts and performance improvement. A further example demonstrates that learning models from a dynamically-varying number of explorations benefit the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test. We train a variational inference model to predict the causal structure from observational/interventional data. Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores) For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training. We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model. We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference. We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.