Related papers: C-XGBoost: A tree boosting model for causal effect estimation

URL: http://arxiv.org/abs/2404.00751v1
Date: Sun, 31 Mar 2024 17:43:37 GMT
Title: C-XGBoost: A tree boosting model for causal effect estimation
Authors: Niki Kiriakidou, Ioannis E. Livieris, Christos Diou
Abstract summary: Causal effect estimation aims at estimating the Average Treatment Effect as well as the Conditional Average Treatment Effect of a treatment to an outcome from the available data. We propose a new causal inference model, named C-XGBoost, for the prediction of potential outcomes.
Score: 8.246161706153805
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Causal effect estimation aims at estimating the Average Treatment Effect as well as the Conditional Average Treatment Effect of a treatment on an outcome from the available data. This knowledge is important in many safety-critical domains, where it often needs to be extracted from observational data. In this work, we propose a new causal inference model, named C-XGBoost, for the prediction of potential outcomes. The motivation of our approach is to exploit the superiority of tree-based models for handling tabular data together with the notable property of causal inference neural network-based models to learn representations that are useful for estimating the outcome for both the treatment and non-treatment cases. The proposed model also inherits the considerable advantages of the XGBoost model, such as efficiently handling features with missing values with minimal preprocessing effort, and it is equipped with regularization techniques to avoid overfitting/bias. Furthermore, we propose a new loss function for efficiently training the proposed causal inference model. The experimental analysis, which is based on the performance profiles of Dolan and Moré as well as on post-hoc and non-parametric statistical tests, provides strong evidence about the effectiveness of the proposed approach.
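
For readers unfamiliar with the two estimands named in the abstract: with potential outcomes Y(1) (treated) and Y(0) (untreated), the Average Treatment Effect is E[Y(1) - Y(0)] and the Conditional Average Treatment Effect at covariates x is E[Y(1) - Y(0) | X = x]. The sketch below only illustrates how both are commonly read off once some model has predicted the two potential outcomes for each unit. It is not the C-XGBoost model or its loss function; the function name `estimate_effects` and the numbers are hypothetical and chosen purely for illustration.

```python
from statistics import fmean


def estimate_effects(y1_hat, y0_hat):
    """Turn predicted potential outcomes into effect estimates.

    y1_hat[i] is the predicted outcome of unit i under treatment,
    y0_hat[i] the predicted outcome of the same unit without treatment.
    Returns (per-unit effect estimates, their average).
    """
    if len(y1_hat) != len(y0_hat):
        raise ValueError("both prediction lists must have the same length")
    if not y1_hat:
        raise ValueError("at least one unit is required")
    # Per-unit difference: a plug-in estimate of the CATE at that unit's covariates.
    cate_hat = [treated - control for treated, control in zip(y1_hat, y0_hat)]
    # Averaging the per-unit differences gives a plug-in estimate of the ATE.
    ate_hat = fmean(cate_hat)
    return cate_hat, ate_hat


# Hypothetical predictions for four units (illustrative values only).
y1_hat = [3.0, 2.5, 4.0, 1.5]
y0_hat = [1.0, 2.0, 1.0, 1.5]
cate_hat, ate_hat = estimate_effects(y1_hat, y0_hat)
print(cate_hat, ate_hat)
```

In practice the two prediction lists would come from a fitted potential-outcome model evaluated on each unit's covariates; how well the averages estimate the true effects depends entirely on that model and on the usual identification assumptions.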

Related papers

Using LLMs to Directly Guess Conditional Expectations Can Improve Efficiency in Causal Estimation [0.3222802562733787]
We show that predictions made by generative models trained on historical data can be used to improve the performance of these estimators. We consider a case study using a small dataset of online jewelry auctions, and demonstrate that inclusion of LLM-generated guesses as predictors can improve efficiency in estimation.
arXiv Detail & Related papers (2025-10-09T03:34:06Z)
CausalPFN: Amortized Causal Effect Estimation via In-Context Learning [15.645599403885605]
CausalPFN infers causal effects for new observational datasets out-of-the-box. Our approach achieves superior average performance on heterogeneous and average treatment effect estimation benchmarks. CausalPFN provides calibrated uncertainty estimates to support reliable decision-making based on Bayesian principles.
arXiv Detail & Related papers (2025-06-09T16:31:06Z)
Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search [59.75749613951193]
We propose Data Influence-oriented Tree Search (DITS) to guide both tree search and data selection. By leveraging influence scores, we effectively identify the most impactful data for system improvement. We derive influence score estimation methods tailored for non-differentiable metrics.
arXiv Detail & Related papers (2025-02-02T23:20:16Z)
Testing and Improving the Robustness of Amortized Bayesian Inference for Cognitive Models [0.5223954072121659]
Contaminant observations and outliers often cause problems when estimating the parameters of cognitive models. In this study, we test and improve the robustness of parameter estimation using amortized Bayesian inference. The proposed method is straightforward and practical to implement and has a broad applicability in fields where outlier detection or removal is challenging.
arXiv Detail & Related papers (2024-12-29T21:22:24Z)
Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. We develop an influence functions framework to address these challenges.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
K-Fold Causal BART for CATE Estimation [0.0]
The study employs synthetic and semi-synthetic datasets, including the widely recognized Infant Health and Development Program (IHDP) benchmark dataset. Despite promising results in synthetic scenarios, the IHDP dataset reveals that the proposed model is not state-of-the-art for ATE and CATE estimation.
arXiv Detail & Related papers (2024-09-09T14:36:33Z)
Causal Rule Forest: Toward Interpretable and Precise Treatment Effect Estimation [0.0]
Causal Rule Forest (CRF) is a novel approach to learning hidden patterns from data and transforming the patterns into interpretable multi-level Boolean rules. By training other interpretable causal inference models with the data representation learned by CRF, we can reduce the predictive errors of these models in estimating Heterogeneous Treatment Effects (HTE) and Conditional Average Treatment Effects (CATE). Our experiments underscore the potential of CRF to advance personalized interventions and policies.
arXiv Detail & Related papers (2024-08-27T13:32:31Z)
Estimating Causal Effects from Learned Causal Networks [56.14597641617531]
We propose an alternative paradigm for answering causal-effect queries over discrete observable variables. We learn the causal Bayesian network and its confounding latent variables directly from the observational data. We show that this model completion learning approach can be more effective than estimand approaches.
arXiv Detail & Related papers (2024-08-26T08:39:09Z)
Causal Fine-Tuning and Effect Calibration of Non-Causal Predictive Models [1.3124513975412255]
This paper proposes techniques to enhance the performance of non-causal models for causal inference using data from randomized experiments. In domains like advertising, customer retention, and precision medicine, non-causal models that predict outcomes under no intervention are often used to score individuals and rank them according to the expected effectiveness of an intervention.
arXiv Detail & Related papers (2024-06-13T20:18:16Z)
Efficient adjustment for complex covariates: Gaining efficiency with DOPE [56.537164957672715]
We propose a framework that accommodates adjustment for any subset of information expressed by the covariates. Based on our theoretical results, we propose the Debiased Outcome-adapted Propensity Estimator (DOPE) for efficient estimation of the average treatment effect (ATE). Our results show that the DOPE provides an efficient and robust methodology for ATE estimation in various observational settings.
arXiv Detail & Related papers (2024-02-20T13:02:51Z)
A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime. We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks. The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data. Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding. We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z)
An evaluation framework for comparing causal inference models [3.1372269816123994]
We use the proposed evaluation methodology to compare several state-of-the-art causal effect estimation models. The main motivation behind this approach is the elimination of the influence of a small number of instances or simulation on the benchmarking process.
arXiv Detail & Related papers (2022-08-31T21:04:20Z)
Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models. We provide a language for describing how training data influences predictions, through a causal framework. Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
An improved neural network model for treatment effect estimation [3.1372269816123994]
We propose a new model for predicting the potential outcomes and the propensity score, which is based on a neural network architecture. Numerical experiments illustrate that the proposed model reports better treatment effect estimation performance compared to state-of-the-art models.
arXiv Detail & Related papers (2022-05-23T07:56:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.