LILI clustering algorithm: Limit Inferior Leaf Interval Integrated into Causal Forest for Causal Interference
- URL: http://arxiv.org/abs/2507.03271v1
- Date: Fri, 04 Jul 2025 03:04:00 GMT
- Title: LILI clustering algorithm: Limit Inferior Leaf Interval Integrated into Causal Forest for Causal Interference
- Authors: Yiran Dong, Di Fan, Chuanhou Gao
- Abstract summary: Causal forest methods are powerful tools in causal inference. We propose a novel approach that establishes connections between causal trees through the Limit Inferior Leaf Interval (LILI) clustering algorithm.
- Score: 1.4875602190483512
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Causal forest methods are powerful tools in causal inference. Like a traditional random forest in machine learning, a causal forest considers each causal tree independently. However, this independence increases the likelihood that classification errors in one tree are repeated in others, potentially leading to significant bias in causal effect estimation. In this paper, we propose a novel approach that establishes connections between causal trees through the Limit Inferior Leaf Interval (LILI) clustering algorithm. LILIs are constructed from the leaves of all causal trees, emphasizing the similarity of dataset confounders. When two instances with different treatments are grouped into the same leaf across a sufficient number of causal trees, they are treated as counterfactual outcomes of each other. Through this clustering mechanism, LILI clustering reduces the bias present in traditional causal tree methods and enhances the prediction accuracy for the average treatment effect (ATE). By integrating LILIs into a causal forest, we develop an efficient causal inference method. Moreover, we explore several key properties of LILIs by relating them to the concepts of limit inferior and limit superior in set theory. Theoretical analysis rigorously proves the convergence of the ATE estimated with LILI clustering. Empirically, extensive comparative experiments demonstrate the superior performance of LILI clustering.
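The core mechanism admits a compact sketch. The code below is a minimal illustration under stated assumptions, not the authors' implementation: ordinary scikit-learn regression trees stand in for honest causal trees, and the threshold `tau` (the fraction of trees in which a treated/control pair must share a leaf) is our stand-in for the paper's "sufficient number of causal trees". In set-theoretic terms, if $L_b(x)$ denotes the leaf containing $x$ in tree $b$, pairs that co-occur in all but finitely many trees lie in the limit inferior $\liminf_b L_b = \bigcup_{m \ge 1} \bigcap_{b \ge m} L_b$, which the finite-sample threshold approximates.

```python
# Minimal sketch of LILI-style leaf co-membership matching (illustrative only).
# Assumptions: plain regression trees replace honest causal trees, and the
# names below (tau, leaves, partners) are ours, not the paper's.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic observational data: confounder X[:, 0] drives both treatment and outcome.
n, true_ate = 2000, 2.0
X = rng.normal(size=(n, 5))
W = (X[:, 0] + rng.normal(size=n) > 0).astype(int)      # confounded treatment
Y = X[:, 0] + true_ate * W + rng.normal(size=n)         # outcome

# Shallow forest so leaves are coarse, stable confounder intervals.
forest = RandomForestRegressor(
    n_estimators=200, max_depth=4, min_samples_leaf=50, random_state=0
).fit(X, Y)
leaves = forest.apply(X)            # (n_samples, n_trees) leaf index per tree

treated = np.flatnonzero(W == 1)
control = np.flatnonzero(W == 0)
B = leaves.shape[1]
tau = 0.6                           # fraction of trees required for co-membership

# A control unit serves as a counterfactual proxy for a treated unit if they
# share a leaf in at least tau * B trees -- a finite-B surrogate for membership
# in a common limit inferior leaf interval.
effects = []
for i in treated:
    co_counts = (leaves[control] == leaves[i]).sum(axis=1)
    partners = control[co_counts >= tau * B]
    if partners.size:
        effects.append(Y[i] - Y[partners].mean())

print(f"LILI-style ATE estimate: {np.mean(effects):.3f} (ground truth {true_ate})")
```

Matching inside shared leaves adjusts for confounding because tree splits concentrate on covariates that drive the outcome, so a matched control's outcome acts as a proxy for the treated unit's counterfactual; averaging the matched differences yields the ATE estimate.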
Related papers
- Causal Discovery and Classification Using Lempel-Ziv Complexity [2.7309692684728617]
We introduce a novel causality measure and a distance metric derived from Lempel-Ziv complexity.
We evaluate the effectiveness of the causality-based decision tree and the distance-based decision tree.
arXiv Detail & Related papers (2024-11-04T08:24:56Z)
- Distilling interpretable causal trees from causal forests [0.0]
A high-dimensional distribution of conditional average treatment effects may give accurate, individual-level estimates.
This paper proposes the Distilled Causal Tree, a method for distilling a single, interpretable causal tree from a causal forest.
arXiv Detail & Related papers (2024-08-02T05:48:15Z)
- Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z)
- Causal Temporal Regime Structure Learning [49.77103348208835]
We present CASTOR, a novel method that concurrently learns the Directed Acyclic Graph (DAG) for each regime.
We establish the identifiability of the regimes and DAGs within our framework.
Experiments show that CASTOR consistently outperforms existing causal discovery models.
arXiv Detail & Related papers (2023-11-02T17:26:49Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Hierarchical Graph Neural Networks for Causal Discovery and Root Cause Localization [52.72490784720227]
REASON consists of Topological Causal Discovery and Individual Causal Discovery.
The Topological Causal Discovery component aims to model the fault propagation in order to trace back to the root causes.
The Individual Causal Discovery component focuses on capturing abrupt change patterns of a single system entity.
arXiv Detail & Related papers (2023-02-03T20:17:45Z)
- What Makes Forest-Based Heterogeneous Treatment Effect Estimators Work? [1.1050303097572156]
We show that both methods can be understood in terms of the same parameters and confounding assumptions under L2 loss.
In the randomized setting, both approaches performed comparably to the new blended versions in a benchmark study.
arXiv Detail & Related papers (2022-06-21T12:45:07Z)
- Active Bayesian Causal Inference [72.70593653185078]
We propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning.
ABCI jointly infers a posterior over causal models and queries of interest.
We show that our approach is more data-efficient than several baselines that only focus on learning the full causal graph.
arXiv Detail & Related papers (2022-06-04T22:38:57Z)
- Lassoed Tree Boosting [53.56229983630983]
We prove that a gradient boosted tree algorithm with early stopping achieves faster-than-$n^{-1/4}$ $L^2$ convergence in the large nonparametric space of càdlàg functions of bounded sectional variation.
Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.
arXiv Detail & Related papers (2022-05-22T00:34:41Z)
- Modelling heterogeneous treatment effects by quantile local polynomial decision tree and forest [0.0]
This paper builds on Breiman's 2001 random forest tree (RFT) and Wager et al.'s causal tree to parameterize the nonparametric problem.
We propose a decision tree that uses quantile classification according to fixed rules, combined with classical estimation on local samples, which we call the quantile local linear causal tree (QLPRT) and forest (QLPRF).
arXiv Detail & Related papers (2021-11-30T12:02:16Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We point to an apparently unnoticed limitation of the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z)