Related papers: Learning Subgroups with Maximum Treatment Effects without Causal Heuristics

URL: http://arxiv.org/abs/2511.20189v1
Date: Tue, 25 Nov 2025 11:13:05 GMT
Title: Learning Subgroups with Maximum Treatment Effects without Causal Heuristics
Authors: Lincen Yang, Zhong Li, Matthijs van Leeuwen, Saber Salehkaleybar,
Abstract summary: We show that optimal subgroup discovery reduces to recovering the data-generating models and hence a standard supervised learning problem. We instantiate the approach with CART, arguably one of the most widely used tree-based methods, to learn the subgroup with maximum treatment effect.
Score: 16.087398572596587
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Discovering subgroups with the maximum average treatment effect is crucial for targeted decision making in domains such as precision medicine, public policy, and education. While most prior work is formulated in the potential outcome framework, the corresponding structural causal model (SCM) for this task has been largely overlooked. In practice, two approaches dominate. The first estimates pointwise conditional treatment effects and then fits a tree on those estimates, effectively turning subgroup estimation into the harder problem of accurate pointwise estimation. The second constructs decision trees or rule sets with ad-hoc 'causal' heuristics, typically without rigorous justification for why a given heuristic may be used or whether such heuristics are necessary at all. We address these issues by studying the problem directly under the SCM framework. Under the assumption of a partition-based model, we show that optimal subgroup discovery reduces to recovering the data-generating models and hence a standard supervised learning problem (regression or classification). This allows us to adopt any partition-based methods to learn the subgroup from data. We instantiate the approach with CART, arguably one of the most widely used tree-based methods, to learn the subgroup with maximum treatment effect. Finally, on a large collection of synthetic and semi-synthetic datasets, we compare our method against a wide range of baselines and find that our approach, which avoids such causal heuristics, more accurately identifies subgroups with maximum treatment effect. Our source code is available at https://github.com/ylincen/causal-subgroup.
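
As a rough illustration of the task described in the abstract, the sketch below picks the cell of a given partition with the largest estimated average treatment effect. It is not the authors' method (see their repository for that): it assumes the partition is already available (for example, leaf ids from any tree learner), assumes treatment was randomly assigned so that a difference in means is a reasonable effect estimate, and uses a hypothetical helper name, max_effect_cell, and made-up toy data.

```python
from collections import defaultdict
from statistics import mean


def max_effect_cell(cells, treated, outcomes):
    """Return (cell_id, effect) for the cell with the largest difference
    in mean outcome between treated and control units.

    Hypothetical helper for illustration only. Raises ValueError if no
    cell contains both treated and control units.
    """
    # cell id -> (control outcomes, treated outcomes)
    groups = defaultdict(lambda: ([], []))
    for cell, is_treated, y in zip(cells, treated, outcomes):
        groups[cell][1 if is_treated else 0].append(y)

    effects = {}
    for cell, (control, treat) in groups.items():
        if control and treat:  # skip cells missing one of the two arms
            effects[cell] = mean(treat) - mean(control)

    best = max(effects, key=effects.get)
    return best, effects[best]


# Toy data (made up): two cells, four units each.
cells = ["a", "a", "a", "a", "b", "b", "b", "b"]
treated = [1, 1, 0, 0, 1, 1, 0, 0]
outcomes = [3.0, 5.0, 1.0, 1.0, 2.0, 2.0, 1.0, 3.0]

best_cell, best_effect = max_effect_cell(cells, treated, outcomes)
print(best_cell, best_effect)
```

In this toy data, cell "a" has treated mean 4.0 and control mean 1.0, while cell "b" has equal means in both arms, so "a" is selected. With observational data the plain difference in means would generally need adjustment for confounding, which this sketch does not attempt.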

Related papers

Efficient Subgroup Analysis via Optimal Trees with Global Parameter Fusion [4.874780144224057]
Subgroup analysis allows practitioners to pinpoint populations for whom a treatment is especially beneficial or protective. We propose a fused optimal causal tree method that leverages mixed integer optimization (MIO) to facilitate precise subgroup identification. We provide theoretical guarantees by rigorously establishing out of sample risk bounds and comparing them with those of classical tree based methods.
arXiv Detail & Related papers (2026-02-03T23:26:19Z)
Subgroup Discovery with the Cox Model [3.6443246757008723]
We study the problem of subgroup discovery for survival analysis. The goal is to find an interpretable subset of the data on which a Cox model is highly accurate. We introduce a total of eight algorithms for the Cox subgroup discovery problem.
arXiv Detail & Related papers (2025-12-23T20:49:05Z)
Causal Clustering for Conditional Average Treatment Effects Estimation and Subgroup Discovery [5.669361767058639]
Estimating heterogeneous treatment effects is critical in domains such as personalized medicine, resource allocation, and policy evaluation. We propose a novel framework that clusters individuals based on estimated treatment effects using a learned kernel derived from causal forests.
arXiv Detail & Related papers (2025-09-06T17:01:23Z)
M-learner: A Flexible And Powerful Framework To Study Heterogeneous Treatment Effect In Mediation Model [11.977166290154125]
We propose a novel method, termed the M-learner, for estimating heterogeneous indirect and total treatment effects. To the best of our knowledge, this is the first approach specifically designed to capture treatment effect heterogeneity in the presence of mediation.
arXiv Detail & Related papers (2025-05-23T13:57:23Z)
Learning Deep Tree-based Retriever for Efficient Recommendation: Theory and Method [76.31185707649227]
We propose a Deep Tree-based Retriever (DTR) for efficient recommendation. DTR frames the training task as a softmax-based multi-class classification over tree nodes at the same level. To mitigate the suboptimality induced by the labeling of non-leaf nodes, we propose a rectification method for the loss function.
arXiv Detail & Related papers (2024-08-21T05:09:53Z)
Causal K-Means Clustering [5.087519744951637]
Causal k-Means Clustering harnesses the widely-used k-means clustering algorithm to uncover the unknown subgroup structure. We present a plug-in estimator which is simple and readily implementable using off-the-shelf algorithms. Our proposed methods are especially useful for modern outcome-wide studies with multiple treatment levels.
arXiv Detail & Related papers (2024-05-05T23:59:51Z)
B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding. We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z)
Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics. We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data. Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues. We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders. We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes. We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution. Our method, termed as Forest R-CNN, can serve as a plug-and-play module being applied to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
Robust Recursive Partitioning for Heterogeneous Treatment Effects with Uncertainty Quantification [84.53697297858146]
Subgroup analysis of treatment effects plays an important role in applications from medicine to public policy to recommender systems. Most of the current methods of subgroup analysis begin with a particular algorithm for estimating individualized treatment effects (ITE). This paper develops a new method for subgroup analysis, R2P, that addresses all these weaknesses.
arXiv Detail & Related papers (2020-06-14T14:50:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.