Efficient Subgroup Analysis via Optimal Trees with Global Parameter Fusion
- URL: http://arxiv.org/abs/2602.04077v1
- Date: Tue, 03 Feb 2026 23:26:19 GMT
- Title: Efficient Subgroup Analysis via Optimal Trees with Global Parameter Fusion
- Authors: Zhongming Xie, Joseph Giorgio, Jingshen Wang
- Abstract summary: Subgroup analysis allows practitioners to pinpoint populations for whom a treatment is especially beneficial or protective. We propose a fused optimal causal tree method that leverages mixed-integer optimization (MIO) to facilitate precise subgroup identification. We provide theoretical guarantees by rigorously establishing out-of-sample risk bounds and comparing them with those of classical tree-based methods.
- Score: 4.874780144224057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Identifying and making statistical inferences on differential treatment effects (commonly known as subgroup analysis in clinical research) is central to precision health. Subgroup analysis allows practitioners to pinpoint populations for whom a treatment is especially beneficial or protective, thereby advancing targeted interventions. Tree-based recursive partitioning methods are widely used for subgroup analysis due to their interpretability. Nevertheless, these approaches encounter significant limitations, including suboptimal partitions induced by greedy heuristics and overfitting from locally estimated splits, especially under limited sample sizes. To address these limitations, we propose a fused optimal causal tree method that leverages mixed-integer optimization (MIO) to facilitate precise subgroup identification. Our approach ensures globally optimal partitions and introduces a parameter fusion constraint to facilitate information sharing across related subgroups. This design substantially improves subgroup discovery accuracy and enhances statistical efficiency. We provide theoretical guarantees by rigorously establishing out-of-sample risk bounds and comparing them with those of classical tree-based methods. Empirically, our method consistently outperforms popular baselines in simulations. Finally, we demonstrate its practical utility through a case study on the Health and Aging Brain Study Health Disparities (HABS-HD) dataset, where our approach yields clinically meaningful insights.
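
The abstract's two ideas, a globally optimal partition and a fusion constraint that lets subgroups share a parameter, can be illustrated on a toy problem. The sketch below is not the paper's formulation: it replaces the MIO solver with brute-force enumeration over a small threshold grid, assumes a randomized treatment with known propensity 0.5 (so a transformed outcome has conditional mean equal to the treatment effect), and models "fusion" as the assumption that three leaves may carry at most two distinct effect values. All function names (`transformed_outcome`, `fused_sse`, `fit_fused_tree`) and the synthetic data are hypothetical.

```python
import itertools
import numpy as np

def transformed_outcome(y, t, p=0.5):
    # Assumption: randomized treatment with known propensity p, so that
    # E[y_star | x] equals the conditional average treatment effect.
    return y * (t - p) / (p * (1 - p))

def fused_sse(y_star, leaf_ids, groups):
    # Each group is a tuple of leaf indices forced to share one effect value.
    sse, effects = 0.0, {}
    for group in groups:
        mask = np.isin(leaf_ids, group)
        theta = y_star[mask].mean()
        sse += float(((y_star[mask] - theta) ** 2).sum())
        for leaf in group:
            effects[leaf] = float(theta)
    return sse, effects

# Illustrative fusion constraint: three leaves, at most two distinct effects.
GROUPINGS = [((0, 1), (2,)), ((0, 2), (1,)), ((0,), (1, 2))]

def fit_fused_tree(x, y, t, candidates, min_leaf=5):
    # Joint exhaustive search over both thresholds and the fusion pattern.
    # This stands in for a mixed-integer solver on a problem small enough
    # to enumerate; it is not a greedy, split-by-split procedure.
    y_star = transformed_outcome(y, t)
    best = None
    for c1, c2 in itertools.combinations(sorted(candidates), 2):
        # Leaf 0: x < c1, leaf 1: c1 <= x < c2, leaf 2: x >= c2.
        leaf_ids = np.digitize(x, [c1, c2])
        if np.bincount(leaf_ids, minlength=3).min() < min_leaf:
            continue
        for groups in GROUPINGS:
            sse, effects = fused_sse(y_star, leaf_ids, groups)
            if best is None or sse < best[0]:
                best = (sse, (c1, c2), effects)
    return best

# Synthetic example: the effect is 2 on [0.3, 0.7) and 0 elsewhere, so the
# two outer subgroups are "related" and can share a single parameter.
x = np.arange(200) / 200
t = np.arange(200) % 2
tau = np.where((x >= 0.3) & (x < 0.7), 2.0, 0.0)
y = tau * t
sse, thresholds, effects = fit_fused_tree(x, y, t, [0.1, 0.3, 0.5, 0.7, 0.9])
print(thresholds, effects)
```

In this toy setting the fused pattern pools the two outer leaves, which are not adjacent in the tree, into one estimate. That is the kind of information sharing the abstract attributes to the fusion constraint, although the paper's actual constraint and solver may differ.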
Related papers
- Learning Subgroups with Maximum Treatment Effects without Causal Heuristics [16.087398572596587]
We show that optimal subgroup discovery reduces to recovering the data-generating models and hence a standard supervised learning problem. We instantiate the approach with CART, arguably one of the most widely used tree-based methods, to learn the subgroup with maximum treatment effect.
arXiv Detail & Related papers (2025-11-25T11:13:05Z) - Causal Clustering for Conditional Average Treatment Effects Estimation and Subgroup Discovery [5.669361767058639]
Estimating heterogeneous treatment effects is critical in domains such as personalized medicine, resource allocation, and policy evaluation. We propose a novel framework that clusters individuals based on estimated treatment effects using a learned kernel derived from causal forests.
arXiv Detail & Related papers (2025-09-06T17:01:23Z) - MOSIC: Model-Agnostic Optimal Subgroup Identification with Multi-Constraint for Improved Reliability [11.997050225896679]
We propose a unified optimization framework that directly solves the primal constrained optimization problem to identify optimal subgroups. Our key innovation is a reformulation of the constrained primal problem as an unconstrained differentiable min-max objective, solved via a gradient descent-ascent algorithm. The framework is model-agnostic, compatible with a wide range of CATE estimators, and extensible to additional constraints like cost limits or fairness criteria.
arXiv Detail & Related papers (2025-04-29T16:25:23Z) - WHOMP: Optimizing Randomized Controlled Trials via Wasserstein Homogeneity [3.05179671246628]
We introduce a novel partitioning method called the $\textit{Wasserstein Homogeneity Partition}$ (WHOMP).
WHOMP optimally minimizes type I and type II errors that often result from imbalanced group splitting or partitioning.
arXiv Detail & Related papers (2024-09-27T07:38:47Z) - Rethinking Clustered Federated Learning in NOMA Enhanced Wireless Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to address the challenges posed by non-IID conditions are proposed with the analysis of the properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z) - Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with a Hessian matrix evaluated on a deep learning model.
Our method captures intricate patterns and relationships, enhancing classification performance.
arXiv Detail & Related papers (2024-02-14T16:10:42Z) - A structured regression approach for evaluating model performance across intersectional subgroups [53.91682617836498]
Disaggregated evaluation is a central task in AI fairness assessment, where the goal is to measure an AI system's performance across different subgroups.
We introduce a structured regression approach to disaggregated evaluation that we demonstrate can yield reliable system performance estimates even for very small subgroups.
arXiv Detail & Related papers (2024-01-26T14:21:45Z) - Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions [78.31410227443102]
We study the problem of adaptively identifying patient subpopulations that benefit from a given treatment during a confirmatory clinical trial.
We propose AdaGGI and AdaGCPI, two meta-algorithms for subpopulation construction.
arXiv Detail & Related papers (2022-08-11T14:27:49Z) - CAPITAL: Optimal Subgroup Identification via Constrained Policy Tree Search [10.961093227672398]
A clinically meaningful subgroup learning approach should identify the maximum number of patients who can benefit from the better treatment.
We present an optimal subgroup selection rule (SSR) that maximizes the number of selected patients.
We propose a ConstrAined PolIcy Tree seArch aLgorithm to find the optimal SSR within the interpretable decision tree class.
arXiv Detail & Related papers (2021-10-11T22:41:07Z) - Harnessing Heterogeneity: Learning from Decomposed Feedback in Bayesian Modeling [68.69431580852535]
We introduce a novel GP regression to incorporate the subgroup feedback.
Our modified regression has provably lower variance -- and thus a more accurate posterior -- compared to previous approaches.
We execute our algorithm on two disparate social problems.
arXiv Detail & Related papers (2021-07-07T03:57:22Z) - Structured Sparsity Inducing Adaptive Optimizers for Deep Learning [94.23102887731417]
In this paper, we derive the weighted proximal operator, which is a necessary component of proximal gradient methods.
We show that this adaptive method, together with the weighted proximal operators derived here, is indeed capable of finding solutions with structure in their sparsity patterns.
arXiv Detail & Related papers (2021-02-07T18:06:23Z) - Robust Recursive Partitioning for Heterogeneous Treatment Effects with Uncertainty Quantification [84.53697297858146]
Subgroup analysis of treatment effects plays an important role in applications from medicine to public policy to recommender systems.
Most current methods of subgroup analysis begin with a particular algorithm for estimating individualized treatment effects (ITE).
This paper develops a new method for subgroup analysis, R2P, that addresses all these weaknesses.
arXiv Detail & Related papers (2020-06-14T14:50:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.