MOSIC: Model-Agnostic Optimal Subgroup Identification with Multi-Constraint for Improved Reliability
- URL: http://arxiv.org/abs/2504.20908v1
- Date: Tue, 29 Apr 2025 16:25:23 GMT
- Title: MOSIC: Model-Agnostic Optimal Subgroup Identification with Multi-Constraint for Improved Reliability
- Authors: Wenxin Chen, Weishen Pan, Kyra Gan, Fei Wang
- Abstract summary: We propose a model-agnostic framework for optimal subgroup identification under multiple constraints. Our method is stable and highly flexible, supporting various models and techniques for estimating and optimizing treatment effectiveness with observational data.
- Score: 11.997050225896679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Identifying subgroups that benefit from specific treatments using observational data is a critical challenge in personalized medicine. Most existing approaches solely focus on identifying a subgroup with an improved treatment effect. However, practical considerations, such as ensuring a minimum subgroup size for representativeness or achieving sufficient confounder balance for reliability, are also important for making findings clinically meaningful and actionable. While some studies address these constraints individually, none offer a unified approach to handle them simultaneously. To bridge this gap, we propose a model-agnostic framework for optimal subgroup identification under multiple constraints. We reformulate this combinatorial problem as an unconstrained min-max optimization problem with novel modifications and solve it by a gradient descent ascent algorithm. We further prove its convergence to a feasible and locally optimal solution. Our method is stable and highly flexible, supporting various models and techniques for estimating and optimizing treatment effectiveness with observational data. Extensive experiments on both synthetic and real-world datasets demonstrate its effectiveness in identifying subgroups that satisfy multiple constraints, achieving higher treatment effects and better confounder balancing results across different group sizes.
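The abstract does not state the MOSIC objective or its constraint functions, so the following is only a generic sketch of gradient descent ascent on a toy min-max (Lagrangian) problem, not the authors' algorithm. The objective `(x - 3)**2`, the constraint `x <= 1`, the step size, and the function name are all assumptions chosen for illustration.

```python
def gradient_descent_ascent(steps=2000, lr=0.05):
    """Toy GDA on L(x, lam) = (x - 3)**2 + lam * (x - 1) with lam >= 0.

    This is the Lagrangian of: minimize (x - 3)**2 subject to x <= 1.
    The problem is hypothetical and stands in for a constrained objective.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        grad_x = 2.0 * (x - 3.0) + lam        # dL/dx
        grad_lam = x - 1.0                    # dL/dlam (constraint violation)
        x = x - lr * grad_x                   # descent step on the primal variable
        lam = max(0.0, lam + lr * grad_lam)   # projected ascent step on the multiplier
    return x, lam


x_star, lam_star = gradient_descent_ascent()
print(x_star, lam_star)
```

In this toy case the iterates approach the constrained optimum x = 1 with multiplier 4: the multiplier grows while the constraint is violated and pushes the primal variable back toward the feasible region.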
Related papers
- A Comparison-Relationship-Surrogate Evolutionary Algorithm for Multi-Objective Optimization [0.0]
We propose a new evolutionary algorithm "CRSEA" which uses the comparison-relationship model. We find that CRSEA finds better converged solutions than the tested SAEAs on many medium-scale, biobjective problems.
arXiv Detail & Related papers (2025-04-28T01:39:38Z) - AdaptiveMDL-GenClust: A Robust Clustering Framework Integrating Normalized Mutual Information and Evolutionary Algorithms [0.0]
We introduce a robust clustering framework that integrates the Minimum Description Length (MDL) principle with a genetic optimization algorithm. The framework begins with an ensemble clustering approach to generate an initial clustering solution, which is refined using MDL-guided evaluation functions and optimized through a genetic algorithm. Experimental results demonstrate that our approach consistently outperforms traditional clustering methods, yielding higher accuracy, improved stability, and reduced bias.
arXiv Detail & Related papers (2024-11-26T20:26:14Z) - Federated One-Shot Ensemble Clustering [8.883940713319696]
Cluster analysis across multiple institutions poses significant challenges due to data-sharing restrictions.
We introduce the Federated One-shot Ensemble Clustering (FONT) algorithm, a novel solution tailored for multi-site analyses.
FONT requires only a single round of communication between sites and ensures privacy by exchanging only fitted model parameters and class labels.
arXiv Detail & Related papers (2024-09-12T20:55:21Z) - Counterfactual Data Augmentation with Contrastive Learning [27.28511396131235]
We introduce a model-agnostic data augmentation method that imputes the counterfactual outcomes for a selected subset of individuals.
We use contrastive learning to learn a representation space and a similarity measure such that in the learned representation space close individuals identified by the learned similarity measure have similar potential outcomes.
This property ensures reliable imputation of counterfactual outcomes for the individuals with close neighbors from the alternative treatment group.
arXiv Detail & Related papers (2023-11-07T00:36:51Z) - Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z) - Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2022-08-11T14:27:49Z) - Adaptive Identification of Populations with Treatment Benefit in Clinical Trials: Machine Learning Challenges and Solutions [78.31410227443102]
We study the problem of adaptively identifying patient subpopulations that benefit from a given treatment during a confirmatory clinical trial.
We propose AdaGGI and AdaGCPI, two meta-algorithms for subpopulation construction.
arXiv Detail & Related papers (2022-08-11T14:27:49Z) - Federated Offline Reinforcement Learning [55.326673977320574]
We propose a multi-site Markov decision process model that allows for both homogeneous and heterogeneous effects across sites.
We design the first federated policy optimization algorithm for offline RL with sample complexity.
We give a theoretical guarantee for the proposed algorithm, where the suboptimality for the learned policies is comparable to the rate as if data is not distributed.
arXiv Detail & Related papers (2021-10-11T22:41:07Z) - CAPITAL: Optimal Subgroup Identification via Constrained Policy Tree Search [10.961093227672398]
A clinically meaningful subgroup learning approach should identify the maximum number of patients who can benefit from the better treatment.
We present an optimal subgroup selection rule (SSR) that maximizes the number of selected patients.
We propose a ConstrAined PolIcy Tree seArch aLgorithm to find the optimal SSR within the interpretable decision tree class.
arXiv Detail & Related papers (2021-10-11T22:41:07Z) - A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling.
We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk.
We show that our proposed method is competitive with the state of the art in simulation settings and on real data from large-scale randomized experiments.
arXiv Detail & Related papers (2021-05-11T16:02:39Z) - Bloom Origami Assays: Practical Group Testing [90.2899558237778]
Group testing is a well-studied problem with several appealing solutions.
Recent biological studies impose practical constraints for COVID-19 that are incompatible with traditional methods.
We develop a new method combining Bloom filters with belief propagation to scale to larger values of n (more than 100) with good empirical results.
arXiv Detail & Related papers (2020-07-21T19:31:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.