Bayesian Hybrid Machine Learning of Gallstone Risk
- URL: http://arxiv.org/abs/2506.14561v1
- Date: Tue, 17 Jun 2025 14:19:02 GMT
- Title: Bayesian Hybrid Machine Learning of Gallstone Risk
- Authors: Chitradipa Chakraborty, Nayana Mukherjee
- Abstract summary: Gallstone disease is a complex, multifactorial condition with significant global health burdens. We propose a hybrid machine learning framework that integrates robust variable selection with advanced interaction detection. This proposed framework not only enhances prediction but also yields actionable insights, offering a valuable support tool for medical research and decision-making.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gallstone disease is a complex, multifactorial condition with significant global health burdens. Identifying underlying risk factors and their interactions is crucial for early diagnosis, targeted prevention, and effective clinical management. Although logistic regression remains a standard tool for assessing associations between predictors and gallstone status, it often underperforms in high-dimensional settings and may fail to capture intricate relationships among variables. To address these limitations, we propose a hybrid machine learning framework that integrates robust variable selection with advanced interaction detection. Specifically, Adaptive LASSO is employed to identify a sparse and interpretable subset of influential features, followed by Bayesian Additive Regression Trees (BART) to model nonlinear effects and uncover key interactions. Selected interactions are further characterized by physiological knowledge through differential equation-informed interaction terms, grounding the model in biologically plausible mechanisms. The insights gained from these steps are then integrated into a final logistic regression model within a Bayesian framework, providing a balance between predictive accuracy and clinical interpretability. This proposed framework not only enhances prediction but also yields actionable insights, offering a valuable support tool for medical research and decision-making.
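As a rough illustration of the first stage described in the abstract, the following minimal sketch applies an Adaptive LASSO to a logistic model on synthetic data. It is not the authors' implementation: the function names (`fit_penalized_logistic`, `adaptive_lasso_logistic`), the ridge pilot fit, the proximal-gradient solver, the tuning values (`lam`, `gamma`, `ridge`), and the simulated data are all assumptions made for illustration. The later stages (BART, differential equation-informed interaction terms, and the final Bayesian logistic regression) are not shown.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fit_penalized_logistic(X, y, l1_weights, lam, l2=0.0, n_iter=3000):
    """Proximal gradient descent for the mean logistic loss plus
    lam * sum_j l1_weights[j] * |beta_j| + (l2 / 2) * ||beta||^2.
    The intercept is left unpenalized. Returns (intercept, beta)."""
    n, p = X.shape
    Xa = np.column_stack([np.ones(n), X])  # prepend an intercept column
    # Step size from a Lipschitz bound on the gradient of the smooth part.
    step = 1.0 / (np.linalg.norm(Xa, 2) ** 2 / (4.0 * n) + l2)
    theta = np.zeros(p + 1)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(Xa @ theta)))
        grad = Xa.T @ (prob - y) / n
        grad[1:] += l2 * theta[1:]
        theta = theta - step * grad
        # Weighted L1 shrinkage on the slopes only.
        theta[1:] = soft_threshold(theta[1:], step * lam * l1_weights)
    return theta[0], theta[1:]


def adaptive_lasso_logistic(X, y, lam=0.05, gamma=1.0, ridge=0.01):
    """Two-stage Adaptive LASSO (hypothetical helper, illustrative defaults)."""
    p = X.shape[1]
    # Stage 1: ridge-penalized pilot estimates (assumed choice of pilot fit).
    _, beta_init = fit_penalized_logistic(X, y, np.zeros(p), lam=0.0, l2=ridge)
    # Stage 2: features with small pilot coefficients get a larger L1 penalty.
    weights = 1.0 / (np.abs(beta_init) ** gamma + 1e-8)
    return fit_penalized_logistic(X, y, weights, lam=lam)


# Synthetic data (an assumption): only the first two features carry signal.
rng = np.random.default_rng(0)
n, p = 500, 6
X = rng.standard_normal((n, p))
true_beta = np.array([1.5, -1.0, 0.0, 0.0, 0.0, 0.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

intercept_hat, beta_hat = adaptive_lasso_logistic(X, y)
selected = np.flatnonzero(beta_hat)
print("selected feature indices:", selected)
print("estimated coefficients:", np.round(beta_hat, 3))
```

The coefficient-specific weights are what distinguish this from an ordinary LASSO: strongly supported predictors are shrunk only lightly, while weakly supported ones are pushed to exactly zero. In a pipeline like the one the abstract describes, the surviving features would then be passed on to the interaction-detection stage.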
Related papers
- Predictive Causal Inference via Spatio-Temporal Modeling and Penalized Empirical Likelihood [0.0]
This study introduces an integrated framework for predictive causal inference designed to overcome limitations in conventional single-model approaches. Specifically, we combine a Hidden Markov Model for spatial health state estimation with a Multi Task and Multi Graph Convolutional Network (MTGCN) for capturing temporal outcome trajectories. To demonstrate its utility, we focus on clinical domains such as cancer, dementia, and Parkinson's disease, where treatment effects are challenging to observe directly.
arXiv Detail & Related papers (2025-07-11T03:11:15Z) - Information-theoretic Quantification of High-order Feature Effects in Classification Problems [0.19791587637442676]
We present an information-theoretic extension of the High-order interactions for Feature importance (Hi-Fi) method. Our framework decomposes feature contributions into unique, synergistic, and redundant components. Results indicate that the proposed estimator accurately recovers theoretical and expected findings.
arXiv Detail & Related papers (2025-07-06T11:50:30Z) - A Symbolic and Statistical Learning Framework to Discover Bioprocessing Regulatory Mechanism: Cell Culture Example [2.325005809983534]
This paper introduces a symbolic and statistical learning framework to identify key regulatory mechanisms and model uncertainty. A Metropolis-adjusted Langevin algorithm with adjoint sensitivity analysis is developed for posterior exploration. An empirical study demonstrates its ability to recover missing regulatory mechanisms and improve model fidelity under data-limited conditions.
arXiv Detail & Related papers (2025-05-06T04:39:34Z) - Statistical Learning for Heterogeneous Treatment Effects: Pretraining, Prognosis, and Prediction [40.96453902709292]
We propose pretraining strategies that leverage a phenomenon in real-world applications. In medicine, components of the same biological signaling pathways frequently influence both baseline risk and treatment response. We use this structure to incorporate "side information" and develop models that can exploit synergies between risk prediction and causal effect estimation.
arXiv Detail & Related papers (2025-05-01T05:12:14Z) - Integrating Probabilistic Trees and Causal Networks for Clinical and Epidemiological Data [18.539194412343104]
This study introduces the Probabilistic Causal Fusion (PCF) framework. PCF integrates Causal Bayesian Networks (CBNs) and Probability Trees (PTrees) to extend beyond predictions. It was validated on three real-world healthcare datasets.
arXiv Detail & Related papers (2025-01-27T11:34:19Z) - Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets. A key theoretical contribution is the structural sparsity of causal connections between modalities. Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z) - Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z) - Learning to Denoise Biomedical Knowledge Graph for Robust Molecular Interaction Prediction [50.7901190642594]
We propose BioKDN (Biomedical Knowledge Graph Denoising Network) for robust molecular interaction prediction.
BioKDN refines the reliable structure of local subgraphs by denoising noisy links in a learnable manner.
It maintains consistent and robust semantics by smoothing relations around the target interaction.
arXiv Detail & Related papers (2023-12-09T07:08:00Z) - Causal Inference via Nonlinear Variable Decorrelation for Healthcare Applications [60.26261850082012]
We introduce a novel method with a variable decorrelation regularizer to handle both linear and nonlinear confounding.
We employ association rules as new representations using association rule mining based on the original features to increase model interpretability.
arXiv Detail & Related papers (2022-09-29T17:44:14Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - Tree-Guided Rare Feature Selection and Logic Aggregation with Electronic Health Records Data [7.422597776308963]
We propose a tree-guided feature selection and logic aggregation approach for large-scale regression with rare binary features.
In a suicide risk study with EHR data, our approach is able to select and aggregate prior mental health diagnoses.
arXiv Detail & Related papers (2022-06-18T03:52:43Z) - Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.