Towards symbolic regression for interpretable clinical decision scores
- URL: http://arxiv.org/abs/2512.07961v1
- Date: Mon, 08 Dec 2025 19:00:41 GMT
- Title: Towards symbolic regression for interpretable clinical decision scores
- Authors: Guilherme Seidyo Imai Aldeia, Joseph D. Romano, Fabricio Olivetti de Franca, Daniel S. Herman, William G. La Cava,
- Abstract summary: Brush is an SR algorithm that combines decision-tree-like splitting algorithms with non-linear constant optimization.<n>It is applied to recapitulate two widely used clinical scoring systems, achieving high accuracy and interpretable models.
- Score: 2.1258866329463824
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical decision-making makes frequent use of algorithms that combine risk equations with rules, providing clear and standardized treatment pathways. Symbolic regression (SR) traditionally limits its search space to continuous function forms and their parameters, making it difficult to model this decision-making. However, due to its ability to derive data-driven, interpretable models, SR holds promise for developing data-driven clinical risk scores. To that end we introduce Brush, an SR algorithm that combines decision-tree-like splitting algorithms with non-linear constant optimization, allowing for seamless integration of rule-based logic into symbolic regression and classification models. Brush achieves Pareto-optimal performance on SRBench, and was applied to recapitulate two widely used clinical scoring systems, achieving high accuracy and interpretable models. Compared to decision trees, random forests, and other SR methods, Brush achieves comparable or superior predictive performance while producing simpler models.
Related papers
- ODELoRA: Training Low-Rank Adaptation by Solving Ordinary Differential Equations [54.886931928255564]
Low-rank adaptation (LoRA) has emerged as a widely adopted parameter-efficient fine-tuning method in deep transfer learning.<n>We propose a novel continuous-time optimization dynamic for LoRA factor matrices in the form of an ordinary differential equation (ODE)<n>We show that ODELoRA achieves stable feature learning, a property that is crucial for training deep neural networks at different scales of problem dimensionality.
arXiv Detail & Related papers (2026-02-07T10:19:36Z) - Current Challenges of Symbolic Regression: Optimization, Selection, Model Simplification, and Benchmarking [0.0]
Symbolic Regression (SR) aims to discover mathematical expressions that describe the relationship between variables.<n>Current methods must be constantly re-evaluated to understand the SR landscape.<n>This thesis addresses these challenges through a sequence of studies conducted throughout the doctorate.
arXiv Detail & Related papers (2025-12-01T13:48:07Z) - Efficient Group Lasso Regularized Rank Regression with Data-Driven Parameter Determination [2.847099287022546]
High-dimensional regression often suffers from heavy-tailed noise and outliers, which can severely undermine the reliability of least-squares based methods.<n>To improve robustness, we adopt a non-smooth Wilcoxon score based rank objective and incorporate structured group sparsity regularization.<n>We also introduce a data-driven, simulation-based tuning rule and further establish a finite-sample error bound for the resulting estimator.
arXiv Detail & Related papers (2025-10-13T15:45:58Z) - Adaptive Learning-based Surrogate Method for Stochastic Programs with Implicitly Decision-dependent Uncertainty [1.5412450351033007]
We consider a class of programming problems where the implicitly decision-dependent random variable follows a nonparametric regression model with heteroscedastic error.<n>We develop an adaptive learning-based surrogate method that integrates the simulation scheme and statistical estimates to construct estimation-based surrogate functions.
arXiv Detail & Related papers (2025-05-12T07:35:06Z) - Conditional Mean and Variance Estimation via \textit{k}-NN Algorithm with Automated Variance Selection [9.943131787772323]
We introduce a novel textitk-nearest neighbor (textitk-NN) regression method for joint estimation of the conditional mean and variance.<n>The proposed algorithm preserves the computational efficiency and manifold-learning capabilities of classical non-parametric textitk-NN models.
arXiv Detail & Related papers (2024-02-02T18:54:18Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Provably Efficient UCB-type Algorithms For Learning Predictive State
Representations [55.00359893021461]
The sequential decision-making problem is statistically learnable if it admits a low-rank structure modeled by predictive state representations (PSRs)
This paper proposes the first known UCB-type approach for PSRs, featuring a novel bonus term that upper bounds the total variation distance between the estimated and true models.
In contrast to existing approaches for PSRs, our UCB-type algorithms enjoy computational tractability, last-iterate guaranteed near-optimal policy, and guaranteed model accuracy.
arXiv Detail & Related papers (2023-07-01T18:35:21Z) - GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP,
and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z) - Deep Learning Aided Laplace Based Bayesian Inference for Epidemiological
Systems [2.596903831934905]
We propose a hybrid approach where Laplace-based Bayesian inference is combined with an ANN architecture for obtaining approximations to the ODE trajectories.
The effectiveness of our proposed methods is demonstrated using an epidemiological system with non-analytical solutions, the Susceptible-Infectious-Removed (SIR) model for infectious diseases.
arXiv Detail & Related papers (2022-10-17T09:02:41Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores)
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Estimation of Switched Markov Polynomial NARX models [75.91002178647165]
We identify a class of models for hybrid dynamical systems characterized by nonlinear autoregressive (NARX) components.
The proposed approach is demonstrated on a SMNARX problem composed by three nonlinear sub-models with specific regressors.
arXiv Detail & Related papers (2020-09-29T15:00:47Z) - Adaptive Correlated Monte Carlo for Contextual Categorical Sequence
Generation [77.7420231319632]
We adapt contextual generation of categorical sequences to a policy gradient estimator, which evaluates a set of correlated Monte Carlo (MC) rollouts for variance control.
We also demonstrate the use of correlated MC rollouts for binary-tree softmax models, which reduce the high generation cost in large vocabulary scenarios.
arXiv Detail & Related papers (2019-12-31T03:01:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.