Sample Size Calculations for Developing Clinical Prediction Models: Overview and pmsims R package
- URL: http://arxiv.org/abs/2602.23507v1
- Date: Thu, 26 Feb 2026 21:24:34 GMT
- Title: Sample Size Calculations for Developing Clinical Prediction Models: Overview and pmsims R package
- Authors: Diana Shamsutdinova, Felix Zimmer, Oyebayo Ridwan Olaniran, Sarah Markham, Daniel Stahl, Gordon Forbes, Ewan Carr
- Abstract summary: Inadequate sample sizes can lead to overfitting, poor generalisability, and biased predictions. Existing approaches, such as heuristic rules, closed-form formulas, and simulation-based methods, vary in flexibility and accuracy. We introduce a conceptual framework that distinguishes between mean-based and assurance-based criteria, and propose a novel simulation-based approach that integrates learning curves, Gaussian Process optimisation, and assurance principles.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract:
Background: Clinical prediction models are increasingly used to inform healthcare decisions, but determining the minimum sample size for their development remains a critical and unresolved challenge. Inadequate sample sizes can lead to overfitting, poor generalisability, and biased predictions. Existing approaches, such as heuristic rules, closed-form formulas, and simulation-based methods, vary in flexibility and accuracy, particularly for complex data structures and machine learning models.
Methods: We review current methodologies for sample size estimation in prediction modelling and introduce a conceptual framework that distinguishes between mean-based and assurance-based criteria. Building on this, we propose a novel simulation-based approach that integrates learning curves, Gaussian Process optimisation, and assurance principles to identify sample sizes that achieve target performance with high probability. This approach is implemented in pmsims, an open-source, model-agnostic R package.
Results: Through case studies, we demonstrate that sample size estimates vary substantially across methods, performance metrics, and modelling strategies. Compared to existing tools, pmsims provides flexible, efficient, and interpretable solutions that accommodate diverse models and user-defined metrics while explicitly accounting for variability in model performance.
Conclusions: Our framework and software advance sample size methodology for clinical prediction modelling by combining flexibility with computational efficiency. Future work should extend these methods to hierarchical and multimodal data, incorporate fairness and stability metrics, and address challenges such as missing data and complex dependency structures.
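To make the mean-based versus assurance-based distinction concrete, the sketch below estimates both criteria by brute-force simulation for a logistic regression prediction model in base R. The data-generating model, target c-statistic, candidate sample-size grid, and assurance level are all illustrative assumptions, and the code deliberately avoids the pmsims API, whose function names are not given in this abstract.

```r
## Minimal sketch: mean-based vs assurance-based sample size criteria,
## estimated by brute-force simulation (base R only).
## Every setting below (predictors, coefficients, target, grid) is an
## illustrative assumption, not a pmsims default.

set.seed(2026)

p    <- 5                        # number of candidate predictors (assumed)
beta <- c(-1.5, rep(0.4, p))     # intercept + true log-odds ratios (assumed)

simulate_data <- function(n) {
  X  <- matrix(rnorm(n * p), n, p)
  lp <- beta[1] + X %*% beta[-1]              # true linear predictor
  data.frame(y = rbinom(n, 1, plogis(lp)), X)
}

# Rank-based (Mann-Whitney) c-statistic; no extra packages needed
cstat <- function(pred, y) {
  r  <- rank(pred)
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

val <- simulate_data(20000)      # large hold-out set approximating "true" performance

# Distribution of achieved performance when developing on n observations
perf_at_n <- function(n, B = 100) {
  replicate(B, {
    dev <- simulate_data(n)
    fit <- glm(y ~ ., data = dev, family = binomial)
    cstat(predict(fit, newdata = val, type = "response"), val$y)
  })
}

target <- 0.72                   # target c-statistic (assumed)
ns     <- c(100, 200, 400, 800)  # candidate sample sizes (assumed grid)

res <- sapply(ns, function(n) {
  cs <- perf_at_n(n)
  c(mean_cstat = mean(cs),               # mean-based criterion
    assurance  = mean(cs >= target))     # P(developed model meets target)
})
colnames(res) <- paste0("n=", ns)
print(round(res, 3))

# Mean-based n*:      smallest n whose *average* c-statistic meets the target.
# Assurance-based n*: smallest n that meets the target with, say, >= 90%
# probability -- typically larger, because it controls the lower tail of the
# performance distribution rather than its centre.
```

The abstract's stated contribution is to avoid this kind of exhaustive grid: pmsims is described as fitting learning curves through simulated performance estimates and using Gaussian Process optimisation to locate the assurance-based sample size with far fewer simulation runs.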
Related papers
- Quantifying and Attributing Submodel Uncertainty in Stochastic Simulation Models and Digital Twins
This paper investigates how submodel uncertainty affects the estimation of system performance metrics. We develop a framework for quantifying submodel uncertainty in simulation models and extend the framework to digital-twin settings.
arXiv Detail & Related papers (2026-02-18T00:06:39Z)
- Model Correlation Detection via Random Selection Probing
Existing similarity-based methods require access to model parameters or produce scores without thresholds. We introduce Random Selection Probing (RSP), a hypothesis-testing framework that formulates model correlation detection as a statistical test. RSP produces rigorous p-values that quantify evidence of correlation.
arXiv Detail & Related papers (2025-09-29T01:40:26Z)
- Covariate-dependent Graphical Model Estimation via Neural Networks with Statistical Guarantees
We consider settings where the graph structure is covariate-dependent, and investigate a deep neural network-based approach to estimate it. Theoretical results with PAC guarantees are established for the method, under assumptions commonly used in an Empirical Risk Minimization framework. The performance of the proposed method is evaluated on several synthetic data settings and benchmarked against existing approaches.
arXiv Detail & Related papers (2025-04-23T02:13:36Z)
- Comprehensive Benchmarking of Machine Learning Methods for Risk Prediction Modelling from Large-Scale Survival Data: A UK Biobank Study
Large-scale prospective cohort studies and a diverse toolkit of available machine learning (ML) algorithms have facilitated survival modelling efforts. We sought to benchmark eight distinct survival task implementations, ranging from linear to deep learning (DL) models. We assessed how well different architectures scale with sample sizes ranging from n = 5,000 to n = 250,000 individuals.
arXiv Detail & Related papers (2025-03-11T20:27:20Z)
- Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis)
This thesis is a series of independent contributions to statistics unified by a model-free perspective. The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning. The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
- Variable Importance Matching for Causal Inference
We describe a general framework called Model-to-Match that achieves these goals.
Model-to-Match uses variable importance measurements to construct a distance metric.
We operationalize the Model-to-Match framework with LASSO.
arXiv Detail & Related papers (2023-02-23T00:43:03Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Calibrating Over-Parametrized Simulation Models: A Framework via Eligibility Set
We develop a framework for constructing calibration schemes that satisfy rigorous frequentist statistical guarantees.
We demonstrate our methodology on several numerical examples, including an application to calibration of a limit order book market simulator.
arXiv Detail & Related papers (2021-05-27T00:59:29Z)
- How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models
We introduce a 3-dimensional evaluation metric that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion.
Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity.
arXiv Detail & Related papers (2021-02-17T18:25:30Z)
- Control as Hybrid Inference
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Amortized Bayesian model comparison with evidential deep learning
We propose a novel method for performing Bayesian model comparison using specialized deep learning architectures.
Our method is purely simulation-based and circumvents the step of explicitly fitting all alternative models under consideration to each observed dataset.
We show that our method achieves excellent results in terms of accuracy, calibration, and efficiency across the examples considered in this work.
arXiv Detail & Related papers (2020-04-22T15:15:46Z)