Related papers: The hidden risks of temporal resampling in clinical reinforcement learning

URL: http://arxiv.org/abs/2602.06603v2
Date: Tue, 10 Feb 2026 09:51:38 GMT
Title: The hidden risks of temporal resampling in clinical reinforcement learning
Authors: Thomas Frost, Hrisheekesh Vaidya, Steve Harris
Abstract summary: We show that temporal resampling significantly degrades the performance of offline reinforcement learning algorithms during live deployment. We propose three mechanisms that drive this failure: the generation of counterfactual trajectories, the distortion of temporal expectations, and the compounding of generalisation errors.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Offline reinforcement learning (ORL) has shown potential for improving decision-making in healthcare. However, contemporary research typically aggregates patient data into fixed time intervals, simplifying their mapping to standard ORL frameworks. The impact of these temporal manipulations on model safety and efficacy remains poorly understood. In this work, using both a gridworld navigation task and the UVA/Padova clinical diabetes simulator, we demonstrate that temporal resampling significantly degrades the performance of offline reinforcement learning algorithms during live deployment. We propose three mechanisms that drive this failure: (i) the generation of counterfactual trajectories, (ii) the distortion of temporal expectations, and (iii) the compounding of generalisation errors. Crucially, we find that standard off-policy evaluation metrics can fail to detect these drops in performance. Our findings reveal a fundamental risk in current healthcare ORL pipelines and emphasise the need for methods that explicitly handle the irregular timing of clinical decision-making.
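As a rough illustration of the fixed-interval aggregation the abstract describes, the sketch below bins an irregular record into 4-hour windows. It is not the authors' pipeline: the `resample` helper, the bin width, the aggregation rules (mean glucose, summed insulin) and the example values are all assumptions chosen for illustration. It shows how binning can produce a state-action pair that never occurred in the original record.

```python
from collections import defaultdict


def resample(events, bin_hours):
    """Aggregate (time_h, glucose, insulin) events into fixed-width bins.

    Hypothetical helper: glucose is averaged and insulin is summed per bin.
    """
    bins = defaultdict(list)
    for t, glucose, insulin in events:
        bins[int(t // bin_hours)].append((glucose, insulin))

    out = []
    for idx in sorted(bins):
        rows = bins[idx]
        mean_glucose = sum(g for g, _ in rows) / len(rows)
        total_insulin = sum(i for _, i in rows)
        out.append((idx * bin_hours, mean_glucose, total_insulin))
    return out


# Made-up irregular record: (hours since start, glucose mg/dL, insulin units)
events = [
    (0.5, 250.0, 0.0),  # high reading, no dose yet
    (1.0, 250.0, 6.0),  # dose given in response to the high reading
    (3.5, 130.0, 0.0),  # glucose has fallen after the dose
    (5.0, 120.0, 0.0),
]

resampled = resample(events, bin_hours=4)
for row in resampled:
    print(row)
# (0, 210.0, 6.0)
# (4, 120.0, 0.0)
```

In the original record the 6-unit dose was given at a glucose of 250 mg/dL. After binning, the same dose appears alongside a glucose of 210 mg/dL, a pairing that never occurred, and the order of reading, dose and response within the window is lost.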

Related papers

Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Under Constrained Sensing [0.0]
We present a leakage-aware benchmarking framework for early deterioration prediction. We compare hospital-rich triage with a vitals-only, MCI-like setting, restricting inputs to information available within the first hour of presentation.
arXiv Detail & Related papers (2026-02-09T09:32:49Z)
SurvKAN: A Fully Parametric Survival Model Based on Kolmogorov-Arnold Networks [7.352227733654751]
We introduce SurvKAN, a fully parametric, time-continuous survival model based on Kolmogorov-Arnold Networks (KANs). SurvKAN treats time as an explicit input to a KAN that directly predicts the log-hazard function, enabling end-to-end training on the full survival likelihood.
arXiv Detail & Related papers (2026-02-02T14:49:14Z)
Overlap-weighted orthogonal meta-learner for treatment effect estimation over time [90.46786193198744]
We introduce a novel overlap-weighted meta-learner for estimating heterogeneous treatment effects (HTEs). Our WO-learner has the favorable property of Neyman-orthogonality, meaning that it is robust against misspecification in the nuisance functions. We show that our WO-learner is fully model-agnostic and can be applied to any machine learning model.
arXiv Detail & Related papers (2025-10-22T14:47:57Z)
DeltaSHAP: Explaining Prediction Evolutions in Online Patient Monitoring with Shapley Values [28.105209213061386]
This study proposes DeltaSHAP, a novel explainable artificial intelligence (XAI) algorithm specifically designed for online patient monitoring systems. By adapting Shapley values to temporal settings, our approach accurately captures feature coalition effects. It further attributes prediction changes using only the actually observed feature combinations, making it efficient and practical for time-sensitive clinical applications.
arXiv Detail & Related papers (2025-07-03T06:08:07Z)
CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis [50.56875995511431]
We introduce a Cross-Modal Temporal Pattern Discovery (CTPD) framework, designed to efficiently extract meaningful cross-modal temporal patterns from multimodal EHR data. Our approach introduces shared initial temporal pattern representations which are refined using slot attention to generate temporal semantic embeddings.
arXiv Detail & Related papers (2024-11-01T15:54:07Z)
Temporal-Difference Variational Continual Learning [77.92320830700797]
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations. Our approach effectively mitigates Catastrophic Forgetting, outperforming strong Variational CL methods.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
Deep State-Space Generative Model For Correlated Time-to-Event Predictions [54.3637600983898]
We propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events. Our method also uncovers meaningful insights about the latent correlations among mortality and different types of organ failures.
arXiv Detail & Related papers (2024-07-28T02:42:36Z)
REST: Efficient and Accelerated EEG Seizure Analysis through Residual State Updates [54.96885726053036]
This paper introduces a novel graph-based residual state update mechanism (REST) for real-time EEG signal analysis. By leveraging a combination of graph neural networks and recurrent structures, REST efficiently captures both non-Euclidean geometry and temporal dependencies within EEG data. Our model demonstrates high accuracy in both seizure detection and classification tasks.
arXiv Detail & Related papers (2024-06-03T16:30:19Z)
Ambiguous Dynamic Treatment Regimes: A Reinforcement Learning Approach [0.0]
Dynamic Treatment Regimes (DTRs) are widely studied to formalize this process. We develop Reinforcement Learning methods to efficiently learn optimal treatment regimes.
arXiv Detail & Related papers (2021-12-08T20:22:04Z)
SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data. We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z)
CLOPS: Continual Learning of Physiological Signals [17.58391771585294]
We propose CLOPS, a replay-based continual learning strategy. We show that CLOPS can outperform the state-of-the-art methods, GEM and MIR. End-to-end trainable parameters can be used to quantify task difficulty and similarity.
arXiv Detail & Related papers (2020-04-20T19:09:18Z)
