Unveiling the Secrets: How Masking Strategies Shape Time Series Imputation
- URL: http://arxiv.org/abs/2405.17508v2
- Date: Tue, 26 Nov 2024 13:26:58 GMT
- Title: Unveiling the Secrets: How Masking Strategies Shape Time Series Imputation
- Authors: Linglong Qian, Yiyuan Yang, Wenjie Du, Jun Wang, Zina Ibrahim,
- Abstract summary: Time series imputation is a critical challenge in data mining, particularly in domains like healthcare and environmental monitoring, where missing data can compromise analytical outcomes.
This study investigates the influence of diverse masking strategies, normalization timing, and missingness patterns on the performance of eleven state-of-the-art imputation models across three diverse datasets.
- Score: 7.650009336768971
- Abstract: Time series imputation is a critical challenge in data mining, particularly in domains like healthcare and environmental monitoring, where missing data can compromise analytical outcomes. This study investigates the influence of diverse masking strategies, normalization timing, and missingness patterns on the performance of eleven state-of-the-art imputation models across three diverse datasets. Specifically, we evaluate the effects of pre-masking versus in-mini-batch masking, augmentation versus overlaying of artificial missingness, and pre-normalization versus post-normalization. Our findings reveal that masking strategies profoundly affect imputation accuracy, with dynamic masking providing robust augmentation benefits and overlay masking better simulating real-world missingness patterns. Sophisticated models, such as CSDI, exhibited sensitivity to preprocessing configurations, while simpler models like BRITS delivered consistent and efficient performance. We highlight the importance of aligning preprocessing pipelines and masking strategies with dataset characteristics to improve robustness under diverse conditions, including high missing rates. This study provides actionable insights for designing imputation pipelines and underscores the need for transparent and comprehensive experimental designs.
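The masking variants the abstract compares can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's released code; the function names and the `NaN`-based missingness encoding are assumptions made here for clarity. Pre-masking fixes the artificial missingness once before training, in-mini-batch (dynamic) masking redraws it per batch as augmentation, and overlay masking applies artificial missingness only on top of positions that are actually observed.

```python
import numpy as np

def pre_mask(data, rate, seed=0):
    """Pre-masking: artificial missingness is drawn once, before training,
    so every epoch sees the same masked positions."""
    rng = np.random.default_rng(seed)
    mask = rng.random(data.shape) < rate
    masked = data.copy()
    masked[mask] = np.nan
    return masked, mask

def in_batch_mask(batch, rate, rng):
    """In-mini-batch (dynamic) masking: positions are redrawn for every
    batch, so the model sees different masks each epoch (augmentation)."""
    mask = rng.random(batch.shape) < rate
    masked = batch.copy()
    masked[mask] = np.nan
    return masked, mask

def overlay_mask(data, observed_mask, rate, rng):
    """Overlay masking: artificial missingness is placed only on values
    that are actually observed, mimicking the real missingness pattern
    rather than masking positions that were already missing."""
    artificial = (rng.random(data.shape) < rate) & observed_mask
    masked = data.copy()
    masked[artificial] = np.nan
    return masked, artificial
```

Under pre-masking the imputation targets never change, which makes evaluation reproducible; dynamic masking trades that determinism for augmentation, which is one reading of why the study finds it provides robust augmentation benefits.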
Related papers
- Evidential time-to-event prediction with calibrated uncertainty quantification [12.446406577462069]
Time-to-event analysis provides insights into clinical prognosis and treatment recommendations.
We propose an evidential regression model specifically designed for time-to-event prediction.
We show that our model delivers both accurate and reliable performance, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2024-11-12T15:06:04Z)
- Predictive uncertainty estimation in deep learning for lung carcinoma classification in digital pathology under real dataset shifts [2.309018557701645]
This paper evaluates whether predictive uncertainty estimation adds robustness to deep learning-based diagnostic decision-making systems.
We first investigate three popular methods for improving predictive uncertainty: Monte Carlo dropout, deep ensemble, and few-shot learning on lung adenocarcinoma classification as a primary disease in whole slide images.
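Of the three uncertainty methods this related paper names, Monte Carlo dropout is the simplest to sketch. The following is a toy NumPy illustration of the idea only, assuming a single linear layer with a sigmoid output; it is not the paper's model, and the function name is invented here. The key point is that dropout stays active at inference, and the spread across stochastic forward passes serves as an uncertainty estimate.

```python
import numpy as np

def mc_dropout_predict(x, weights, drop_rate, n_samples, seed=0):
    """Monte Carlo dropout at inference: keep dropout active and run
    multiple stochastic forward passes; the mean is the prediction and
    the standard deviation estimates predictive uncertainty."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_samples):
        # Randomly zero a fraction of weights each pass (inverted dropout
        # scaling keeps the expected activation unchanged).
        keep = rng.random(weights.shape) >= drop_rate
        dropped = weights * keep / (1.0 - drop_rate)
        logits = x @ dropped
        preds.append(1.0 / (1.0 + np.exp(-logits)))  # sigmoid
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)
```

With `drop_rate=0` every pass is identical and the reported uncertainty collapses to zero, which is one way to sanity-check an implementation.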
arXiv Detail & Related papers (2024-08-15T21:49:43Z)
- How Deep is your Guess? A Fresh Perspective on Deep Learning for Medical Time-Series Imputation [6.547981908229007]
We show how architectural and framework biases combine to influence model performance.
Experiments show imputation performance variations of up to 20% based on preprocessing and implementation choices.
We identify critical gaps between current deep imputation methods and medical requirements.
arXiv Detail & Related papers (2024-07-11T12:33:28Z)
- Uncertainty Quantification on Clinical Trial Outcome Prediction [37.25114005044208]
We propose incorporating uncertainty quantification into clinical trial outcome predictions.
Our main goal is to enhance the model's ability to discern nuanced differences.
We have adopted a selective classification approach to fulfill our objective.
arXiv Detail & Related papers (2024-01-07T13:48:05Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Disentangled Counterfactual Recurrent Networks for Treatment Effect Inference over Time [71.30985926640659]
We introduce the Disentangled Counterfactual Recurrent Network (DCRN), a sequence-to-sequence architecture that estimates treatment outcomes over time.
With an architecture inspired entirely by the causal structure of treatment influence over time, DCRN advances both forecast accuracy and disease understanding.
We demonstrate that DCRN outperforms current state-of-the-art methods in forecasting treatment responses, on both real and simulated data.
arXiv Detail & Related papers (2021-12-07T16:40:28Z)
- What Do You See in this Patient? Behavioral Testing of Clinical NLP Models [69.09570726777817]
We introduce an extendable testing framework that evaluates the behavior of clinical outcome models regarding changes of the input.
We show that model behavior varies drastically even among models fine-tuned on the same data, and that supposedly best-performing models have not always learned the most medically plausible patterns.
arXiv Detail & Related papers (2021-11-30T15:52:04Z)
- Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality, and length of stay are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
- Bayesian prognostic covariate adjustment [59.75318183140857]
Historical data about disease outcomes can be integrated into the analysis of clinical trials in many ways.
We build on existing literature that uses prognostic scores from a predictive model to increase the efficiency of treatment effect estimates.
arXiv Detail & Related papers (2020-12-24T05:19:03Z)
- Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research.
Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z)
- Uncertainty estimation for classification and risk prediction on medical tabular data [0.0]
This work advances the understanding of uncertainty estimation for classification and risk prediction on medical data.
In a data-scarce field such as healthcare, the ability to measure the uncertainty of a model's prediction could potentially lead to improved effectiveness of decision support tools.
arXiv Detail & Related papers (2020-04-13T08:46:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.