Impact of Medical Data Imprecision on Learning Results
- URL: http://arxiv.org/abs/2007.12375v1
- Date: Fri, 24 Jul 2020 06:54:57 GMT
- Title: Impact of Medical Data Imprecision on Learning Results
- Authors: Mei Wang, Jianwen Su, Haiqin Lu
- Abstract summary: We study the impact of imprecision on prediction results in a healthcare application.
A pre-trained model is used to predict the future state of hyperthyroidism for patients.
- Score: 9.379890125442333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Test data measured by medical instruments often carry imprecise ranges that
include the true values. The latter are not obtainable in virtually all cases.
Most learning algorithms, however, carry out arithmetical calculations that are
subject to uncertain influence in both the learning process to obtain models
and applications of the learned models in, e.g., prediction. In this paper, we
initiate a study on the impact of imprecision on prediction results in a
healthcare application where a pre-trained model is used to predict future
state of hyperthyroidism for patients. We formulate a model for data
imprecision. Using parameters to control the degree of imprecision, this model
can generate imprecise samples for comparison experiments. Further,
a group of measures is defined to evaluate the different impacts
quantitatively. More specifically, statistics that measure inconsistent
predictions for individual patients are defined. We perform experimental
evaluations to compare prediction results based on the data from the original
dataset and the corresponding ones generated from the proposed imprecision model
using a long short-term memory (LSTM) network. The results on a real-world
hyperthyroidism dataset provide insights into how small imprecisions can
cause large ranges of predicted results, which could cause mislabeling and
inappropriate actions (treatment or no treatment) for individual patients.
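The abstract does not spell out the imprecision model or the inconsistency statistics, so the following is only a minimal sketch of the general idea under stated assumptions: each measured value is perturbed uniformly within a relative width `rel_width` (the parameter controlling the degree of imprecision), a hypothetical `toy_predict` function stands in for the pre-trained LSTM, and a hypothetical decision threshold of 1.8 turns predictions into labels. None of these names or values come from the paper.

```python
import random


def perturb(sample, rel_width, rng):
    """Draw one imprecise copy of a sample (assumed model, not the paper's).

    Each value is moved uniformly within +/- rel_width of itself.
    """
    return [v * (1.0 + rng.uniform(-rel_width, rel_width)) for v in sample]


def toy_predict(sample):
    """Hypothetical stand-in for the pre-trained model: a weighted sum."""
    weights = [0.2, 0.3, 0.5]
    return sum(w * v for w, v in zip(weights, sample))


def prediction_spread(sample, rel_width, n_copies=200, threshold=1.8, seed=0):
    """Return (min prediction, max prediction, label flip rate) for one patient.

    The flip rate is the fraction of imprecise copies whose thresholded label
    differs from the label of the original sample.
    """
    rng = random.Random(seed)
    preds = [toy_predict(perturb(sample, rel_width, rng)) for _ in range(n_copies)]
    base_label = toy_predict(sample) > threshold
    flipped = sum((p > threshold) != base_label for p in preds)
    return min(preds), max(preds), flipped / n_copies


if __name__ == "__main__":
    patient = [1.7, 1.8, 1.85]  # made-up measurements near the threshold
    for width in (0.0, 0.01, 0.05):
        low, high, flip_rate = prediction_spread(patient, width)
        print(f"width={width:.2f} range=[{low:.3f}, {high:.3f}] flips={flip_rate:.2f}")
```

For a patient whose prediction sits close to the decision threshold, even a few percent of relative imprecision can move some perturbed copies across it, which is the kind of per-patient inconsistency the abstract refers to.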
Related papers
- SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing [67.8991481023825]
Sepsis is the leading cause of in-hospital mortality in the USA.
Existing predictive models are usually trained on high-quality data with little missing information.
For the potential high-risk patients with low confidence due to limited observations, we propose a robust active sensing algorithm.
arXiv Detail & Related papers (2024-07-24T04:47:36Z)
- A Machine Learning Model for Predicting, Diagnosing, and Mitigating Health Disparities in Hospital Readmission [0.0]
We propose a machine learning pipeline capable of making predictions as well as detecting and mitigating biases in the data and model predictions.
We evaluate the performance of the proposed method on a clinical dataset using accuracy and fairness measures.
arXiv Detail & Related papers (2022-06-13T16:07:25Z)
- To Impute or not to Impute? -- Missing Data in Treatment Effect Estimation [84.76186111434818]
We identify a new missingness mechanism, which we term mixed confounded missingness (MCM), where some missingness determines treatment selection and other missingness is determined by treatment selection.
We show that naively imputing all data leads to poorly performing treatment effect models, as the act of imputation effectively removes information necessary to provide unbiased estimates.
Our solution is selective imputation, where we use insights from MCM to inform precisely which variables should be imputed and which should not.
arXiv Detail & Related papers (2022-02-04T12:08:31Z)
- Statistical quantification of confounding bias in predictive modelling [0.0]
I propose the partial and full confounder tests, which probe the null hypotheses of unconfounded and fully confounded models.
The tests provide a strict control for Type I errors and high statistical power, even for non-normally and non-linearly dependent predictions.
arXiv Detail & Related papers (2021-11-01T10:35:24Z)
- Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research.
Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best state-of-the-art baseline by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models.
Standard metrics calculated from retrospective data are only related to model utility under certain assumptions.
When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z)
- Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of breakdowns in essential services and the collapse of health care structures.
This work describes a machine learning model derived from hemogram exams performed on symptomatic patients.
The proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z) - Uncertainty estimation for classification and risk prediction on medical
tabular data [0.0]
This work advances the understanding of uncertainty estimation for classification and risk prediction on medical data.
In a data-scarce field such as healthcare, the ability to measure the uncertainty of a model's prediction could potentially lead to improved effectiveness of decision support tools.
arXiv Detail & Related papers (2020-04-13T08:46:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.