Predicting Disease Progress with Imprecise Lab Test Results
- URL: http://arxiv.org/abs/2107.03620v1
- Date: Thu, 8 Jul 2021 06:03:44 GMT
- Title: Predicting Disease Progress with Imprecise Lab Test Results
- Authors: Mei Wang, Jianwen Su, Zhihua Lin
- Abstract summary: In existing deep learning methods, almost all loss functions assume that the sample values to be predicted are the only correct ones.
We propose an imprecision range loss (IR loss) method and incorporate it into a Long Short-Term Memory (LSTM) model for disease progress prediction.
Experimental results on real data show that the prediction method based on the IR loss can provide more stable and consistent prediction results.
- Score: 8.70310158726824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In existing deep learning methods, almost all loss functions assume that
the sample values to be predicted are the only correct ones. This
assumption does not hold for laboratory test data. Test results are often
within tolerable or imprecision ranges, with all values in the ranges
acceptable. By considering imprecise samples, we propose an imprecision range
loss (IR loss) method and incorporate it into a Long Short-Term Memory (LSTM)
model for disease progress prediction. In this method, each sample in
imprecision range space has a certain probability to be the real value,
participating in the loss calculation. The loss is defined as the integral of
the error of each point in the imprecision range space. A sampling method for
imprecision space is formulated. The continuous imprecision space is
discretized, and a sequence of imprecise data sets is obtained, which is
convenient for gradient descent learning. A heuristic learning algorithm is
developed to learn the model parameters based on the imprecise data sets.
Experimental results on real data show that the prediction method based on the IR
loss can provide more stable and consistent prediction results when test samples
are generated from the imprecision range.
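The abstract specifies the IR loss only at a high level: integrate the prediction error over each sample's imprecision range and approximate that integral on a discrete grid. Below is a minimal PyTorch sketch of that idea, assuming interval targets [lo, hi] and a uniform weighting over the range; the names (ir_loss, n_grid) and the uniform weights are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of an imprecision-range (IR) style loss, based only on the
# abstract: the target is an interval [lo, hi], the loss integrates the
# pointwise error over that interval, and the integral is approximated on a
# discrete grid. ir_loss, n_grid, and the uniform weighting are assumptions.
import torch

def ir_loss(pred, lo, hi, n_grid=11):
    """pred, lo, hi: tensors of shape (batch,); returns a scalar loss."""
    # Discretize each imprecision range [lo, hi] into n_grid candidate targets.
    alphas = torch.linspace(0.0, 1.0, n_grid, device=pred.device)      # (n_grid,)
    candidates = lo.unsqueeze(1) + alphas * (hi - lo).unsqueeze(1)     # (batch, n_grid)
    # Squared error of the prediction against every candidate target.
    errors = (pred.unsqueeze(1) - candidates) ** 2
    # Averaging over the grid corresponds to uniform weights; a non-uniform
    # probability over the range could be substituted if one is available.
    return errors.mean(dim=1).mean()

# Usage with an LSTM regressor on interval-valued lab targets.
lstm = torch.nn.LSTM(input_size=8, hidden_size=32, batch_first=True)
head = torch.nn.Linear(32, 1)
x = torch.randn(4, 20, 8)            # (batch, time, features)
lo = torch.randn(4)                  # lower end of each tolerance interval
hi = lo + torch.rand(4)              # upper end
out, _ = lstm(x)
pred = head(out[:, -1, :]).squeeze(-1)
loss = ir_loss(pred, lo, hi)
loss.backward()
```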
Related papers
- Detecting Errors in a Numerical Response via any Regression Model [21.651775224356214]
Noise plagues many numerical datasets, where the recorded values in the data may fail to match the true underlying values.
We introduce veracity scores that distinguish between genuine errors and natural data fluctuations.
We also contribute a new error detection benchmark involving 5 regression datasets with real-world numerical errors.
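The summary does not define the veracity score itself; purely as an illustration, one could score each recorded response by its out-of-sample residual under any regression model, scaled by a robust estimate of the natural fluctuation. The choice of regressor, the MAD scaling, and the name suspicion_scores below are assumptions, not the paper's definition.

```python
# Illustration only: an out-of-sample residual from any regression model,
# scaled by a robust estimate of the response's natural fluctuation.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

def suspicion_scores(X, y):
    """Higher score = recorded y is harder to explain as natural fluctuation."""
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    y_hat = cross_val_predict(model, X, y, cv=5)    # out-of-sample predictions
    residuals = np.abs(y - y_hat)
    # Median absolute deviation as a robust scale of typical fluctuations.
    scale = np.median(np.abs(residuals - np.median(residuals))) + 1e-12
    return residuals / scale

X = np.random.randn(200, 5)
y = 2.0 * X[:, 0] + 0.1 * np.random.randn(200)
y[3] += 10.0                                   # inject one gross recording error
print(suspicion_scores(X, y).argsort()[-5:])   # most suspicious indices
```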
arXiv Detail & Related papers (2023-05-26T02:15:26Z) - Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
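The entry only states that a distinct, design-induced distribution shift separates training and test data. As background, a standard ingredient for conformal prediction under a known covariate shift is to reweight held-out residuals by a likelihood ratio w(x) = p_test(x)/p_train(x); the sketch below shows that generic weighted split-conformal step, not the paper's specific feedback-covariate-shift procedure.

```python
# Generic weighted split-conformal sketch under a known covariate shift, with
# likelihood-ratio weights w(x) = p_test(x) / p_train(x). This background
# ingredient is assumed; the paper treats a specific design-induced shift.
import numpy as np

def weighted_conformal_radius(x_test, x_cal, resid_cal, w, alpha=0.1):
    """Radius of a (1 - alpha) prediction interval around f(x_test).

    resid_cal: absolute residuals |y_i - f(x_i)| on a held-out calibration set.
    w: callable returning the likelihood-ratio weight for a covariate vector.
    """
    weights = np.array([w(x) for x in x_cal] + [w(x_test)])
    weights = weights / weights.sum()
    order = np.argsort(resid_cal)
    cum = np.cumsum(weights[:-1][order])   # the test point's weight sits at +inf
    idx = np.searchsorted(cum, 1.0 - alpha)
    return np.inf if idx >= len(resid_cal) else resid_cal[order][idx]
```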
arXiv Detail & Related papers (2022-02-08T02:59:12Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
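A small sketch of the ATC idea as summarized above, assuming max-softmax confidence (the paper also studies other confidence scores): pick a threshold so that the fraction of held-out source examples above it matches source accuracy, then report the fraction of unlabeled target examples above that threshold as the predicted target accuracy.

```python
# Sketch of Average Thresholded Confidence (ATC) with max-softmax confidence.
import numpy as np

def atc_estimate(probs_src, labels_src, probs_tgt):
    """probs_*: (n, classes) softmax outputs; returns predicted target accuracy."""
    conf_src = probs_src.max(axis=1)
    acc_src = (probs_src.argmax(axis=1) == labels_src).mean()
    # Threshold chosen so the source fraction above it matches source accuracy.
    t = np.quantile(conf_src, 1.0 - acc_src)
    return (probs_tgt.max(axis=1) > t).mean()
```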
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce an importance-guided stochastic gradient descent (IGSGD) method to train models to perform inference directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Evaluating representations by the complexity of learning low-loss predictors [55.94170724668857]
We consider the problem of evaluating representations of data for use in solving a downstream task.
We propose to measure the quality of a representation by the complexity of learning a predictor on top of the representation that achieves low loss on a task of interest.
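The summary leaves the complexity measure abstract; one rough way to operationalize it, shown below under the assumption of a logistic-regression probe and a fixed grid of training-set sizes (details not taken from the paper), is to record how many labeled examples the probe needs before its validation loss drops below a target.

```python
# Rough operationalization: how many labeled examples does a linear probe on a
# frozen representation need before validation loss reaches a target? The grid
# of sizes and the logistic-regression probe are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def samples_to_reach_loss(Z, y, Z_val, y_val, target_loss,
                          sizes=(50, 100, 200, 400, 800)):
    """Z: frozen representation of training inputs; y: labels (both classes
    should appear within the first sizes[0] examples)."""
    for n in sizes:
        probe = LogisticRegression(max_iter=1000).fit(Z[:n], y[:n])
        if log_loss(y_val, probe.predict_proba(Z_val)) <= target_loss:
            return n          # fewer samples needed => "better" representation
    return None               # target loss not reached within the budget
```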
arXiv Detail & Related papers (2020-09-15T22:06:58Z) - The Shooting Regressor; Randomized Gradient-Based Ensembles [0.0]
An ensemble method is introduced that utilizes randomization and loss function gradients to compute a prediction.
Multiple weakly-correlated estimators approximate the gradient at randomly sampled points on the error surface and are aggregated into a final solution.
arXiv Detail & Related papers (2020-09-14T03:20:59Z) - Impact of Medical Data Imprecision on Learning Results [9.379890125442333]
We study the impact of imprecision on prediction results in a healthcare application.
A pre-trained model is used to predict future state of hyperthyroidism for patients.
arXiv Detail & Related papers (2020-07-24T06:54:57Z) - Least Squares Estimation Using Sketched Data with Heteroskedastic Errors [0.0]
We show that estimates using data sketched by random projections will behave as if the errors were homoskedastic.
Inference, including first-stage F tests for instrument relevance, can be simpler than the full sample case if the sketching scheme is appropriately chosen.
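As a reference point for what "data sketched by random projections" means here, the snippet below runs ordinary least squares on a Gaussian sketch (SX, Sy) of a heteroskedastic regression; it illustrates only the sketch-and-solve setup, not the paper's inference results.

```python
# Sketch-and-solve least squares: compress (X, y) with a random projection S
# and run OLS on (S X, S y). Dimensions and the Gaussian sketch are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 10000, 5, 500                  # full sample, regressors, sketch size
X = rng.normal(size=(n, d))
beta = np.arange(1, d + 1, dtype=float)
y = X @ beta + rng.normal(size=n) * (1 + np.abs(X[:, 0]))   # heteroskedastic noise

S = rng.normal(size=(k, n)) / np.sqrt(k)   # Gaussian random-projection sketch
beta_sketch, *_ = np.linalg.lstsq(S @ X, S @ y, rcond=None)
print(beta_sketch)                         # close to beta for large enough k
```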
arXiv Detail & Related papers (2020-07-15T15:58:27Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design.
A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift.
Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z) - A termination criterion for stochastic gradient descent for binary classification [3.216356957608319]
We propose a new, simple, and inexpensive termination test for constant step-size stochastic gradient descent.
We show that our test terminates in a finite number of iterations and that, when the noise in the data is not too large, the expected classifier at termination nearly minimizes the probability of misclassification.
arXiv Detail & Related papers (2020-03-23T15:00:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.