Ensemble model for pre-discharge icd10 coding prediction
- URL: http://arxiv.org/abs/2012.11333v1
- Date: Wed, 16 Dec 2020 07:02:56 GMT
- Title: Ensemble model for pre-discharge icd10 coding prediction
- Authors: Yassien Shaalan, Alexander Dokumentov, Piyapong Khumrin, Krit
Khwanngern, Anawat Wisetborisu, Thanakom Hatsadeang, Nattapat Karaket,
Witthawin Achariyaviriya, Sansanee Auephanwiriyakul, Nipon Theera-Umpon,
Terence Siganakis
- Abstract summary: We propose an ensemble model incorporating multiple clinical data sources for accurate code predictions.
We obtain multi-label classification accuracies of 0.73 and 0.58 for average precision, 0.56 and 0.35 for F1-scores and 0.71 and 0.4 accuracy in predicting principal diagnosis for inpatient and outpatient datasets respectively.
- Score: 45.82374977939355
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The translation of medical diagnosis to clinical coding has a wide range of
applications in billing, aetiology analysis, and auditing. Currently, coding is
a manual effort, and automating the task is not straightforward.
Among the challenges are messy and noisy clinical records, case
complexity, and the huge ICD10 code space. Previous work mainly relied
on discharge notes for prediction and was applied to a very limited data scale.
We propose an ensemble model incorporating multiple clinical data sources for
accurate code predictions. We further propose an assessment mechanism to
provide confidence rates in predicted outcomes. Extensive experiments were
performed on two new real-world clinical datasets (inpatient & outpatient) with
unaltered case-mix distributions from Maharaj Nakorn Chiang Mai Hospital. We
obtain multi-label classification accuracies of 0.73 and 0.58 for average
precision, 0.56 and 0.35 for F1-scores and 0.71 and 0.4 accuracy in predicting
principal diagnosis for inpatient and outpatient datasets respectively.
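The abstract reports average precision, F1-score, and principal-diagnosis accuracy for an ensemble over several clinical data sources. The following is a minimal sketch of how such quantities could be computed for a single patient visit; it is not the authors' method. The score-averaging ensemble, the 0.45 threshold, the source names (`notes`, `labs`), and the example codes and scores are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def ensemble_scores(per_source: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-code scores across sources (assumed ensemble rule).

    A code that a source does not score is treated as 0 for that source.
    """
    codes = set().union(*per_source)
    return {c: sum(s.get(c, 0.0) for s in per_source) / len(per_source) for c in codes}


def f1(predicted: Set[str], actual: Set[str]) -> float:
    """F1-score between a predicted and a true set of codes for one visit."""
    tp = len(predicted & actual)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(actual)
    return 2 * precision * recall / (precision + recall)


def average_precision(scores: Dict[str, float], actual: Set[str]) -> float:
    """Mean of precision@k taken at each rank k where a true code appears."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    hits, total = 0, 0.0
    for rank, code in enumerate(ranked, start=1):
        if code in actual:
            hits += 1
            total += hits / rank
    return total / len(actual) if actual else 0.0


# Hypothetical scores from two data sources for one visit.
notes = {"E11.9": 0.8, "I10": 0.6, "N18.3": 0.2}
labs = {"E11.9": 0.6, "I10": 0.2, "N18.3": 0.8}
actual = {"E11.9", "I10"}   # true code set (made up)
principal = "E11.9"         # true principal diagnosis (made up)

combined = ensemble_scores([notes, labs])
predicted = {c for c, s in combined.items() if s >= 0.45}  # assumed threshold
principal_correct = max(combined, key=combined.get) == principal

print(f1(predicted, actual), average_precision(combined, actual), principal_correct)
```

Averaging these per-visit values over a test set would give dataset-level figures; whether the paper uses this sample-wise averaging or another scheme is not stated in the abstract.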
Related papers
- Large language models are good medical coders, if provided with tools [0.0]
This study presents a novel two-stage Retrieve-Rank system for automated ICD-10-CM medical coding.
Both systems are evaluated on a dataset of 100 single-term medical conditions.
The Retrieve-Rank system achieved 100% accuracy in predicting correct ICD-10-CM codes.
arXiv Detail & Related papers (2024-07-06T06:58:51Z)
- MedLens: Improve Mortality Prediction Via Medical Signs Selecting and Regression [4.43322868663347]
Data-quality problem of original clinical signs is less discussed in the literature.
We designed MEDLENS, which combines automatic, statistics-based selection of vital medical signs with a flexible approach for time series with high missing rates.
It achieves 0.96 AUC-ROC and 0.81 AUC-PR, exceeding the previous benchmark.
arXiv Detail & Related papers (2023-05-19T15:28:02Z)
- Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study [60.56194508762205]
We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
arXiv Detail & Related papers (2023-04-21T11:54:44Z)
- Time-dependent Iterative Imputation for Multivariate Longitudinal Clinical Data [0.0]
Time-Dependent Iterative imputation offers a practical solution for imputing time-series data.
When applied to a cohort consisting of more than 500,000 patient observations, our approach outperformed state-of-the-art imputation methods.
arXiv Detail & Related papers (2023-04-16T16:10:49Z)
- Clinical Deterioration Prediction in Brazilian Hospitals Based on Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD).
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z)
- TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of Clinical Trials [0.0]
We introduce a curated clinical trial data set compiled from the CT.gov, AACT and TrialTrove databases (n=1191 trials; representing one million patients).
We then detail the mathematical basis and implementation of a selection of graph machine learning algorithms.
We trained these models to predict side effect information for a clinical trial given information on the disease, existing medical conditions, and treatment.
arXiv Detail & Related papers (2021-12-15T15:36:57Z)
- Collaborative residual learners for automatic icd10 prediction using prescribed medications [45.82374977939355]
We propose a novel model based on collaborative residual learning that automatically predicts ICD10 codes from prescription data alone.
We obtain multi-label classification results of 0.71 and 0.57 for average precision, 0.57 and 0.38 for F1-score, and 0.73 and 0.44 for accuracy in predicting the principal diagnosis, for inpatient and outpatient datasets respectively.
arXiv Detail & Related papers (2020-12-16T07:07:27Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms the best of various state-of-the-art baselines by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.