Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review
and Replicability Study
- URL: http://arxiv.org/abs/2304.10909v1
- Date: Fri, 21 Apr 2023 11:54:44 GMT
- Title: Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review
and Replicability Study
- Authors: Joakim Edin, Alexander Junge, Jakob D. Havtorn, Lasse Borgholt, Maria
Maistro, Tuukka Ruotsalo, Lars Maaløe
- Abstract summary: We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
- Score: 60.56194508762205
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Medical coding is the task of assigning medical codes to clinical free-text
documentation. Healthcare professionals manually assign such codes to track
patient diagnoses and treatments. Automated medical coding can considerably
alleviate this administrative burden. In this paper, we reproduce, compare, and
analyze state-of-the-art automated medical coding machine learning models. We
show that several models underperform due to weak configurations, poorly
sampled train-test splits, and insufficient evaluation. In previous work, the
macro F1 score has been calculated sub-optimally, and our correction doubles
it. We contribute a revised model comparison using stratified sampling and
identical experimental setups, including hyperparameters and decision boundary
tuning. We analyze prediction errors to validate and falsify assumptions of
previous works. The analysis confirms that all models struggle with rare codes,
while long documents only have a negligible impact. Finally, we present the
first comprehensive results on the newly released MIMIC-IV dataset using the
reproduced models. We release our code, model parameters, and new MIMIC-III and
MIMIC-IV training and evaluation pipelines to accommodate fair future
comparisons.
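The abstract mentions decision boundary tuning and a corrected macro F1 but does not describe either procedure. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes a single global threshold chosen to maximize micro F1, and it shows one way macro F1 can shift, namely whether codes that never occur in the evaluation split are averaged in. The function names and toy data are hypothetical.

```python
import numpy as np


def micro_f1(y_true, y_pred):
    """Micro F1 over all (document, code) pairs of two binary matrices."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0


def per_code_f1(y_true, y_pred):
    """F1 for each code (column); codes with no positives at all score 0."""
    tp = np.sum((y_true == 1) & (y_pred == 1), axis=0)
    fp = np.sum((y_true == 0) & (y_pred == 1), axis=0)
    fn = np.sum((y_true == 1) & (y_pred == 0), axis=0)
    denom = 2 * tp + fp + fn
    return np.divide(2 * tp, denom, out=np.zeros(len(denom)), where=denom > 0)


def macro_f1(y_true, y_pred, only_codes_in_eval=False):
    """Unweighted mean of per-code F1, optionally skipping codes that
    have no positive example in y_true."""
    f1 = per_code_f1(y_true, y_pred)
    if only_codes_in_eval:
        f1 = f1[y_true.sum(axis=0) > 0]
    return float(f1.mean())


def tune_threshold(y_true, probs, candidates):
    """Return the candidate threshold with the highest micro F1."""
    scores = [micro_f1(y_true, (probs >= t).astype(int)) for t in candidates]
    return candidates[int(np.argmax(scores))]


# Toy data: 4 documents, 4 codes. Codes 2 and 3 never occur in y_true.
y_true = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 0, 0]])
probs = np.array([[0.40, 0.10, 0.05, 0.02],
                  [0.20, 0.35, 0.10, 0.01],
                  [0.45, 0.32, 0.02, 0.03],
                  [0.10, 0.15, 0.05, 0.02]])

# For brevity the threshold is tuned on the same toy array; in practice
# it would be tuned on a held-out validation split.
best_threshold = tune_threshold(y_true, probs, [0.1, 0.2, 0.3, 0.4, 0.5])
y_pred = (probs >= best_threshold).astype(int)

macro_all = macro_f1(y_true, y_pred)                               # 0.5
macro_present = macro_f1(y_true, y_pred, only_codes_in_eval=True)  # 1.0
print(best_threshold, macro_all, macro_present)
```

In this toy case the default threshold of 0.5 would predict no codes at all, and averaging over the two absent codes halves the macro F1. The numbers are artifacts of the toy data and say nothing about the magnitudes reported in the paper.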
Related papers
- The Relevance Feature and Vector Machine for health applications [0.11538034264098687]
This paper presents a novel model that addresses the challenges of the fat-data problem when dealing with clinical prospective studies.
The model capabilities are tested against state-of-the-art models in several medical datasets with fat-data problems.
arXiv Detail & Related papers (2024-02-11T01:21:56Z)
- Density-Aware Personalized Training for Risk Prediction in Imbalanced
Medical Data [89.79617468457393]
Training models on data with a high imbalance rate (class density discrepancy) may lead to suboptimal prediction.
We propose a framework for training models that addresses this imbalance issue.
We demonstrate our model's improved performance in real-world medical datasets.
arXiv Detail & Related papers (2022-07-23T00:39:53Z)
- Benchmarking AutoML Frameworks for Disease Prediction Using Medical
Claims [7.219529711278771]
We generated a large dataset using historical administrative claims including demographic information and flags for disease codes.
We trained three AutoML tools on this dataset to predict six different disease outcomes in 2019 and evaluated model performances on several metrics.
All models recorded low area under the precision-recall curve and failed to predict true positives while keeping the true negative rate high.
arXiv Detail & Related papers (2021-07-22T07:34:48Z)
- Does the Magic of BERT Apply to Medical Code Assignment? A Quantitative
Study [2.871614744079523]
It is not clear if pretrained models are useful for medical code prediction without further architecture engineering.
We propose a hierarchical fine-tuning architecture to capture interactions between distant words and adopt label-wise attention to exploit label information.
Contrary to current trends, we demonstrate that a carefully trained classical CNN outperforms attention-based models on a MIMIC-III subset with frequent codes.
arXiv Detail & Related papers (2021-03-11T07:23:45Z)
- EventScore: An Automated Real-time Early Warning Score for Clinical
Events [3.3039612529376625]
We build an interpretable model for the early prediction of various adverse clinical events indicative of clinical deterioration.
The model is evaluated on two datasets and four clinical events.
Our model can be entirely automated without requiring any manually recorded features.
arXiv Detail & Related papers (2021-02-11T11:55:08Z)
- An Explainable CNN Approach for Medical Codes Prediction from Clinical
Text [1.7746314978241657]
We develop CNN-based methods for automatic ICD coding based on clinical text from intensive care unit (ICU) stays.
We propose the Shallow and Wide Attention convolutional Mechanism (SWAM), which allows our model to learn local and low-level features for each label.
arXiv Detail & Related papers (2021-01-14T02:05:34Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical
Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model that can extract common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning method for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records
Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest
X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)