Read, Attend, and Code: Pushing the Limits of Medical Codes Prediction
from Clinical Notes by Machines
- URL: http://arxiv.org/abs/2107.10650v1
- Date: Sat, 10 Jul 2021 06:01:58 GMT
- Title: Read, Attend, and Code: Pushing the Limits of Medical Codes Prediction
from Clinical Notes by Machines
- Authors: Byung-Hak Kim and Varun Ganapathi
- Abstract summary: We present our Read, Attend, and Code (RAC) model for learning the medical code assignment mappings.
RAC establishes a new state of the art (SOTA) considerably outperforming the current best Macro-F1 by 18.7%.
This new milestone marks a meaningful step toward fully autonomous medical coding (AMC) in machines.
- Score: 0.42641920138420947
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prediction of medical codes from clinical notes is both a practical and
essential need for every healthcare delivery organization within current
medical systems. Automating annotation will save significant time and excessive
effort spent by human coders today. However, the biggest challenge is directly
identifying appropriate medical codes out of several thousands of
high-dimensional codes from unstructured free-text clinical notes. In the past
three years, with Convolutional Neural Networks (CNN) and Long Short-Term
Memory (LSTM) networks, there have been vast improvements in tackling the most
challenging benchmark of the MIMIC-III-full-label inpatient clinical notes
dataset. This progress raises the fundamental question of how far automated
machine learning (ML) systems are from human coders' working performance. We
assessed the baseline of human coders' performance on the same subsampled
testing set. We also present our Read, Attend, and Code (RAC) model for
learning the medical code assignment mappings. By connecting convolved
embeddings with self-attention and code-title guided attention modules,
combined with sentence permutation-based data augmentations and stochastic
weight averaging training, RAC establishes a new state of the art (SOTA),
considerably outperforming the current best Macro-F1 by 18.7%, and reaches past
the human-level coding baseline. This new milestone marks a meaningful step
toward fully autonomous medical coding (AMC) in machines reaching parity with
human coders' performance in medical code prediction.
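The abstract composes convolved embeddings, self-attention, and code-title guided attention into a single multi-label classifier. Below is a minimal PyTorch sketch of that composition, not the authors' implementation: the class name, dimensions, and the randomly initialized per-code query vectors are assumptions made for illustration (RAC derives its attention queries from code-title text).

```python
import torch
import torch.nn as nn


class RACStyleCoder(nn.Module):
    """Convolved embeddings -> self-attention -> code-guided attention -> sigmoid."""

    def __init__(self, vocab_size, num_codes, emb_dim=128, conv_dim=256,
                 kernel=5, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # 1-D convolution yields the "convolved embeddings" of the abstract.
        self.conv = nn.Conv1d(emb_dim, conv_dim, kernel, padding=kernel // 2)
        # Self-attention contextualizes the convolved token sequence.
        self.self_attn = nn.MultiheadAttention(conv_dim, heads, batch_first=True)
        # One query vector per code; in RAC these are guided by code titles,
        # here they are simply learned parameters (an assumption of this sketch).
        self.code_queries = nn.Parameter(torch.randn(num_codes, conv_dim))
        self.out = nn.Linear(conv_dim, 1)

    def forward(self, tokens):                           # tokens: (batch, seq)
        h = self.embed(tokens).transpose(1, 2)           # (batch, emb, seq)
        h = torch.relu(self.conv(h)).transpose(1, 2)     # (batch, seq, conv)
        h, _ = self.self_attn(h, h, h)                   # (batch, seq, conv)
        q = self.code_queries.unsqueeze(0).expand(h.size(0), -1, -1)
        attn = torch.softmax(q @ h.transpose(1, 2), -1)  # (batch, codes, seq)
        per_code = attn @ h                              # (batch, codes, conv)
        return torch.sigmoid(self.out(per_code)).squeeze(-1)  # (batch, codes)


# Example: probs = RACStyleCoder(vocab_size=50000, num_codes=8000)(token_ids)
# where token_ids is a LongTensor of shape (batch, seq_len); sizes are placeholders.
```

The sentence permutation-based data augmentation and stochastic weight averaging mentioned in the abstract are training-time procedures and are omitted from this sketch.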
Related papers
- MedCodER: A Generative AI Assistant for Medical Coding [3.7153274758003967]
We introduce MedCodER, a Generative AI framework for automatic medical coding.
MedCodER achieves a micro-F1 score of 0.60 on International Classification of Diseases (ICD) code prediction.
We present a new dataset containing medical records annotated with disease diagnoses, ICD codes, and supporting evidence texts.
arXiv Detail & Related papers (2024-09-18T19:36:33Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study [60.56194508762205]
We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
arXiv Detail & Related papers (2023-04-21T11:54:44Z)
- Medical Codes Prediction from Clinical Notes: From Human Coders to Machines [0.21320960069210473]
Prediction of medical codes from clinical notes is a practical and essential need for every healthcare delivery organization.
The biggest challenge is directly identifying appropriate medical codes from several thousands of high-dimensional codes from unstructured free-text clinical notes.
Recent studies have shown the state-of-the-art code prediction results of full-fledged deep learning-based methods.
arXiv Detail & Related papers (2022-10-30T14:24:13Z)
- Can Current Explainability Help Provide References in Clinical Notes to Support Humans Annotate Medical Codes? [53.45585591262433]
We present an explainable Read, Attend, and Code (xRAC) framework and assess two approaches, attention score-based xRAC-ATTN and model-agnostic knowledge-distillation-based xRAC-KD.
We find that the supporting evidence text highlighted by xRAC-ATTN is of higher quality than xRAC-KD whereas xRAC-KD has potential advantages in production deployment scenarios.
arXiv Detail & Related papers (2022-10-28T04:06:07Z)
- The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation [81.72197368690031]
We present a new benchmarking suite designed specifically for medical sequential decision making.
The Medkit-Learn(ing) Environment is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.
arXiv Detail & Related papers (2021-06-08T10:38:09Z)
- Multitask Recalibrated Aggregation Network for Medical Code Prediction [19.330911490203317]
We propose a multitask recalibrated aggregation network to solve the challenges of encoding lengthy and noisy clinical documents.
In particular, multitask learning shares information across different coding schemes and captures the dependencies between different medical codes.
Experiments with a real-world MIMIC-III dataset show significantly improved predictive performance.
arXiv Detail & Related papers (2021-04-02T09:22:10Z)
- TransICD: Transformer Based Code-wise Attention Model for Explainable ICD Coding [5.273190477622007]
The International Classification of Diseases (ICD) coding procedure has been shown to be effective and crucial to the billing system in the medical sector.
Currently, ICD codes are assigned to clinical notes manually, which is prone to error.
In this project, we apply a transformer-based architecture to capture the interdependence among the tokens of a document and then use a code-wise attention mechanism to learn code-specific representations of the entire document (a minimal sketch of code-wise attention appears after this list).
arXiv Detail & Related papers (2021-03-28T05:34:32Z)
- A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto standard used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z)
- Medical Code Assignment with Gated Convolution and Note-Code Interaction [39.079615516043674]
We propose a novel method, gated convolutional neural networks with note-code interaction (GatedCNN-NCI), for automatic medical code assignment.
With a novel note-code interaction design and a graph message passing mechanism, we explicitly capture the underlying dependency between notes and codes.
Our proposed model outperforms state-of-the-art models in most cases, and our model size is on par with lightweight baselines.
arXiv Detail & Related papers (2020-10-14T11:37:24Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
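Several entries above (TransICD, GatedCNN-NCI, and the code-title guided attention in RAC itself) rely on code-wise, or label-wise, attention: every code learns its own attention distribution over the encoded tokens and pools them into a code-specific document vector. The following is a minimal sketch under the assumption of a generic encoder output H and dot-product scoring against per-code query vectors U; the function name and shapes are hypothetical, and the individual papers' exact formulations may differ.

```python
import torch


def code_wise_attention(H, U):
    """One code-specific document vector per medical code (hypothetical helper).

    H: (batch, seq_len, dim) encoder outputs, e.g. from a transformer
    U: (num_codes, dim)      learned per-code query vectors
    Returns: (batch, num_codes, dim) code-specific document representations.
    """
    # Attention weight of every code over every token position.
    alpha = torch.softmax(H @ U.T, dim=1)   # (batch, seq_len, num_codes)
    # Weighted sum of token states pools the note into one vector per code.
    return alpha.transpose(1, 2) @ H        # (batch, num_codes, dim)
```

A shared or per-code linear layer with a sigmoid on each code vector then produces the multi-label probabilities, and the attention weights themselves provide the kind of token-level evidence examined by the xRAC-ATTN entry above.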
This list is automatically generated from the titles and abstracts of the papers on this site.