A Comparative Study on Automatic Coding of Medical Letters with Explainability
- URL: http://arxiv.org/abs/2407.13638v1
- Date: Thu, 18 Jul 2024 16:12:47 GMT
- Title: A Comparative Study on Automatic Coding of Medical Letters with Explainability
- Authors: Jamie Glen, Lifeng Han, Paul Rayson, Goran Nenadic
- Abstract summary: This study aims to explore the implementation of Natural Language Processing (NLP) and machine learning (ML) techniques to automate the coding of medical letters.
We used the publicly available MIMIC-III database and the HAN/HLAN network models for ICD code prediction purposes.
In our experiments, the models provided useful information for 97.98% of codes.
- Score: 7.834930446531957
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This study aims to explore the implementation of Natural Language Processing (NLP) and machine learning (ML) techniques to automate the coding of medical letters with visualised explainability and lightweight local computer settings. Currently in clinical settings, coding is a manual process that involves assigning codes to each condition, procedure, and medication in a patient's paperwork (e.g., 56265001 for heart disease in SNOMED CT). There is preliminary research on automatic coding in this field using state-of-the-art ML models; however, due to the complexity and size of the models, real-world deployment has not been achieved. To further facilitate the possibility of automatic coding practice, we explore some solutions in a local computer setting; in addition, we explore the function of explainability for transparency of AI models. We used the publicly available MIMIC-III database and the HAN/HLAN network models for ICD code prediction purposes. We also experimented with the mapping between ICD and SNOMED CT knowledge bases. In our experiments, the models provided useful information for 97.98% of codes. The result of this investigation can shed some light on implementing automatic clinical coding in practice, such as in hospital settings, on the local computers used by clinicians. Project page: https://github.com/Glenj01/Medical-Coding
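The ICD-to-SNOMED CT mapping step mentioned in the abstract can be pictured as a table lookup over predicted codes. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the table `ICD9_TO_SNOMED`, the helper names, and the pairing of ICD-9 code 429.9 with SNOMED CT 56265001 are illustrative, and a real system would load a published mapping file instead.

```python
from typing import Dict, List, Tuple

# Hypothetical toy mapping table (assumption). 56265001 (heart disease) is the
# SNOMED CT example from the abstract; the ICD-9 key is illustrative only.
ICD9_TO_SNOMED: Dict[str, List[str]] = {
    "429.9": ["56265001"],
}


def map_predictions(
    icd_codes: List[str], table: Dict[str, List[str]]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split predicted ICD codes into mapped codes and codes with no entry."""
    mapped: Dict[str, List[str]] = {}
    unmapped: List[str] = []
    for code in icd_codes:
        targets = table.get(code)
        if targets:
            # One ICD code may map to several SNOMED CT concepts.
            mapped[code] = list(targets)
        else:
            unmapped.append(code)
    return mapped, unmapped


def mapped_fraction(icd_codes: List[str], table: Dict[str, List[str]]) -> float:
    """Fraction of distinct predicted codes that have at least one mapping."""
    distinct = set(icd_codes)
    if not distinct:
        return 0.0
    mapped, _ = map_predictions(sorted(distinct), table)
    return len(mapped) / len(distinct)
```

Keeping unmapped codes in a separate list, rather than dropping them, lets a human coder review exactly the predictions for which the lookup found nothing.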
Related papers
- Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z)
- CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning [56.782963838838036]
We propose a novel approach, a contextualized and flexible framework, to enhance the learning of ICD code representations.
Our approach employs a dependent learning paradigm that considers the context of clinical notes in modeling all possible code relations.
arXiv Detail & Related papers (2024-02-24T03:25:28Z)
- Automatic Coding at Scale: Design and Deployment of a Nationwide System for Normalizing Referrals in the Chilean Public Healthcare System [0.0]
We propose a two-step system for automatically coding diseases in referrals from the Chilean public healthcare system.
Specifically, our model uses a state-of-the-art NER model for recognizing disease mentions and a search engine system for assigning the most relevant codes associated with these disease mentions.
Our system obtained a MAP score of 0.63 for the subcategory level and 0.83 for the category level, close to the best-performing models in the literature.
arXiv Detail & Related papers (2023-07-09T16:19:35Z)
- PyTrial: Machine Learning Software and Benchmark for Clinical Trial Applications [49.69824178329405]
PyTrial provides benchmarks and open-source implementations of a series of machine learning algorithms for clinical trial design and operations.
We thoroughly investigate 34 ML algorithms for clinical trials across 6 different tasks, including patient outcome prediction, trial site selection, trial outcome prediction, patient-trial matching, trial similarity search, and synthetic data generation.
PyTrial defines each task through a simple four-step process: data loading, model specification, model training, and model evaluation, all achievable with just a few lines of code.
arXiv Detail & Related papers (2023-06-06T21:19:03Z)
- Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study [60.56194508762205]
We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
arXiv Detail & Related papers (2023-04-21T11:54:44Z)
- HiCu: Leveraging Hierarchy for Curriculum Learning in Automated ICD Coding [2.274915755738124]
We create curricula for multi-label classification models that predict ICD diagnosis and procedure codes from natural language descriptions.
Our proposed curricula improve the generalization of neural network-based predictive models across recurrent, convolutional, and transformer-based architectures.
arXiv Detail & Related papers (2022-08-03T18:39:27Z)
- GrabQC: Graph based Query Contextualization for automated ICD coding [16.096824533334352]
We propose GrabQC, a Graph-based Query Contextualization method that automatically extracts queries from the clinical text.
We perform experiments on two datasets of clinical text in three different setups to assert the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-14T10:27:25Z)
- Read, Attend, and Code: Pushing the Limits of Medical Codes Prediction from Clinical Notes by Machines [0.42641920138420947]
We present our Read, Attend, and Code (RAC) model for learning the medical code assignment mappings.
RAC establishes a new state of the art (SOTA), considerably outperforming the current best Macro-F1 by 18.7%.
This new milestone marks a meaningful step toward fully autonomous medical coding (AMC) in machines.
arXiv Detail & Related papers (2021-07-10T06:01:58Z)
- Active learning for medical code assignment [55.99831806138029]
We demonstrate the effectiveness of Active Learning (AL) in multi-label text classification in the clinical domain.
We apply a set of well-known AL methods to help automatically assign ICD-9 codes on the MIMIC-III dataset.
Our results show that the selection of informative instances provides satisfactory classification with a significantly reduced training set.
arXiv Detail & Related papers (2021-04-12T18:11:17Z)
- TransICD: Transformer Based Code-wise Attention Model for Explainable ICD Coding [5.273190477622007]
The International Classification of Diseases (ICD) coding procedure has been shown to be effective and crucial to the billing system in the medical sector.
Currently, ICD codes are assigned to a clinical note manually, which is likely to cause many errors.
In this project, we apply a transformer-based architecture to capture the interdependence among the tokens of a document and then use a code-wise attention mechanism to learn code-specific representations of the entire document.
arXiv Detail & Related papers (2021-03-28T05:34:32Z)
- A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto codes used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.