Improving ICD coding using Chapter based Named Entities and Attentional Models
- URL: http://arxiv.org/abs/2407.17230v1
- Date: Wed, 24 Jul 2024 12:34:23 GMT
- Title: Improving ICD coding using Chapter based Named Entities and Attentional Models
- Authors: Abhijith R. Beeravolu, Mirjam Jonkman, Sami Azam, Friso De Boer,
- Abstract summary: We introduce an enhanced approach to ICD coding that improves F1 scores by using chapter-based named entities and attentional models.
This method categorizes discharge summaries into ICD-9 Chapters and develops attentional models with chapter-specific data.
For categorization, we use Chapter-IV to de-bias and influence key entities and weights without neural networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in natural language processing (NLP) have led to automation in various domains. However, clinical NLP often relies on benchmark datasets that may not reflect real-world scenarios accurately. Automatic ICD coding, a vital NLP task, typically uses outdated and imbalanced datasets like MIMIC-III, with existing methods yielding micro-averaged F1 scores between 0.4 and 0.7 due to many false positives. Our research introduces an enhanced approach to ICD coding that improves F1 scores by using chapter-based named entities and attentional models. This method categorizes discharge summaries into ICD-9 Chapters and develops attentional models with chapter-specific data, eliminating the need to consider external data for code identification. For categorization, we use Chapter-IV to de-bias and influence key entities and weights without neural networks, creating accurate thresholds and providing interpretability for human validation. Post-validation, we develop attentional models for three frequent and three non-frequent codes from Chapter-IV using Bidirectional-Gated Recurrent Units (GRUs) with Attention and Transformer with Multi-head Attention architectures. The average Micro-F1 scores of 0.79 and 0.81 from these models demonstrate significant performance improvements in ICD coding.
Related papers
- Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models [84.8919069953397]
Self-TAught Recognizer (STAR) is an unsupervised adaptation framework for speech recognition systems.
We show that STAR achieves an average of 13.5% relative reduction in word error rate across 14 target domains.
STAR exhibits high data efficiency that only requires less than one-hour unlabeled data.
arXiv Detail & Related papers (2024-05-23T04:27:11Z) - In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in context language learning (ICLL)
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z) - Predicting Infant Brain Connectivity with Federated Multi-Trajectory
GNNs using Scarce Data [54.55126643084341]
Existing deep learning solutions suffer from three major limitations.
We introduce FedGmTE-Net++, a federated graph-based multi-trajectory evolution network.
Using the power of federation, we aggregate local learnings among diverse hospitals with limited datasets.
arXiv Detail & Related papers (2024-01-01T10:20:01Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of challenging problem in healthcare.
Within this framework, we train predictive 15 models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt [7.554528566861559]
This study transforms this multi-label classification task into an autoregressive generation task.
Instead of directly predicting the high dimensional space of ICD codes, our model generates the lower dimension of text descriptions.
Experiments on MIMIC-III-few show that our model performs with a marco F1 30.2, which substantially outperforms the previous MIMIC-III-full SOTA model.
arXiv Detail & Related papers (2022-11-24T22:10:50Z) - Hierarchical Label-wise Attention Transformer Model for Explainable ICD
Coding [10.387366211090734]
We propose a hierarchical label-wise attention Transformer model (HiLAT) for the explainable prediction of ICD codes from clinical documents.
We evaluate HiLAT using hospital discharge summaries and their corresponding ICD-9 codes from the MIMIC-III database.
Visualisations of attention weights present a potential explainability tool for checking the face validity of ICD code predictions.
arXiv Detail & Related papers (2022-04-22T14:12:22Z) - Secondary Use of Clinical Problem List Entries for Neural Network-Based
Disease Code Assignment [1.3190581566723918]
We explore automated coding of 50 character long clinical problem list entries using the International Classification of Diseases (ICD-10)
A fastText baseline reached a macro-averaged F1-score of 0.83, followed by a character-level LSTM with a macro-averaged F1-score of 0.84.
A neural network activation analysis together with an investigation of the false positives and false negatives unveiled inconsistent manual coding as a main limiting factor.
arXiv Detail & Related papers (2021-12-27T16:11:05Z) - TransICD: Transformer Based Code-wise Attention Model for Explainable
ICD Coding [5.273190477622007]
International Classification of Disease (ICD) coding procedure has been shown to be effective and crucial to the billing system in medical sector.
Currently, ICD codes are assigned to a clinical note manually which is likely to cause many errors.
In this project, we apply a transformer-based architecture to capture the interdependence among the tokens of a document and then use a code-wise attention mechanism to learn code-specific representations of the entire document.
arXiv Detail & Related papers (2021-03-28T05:34:32Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Fader Networks for domain adaptation on fMRI: ABIDE-II study [68.5481471934606]
We use 3D convolutional autoencoders to build the domain irrelevant latent space image representation and demonstrate this method to outperform existing approaches on ABIDE data.
arXiv Detail & Related papers (2020-10-14T16:50:50Z) - Predicting Multiple ICD-10 Codes from Brazilian-Portuguese Clinical
Notes [4.971638713979981]
We develop and optimize a Logistic Regression model, a Convolutional Neural Network (CNN), a Gated Recurrent Unit Neural Network and a CNN with Attention for prediction of diagnosis ICD codes.
Compared to MIMIC-III, the Brazilian Portuguese dataset contains far fewer words per document, when only discharge summaries are used.
The CNN-Att model achieves the best results on both datasets, with micro-averaged F1 score of 0.537 on MIMIC-III and 0.485 on our dataset with additional documents.
arXiv Detail & Related papers (2020-07-29T22:12:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.