Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt
- URL: http://arxiv.org/abs/2211.13813v1
- Date: Thu, 24 Nov 2022 22:10:50 GMT
- Title: Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt
- Authors: Zhichao Yang, Sunjae Kwon, Zonghai Yao, Hong Yu
- Abstract summary: This study transforms this multi-label classification task into an autoregressive generation task.
Instead of directly predicting in the high-dimensional space of ICD codes, our model generates lower-dimensional text descriptions.
Experiments on MIMIC-III-few show that our model achieves a macro F1 of 30.2, substantially outperforming the previous MIMIC-III-full SOTA model.
- Score: 7.554528566861559
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Automatic International Classification of Diseases (ICD) coding aims to
assign multiple ICD codes to a medical note with an average of 3,000+ tokens.
This task is challenging due to the high-dimensional multi-label output space
(155,000+ candidate ICD codes) and the long-tail challenge: many ICD codes are
infrequently assigned, yet these infrequent codes are clinically important.
This study addresses the long-tail challenge by transforming this
multi-label classification task into an autoregressive generation task.
Specifically, we first introduce a novel pretraining objective that generates
free-text diagnoses and procedures following the SOAP structure, the medical
logic physicians use for note documentation. Second, instead of directly
predicting in the high-dimensional space of ICD codes, our model generates
lower-dimensional text descriptions, from which ICD codes are then inferred.
Third, we design
a novel prompt template for multi-label classification. We evaluate our
Generation with Prompt model on the full code assignment benchmark
(MIMIC-III-full) and the few-shot ICD code assignment benchmark
(MIMIC-III-few). Experiments on MIMIC-III-few show that our model achieves a
macro F1 of 30.2, substantially outperforming the previous MIMIC-III-full SOTA
model (macro F1 4.3) and the model specifically designed for the few/zero-shot
setting (macro F1 18.7). Finally, we design a novel ensemble learner, a cross
attention reranker with prompts, to integrate previous SOTA and our best
few-shot coding predictions. Experiments on MIMIC-III-full show that our
ensemble learner substantially improves both macro and micro F1, from 10.4 to
14.6 and from 58.2 to 59.1, respectively.
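The abstract's core idea, generating free-text code descriptions autoregressively and then inferring ICD codes from them, can be sketched minimally. The prompt wording, the example descriptions, and the tiny description-to-code table below are illustrative assumptions, not the paper's actual artifacts:

```python
# Minimal sketch of "generate descriptions, then infer codes".
# The lookup table and prompt are toy stand-ins for the paper's setup.
ICD_DESCRIPTIONS = {
    "unspecified essential hypertension": "401.9",
    "congestive heart failure, unspecified": "428.0",
    "acute kidney failure, unspecified": "584.9",
}

def build_prompt(note: str) -> str:
    """Frame multi-label coding as autoregressive generation with a prompt."""
    return f"{note}\nDiagnoses and procedures in this note include:"

def infer_codes(generated: list[str]) -> list[str]:
    """Map each generated free-text description to an ICD code.

    Exact lookup here; a real system would use a nearest-neighbour match
    over code descriptions so paraphrases still resolve to a code.
    """
    return [ICD_DESCRIPTIONS[d] for d in generated if d in ICD_DESCRIPTIONS]

# A (hypothetical) model would continue the prompt with description strings;
# we stub its output here.
generated = [
    "congestive heart failure, unspecified",
    "unspecified essential hypertension",
]
print(infer_codes(generated))  # ['428.0', '401.9']
```

Generating descriptions rather than codes is what helps the long tail: rare codes share vocabulary with frequent ones at the text level, even when their code labels were rarely or never seen in training.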
Related papers
- CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning [56.782963838838036]
We propose a novel approach, a contextualized and flexible framework, to enhance the learning of ICD code representations.
Our approach employs a dependent learning paradigm that considers the context of clinical notes in modeling all possible code relations.
arXiv Detail & Related papers (2024-02-24T03:25:28Z)
- Can GPT-3.5 Generate and Code Discharge Summaries? [45.633849969788315]
We generated and coded 9,606 discharge summaries based on lists of ICD-10 code descriptions.
Neural coding models were trained on baseline and augmented data.
We report micro- and macro-F1 scores on the full codeset, generation codes, and their families.
arXiv Detail & Related papers (2024-01-24T15:10:13Z)
- Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study [60.56194508762205]
We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
arXiv Detail & Related papers (2023-04-21T11:54:44Z)
- Knowledge Injected Prompt Based Fine-tuning for Multi-label Few-shot ICD Coding [7.8183215844641]
This study addresses the long-tail challenge by adapting a prompt-based fine-tuning technique with label semantics.
Experiments on MIMIC-III-full, a benchmark dataset for code assignment, show that our proposed method outperforms the previous state-of-the-art method by 14.5% in macro F1.
Our model improves macro F1 from 17.1 to 30.4 and micro F1 from 17.2 to 32.6 compared to the previous method.
arXiv Detail & Related papers (2022-10-07T03:25:58Z)
- Hierarchical Label-wise Attention Transformer Model for Explainable ICD Coding [10.387366211090734]
We propose a hierarchical label-wise attention Transformer model (HiLAT) for the explainable prediction of ICD codes from clinical documents.
We evaluate HiLAT using hospital discharge summaries and their corresponding ICD-9 codes from the MIMIC-III database.
Visualisations of attention weights present a potential explainability tool for checking the face validity of ICD code predictions.
arXiv Detail & Related papers (2022-04-22T14:12:22Z)
- ICDBigBird: A Contextual Embedding Model for ICD Code Classification [71.58299917476195]
Contextual word embedding models have achieved state-of-the-art results in multiple NLP tasks.
ICDBigBird is a BigBird-based model that can integrate a Graph Convolutional Network (GCN).
Our experiments on a real-world clinical dataset demonstrate the effectiveness of our BigBird-based model on the ICD classification task.
arXiv Detail & Related papers (2022-04-21T20:59:56Z)
- CoPHE: A Count-Preserving Hierarchical Evaluation Metric in Large-Scale Multi-Label Text Classification [70.554573538777]
We argue for hierarchical evaluation of the predictions of neural LMTC models.
We describe a structural issue in the representation of the structured label space in prior art.
We propose a set of metrics for hierarchical evaluation using the depth-based representation.
arXiv Detail & Related papers (2021-09-10T13:09:12Z)
- Medical Code Prediction from Discharge Summary: Document to Sequence BERT using Sequence Attention [0.0]
We propose a model based on bidirectional encoder representations from transformer (BERT) using the sequence attention method for automatic ICD code assignment.
We evaluate our approach on the MIMIC-III benchmark dataset.
arXiv Detail & Related papers (2021-06-15T07:35:50Z)
- From Extreme Multi-label to Multi-class: A Hierarchical Approach for Automated ICD-10 Coding Using Phrase-level Attention [4.387302129801651]
Clinical coding is the task of assigning a set of alphanumeric codes, referred to as ICD (International Classification of Diseases), to a medical event based on the context captured in a clinical narrative.
We propose a novel approach for automatic ICD coding by reformulating the extreme multi-label problem into a simpler multi-class problem using a hierarchical solution.
arXiv Detail & Related papers (2021-02-18T03:19:14Z)
- A Label Attention Model for ICD Coding from Clinical Text [14.910833190248319]
We propose a new label attention model for automatic ICD coding.
It can handle both the various lengths and the interdependence of the ICD code related text fragments.
Our model achieves new state-of-the-art results on three benchmark MIMIC datasets.
arXiv Detail & Related papers (2020-07-13T12:42:43Z)
- Students Need More Attention: BERT-based Attention Model for Small Data with Application to Automatic Patient Message Triage [65.7062363323781]
We propose a novel framework based on BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining).
(i) We introduce Label Embeddings for Self-Attention in each layer of BERT, which we call LESA-BERT, and (ii) by distilling LESA-BERT into smaller variants, we aim to reduce overfitting and model size when working on small datasets.
As an application, our framework is utilized to build a model for patient portal message triage that classifies the urgency of a message into three categories: non-urgent, medium and urgent.
arXiv Detail & Related papers (2020-06-22T03:39:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.