A Two-Stage Decoder for Efficient ICD Coding
- URL: http://arxiv.org/abs/2306.00005v1
- Date: Sat, 27 May 2023 17:25:13 GMT
- Title: A Two-Stage Decoder for Efficient ICD Coding
- Authors: Thanh-Tung Nguyen, Viktor Schlegel, Abhinav Kashyap, Stefan Winkler
- Abstract summary: We propose a two-stage decoding mechanism to predict ICD codes.
First, we predict the parent code; then we predict the child code based on that prediction.
Experiments on the public MIMIC-III data set show that our model performs well in single-model settings.
- Score: 10.634394331433322
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clinical notes in healthcare facilities are tagged with International
Classification of Diseases (ICD) codes, a list of classification codes for
medical diagnoses and procedures. ICD coding is a challenging multilabel text
classification problem due to noisy clinical document inputs and long-tailed
label distribution. Recent automated ICD coding efforts improve performance by
encoding medical notes and codes with additional data and knowledge bases.
However, most of them do not reflect how human coders generate the code: first,
the coders select general code categories and then look for specific
subcategories that are relevant to a patient's condition. Inspired by this, we
propose a two-stage decoding mechanism to predict ICD codes. Our model uses the
hierarchical properties of the codes to split the prediction into two steps:
first, we predict the parent code, and then we predict the child code based on
that prediction. Experiments on the public MIMIC-III data set show that our
model performs well in single-model settings without external data or
knowledge.
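The two-step prediction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the child prediction is conditioned on the parent prediction, so the sketch assumes a simple gating rule (a child code is considered only if its parent was selected). The function names, thresholds, example scores, and the "text before the dot" parent rule are all hypothetical.

```python
from typing import Dict, List


def parent_of(code: str) -> str:
    """Assumed hierarchy rule: the parent is the part before the dot (ICD-9 style)."""
    return code.split(".")[0]


def two_stage_decode(
    parent_scores: Dict[str, float],
    child_scores: Dict[str, float],
    parent_threshold: float = 0.5,
    child_threshold: float = 0.5,
) -> List[str]:
    """Gate child codes on their parent's prediction (an assumed conditioning scheme)."""
    # Stage 1: keep the parent categories whose score clears the threshold.
    selected_parents = {
        parent for parent, score in parent_scores.items() if score >= parent_threshold
    }
    # Stage 2: keep a child code only if its parent was selected in stage 1
    # and its own score clears the child threshold.
    predicted = [
        code
        for code, score in child_scores.items()
        if parent_of(code) in selected_parents and score >= child_threshold
    ]
    return sorted(predicted)


# Hypothetical scores standing in for the outputs of a trained model.
parent_scores = {"401": 0.9, "250": 0.2, "428": 0.7}
child_scores = {"401.9": 0.8, "401.1": 0.3, "250.00": 0.95, "428.0": 0.6}

predicted_codes = two_stage_decode(parent_scores, child_scores)
print(predicted_codes)  # ['401.9', '428.0']
```

In this sketch, code 250.00 is dropped despite its high child score because its parent 250 was not selected in the first stage, which is the narrowing effect a parent-then-child scheme is meant to provide.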
Related papers
- Prototypical Hash Encoding for On-the-Fly Fine-Grained Category Discovery [65.16724941038052]
Category-aware Prototype Generation (CPG) and Discriminative Category Encoding (DCE) are proposed.
CPG enables the model to fully capture the intra-category diversity by representing each category with multiple prototypes.
DCE boosts the discrimination ability of hash code with the guidance of the generated category prototypes.
arXiv Detail & Related papers (2024-10-24T23:51:40Z) - Auxiliary Knowledge-Induced Learning for Automatic Multi-Label Medical Document Classification [22.323705343864336]
We propose a novel approach for ICD indexing that adopts three ideas.
We use a multi-level deep dilated residual convolution encoder to aggregate the information from the clinical notes.
We formalize the task of ICD classification with auxiliary knowledge of the medical records.
arXiv Detail & Related papers (2024-05-29T13:44:07Z) - A Novel ICD Coding Method Based on Associated and Hierarchical Code Description Distillation [6.524062529847299]
ICD coding is a challenging multilabel text classification problem due to noisy medical document inputs.
Recent advancements in automated ICD coding have enhanced performance by integrating additional data and knowledge bases with the encoding of medical notes and codes.
We propose a novel framework based on associated and hierarchical code description distillation (AHDD) for better code representation learning and avoidance of improper code assignment.
arXiv Detail & Related papers (2024-04-17T07:26:23Z) - Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review
and Replicability Study [60.56194508762205]
We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
arXiv Detail & Related papers (2023-04-21T11:54:44Z) - HieNet: Bidirectional Hierarchy Framework for Automated ICD Coding [2.9373912230684573]
International Classification of Diseases (ICD) is a set of classification codes for medical records.
In this work, we propose a novel Bidirectional Hierarchy Framework (HieNet) to address the challenges.
Specifically, a personalized PageRank routine captures the co-relation of codes, a bidirectional hierarchy passage encoder captures the codes' hierarchical representations, and a progressive predicting method narrows down the semantic search space of prediction.
arXiv Detail & Related papers (2022-12-09T14:51:12Z) - CodeExp: Explanatory Code Document Generation [94.43677536210465]
Existing code-to-text generation models produce only high-level summaries of code.
We conduct a human study to identify the criteria for high-quality explanatory docstrings for code.
We present a multi-stage fine-tuning strategy and baseline models for the task.
arXiv Detail & Related papers (2022-11-25T18:05:44Z) - Can Current Explainability Help Provide References in Clinical Notes to Support Humans Annotate Medical Codes? [53.45585591262433]
We present an explainable Read, Attend, and Code (xRAC) framework and assess two approaches, attention score-based xRAC-ATTN and model-agnostic knowledge-distillation-based xRAC-KD.
We find that the supporting evidence text highlighted by xRAC-ATTN is of higher quality than that highlighted by xRAC-KD, whereas xRAC-KD has potential advantages in production deployment scenarios.
arXiv Detail & Related papers (2022-10-28T04:06:07Z) - Few-Shot Electronic Health Record Coding through Graph Contrastive Learning [64.8138823920883]
We seek to improve the performance for both frequent and rare ICD codes by using a contrastive graph-based EHR coding framework, CoGraph.
CoGraph learns similarities and dissimilarities between HEWE graphs from different ICD codes so that information can be transferred among them.
Two graph contrastive learning schemes, GSCL and GECL, exploit the HEWE graph structures so as to encode transferable features.
arXiv Detail & Related papers (2021-06-29T14:53:17Z) - TransICD: Transformer Based Code-wise Attention Model for Explainable ICD Coding [5.273190477622007]
The International Classification of Diseases (ICD) coding procedure has been shown to be effective and crucial to the billing system in the medical sector.
Currently, ICD codes are assigned to a clinical note manually, which is likely to cause many errors.
In this project, we apply a transformer-based architecture to capture the interdependence among the tokens of a document and then use a code-wise attention mechanism to learn code-specific representations of the entire document.
arXiv Detail & Related papers (2021-03-28T05:34:32Z) - A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto standard used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z) - From Extreme Multi-label to Multi-class: A Hierarchical Approach for Automated ICD-10 Coding Using Phrase-level Attention [4.387302129801651]
Clinical coding is the task of assigning a set of alphanumeric codes, referred to as ICD (International Classification of Diseases), to a medical event based on the context captured in a clinical narrative.
We propose a novel approach for automatic ICD coding by reformulating the extreme multi-label problem into a simpler multi-class problem using a hierarchical solution.
arXiv Detail & Related papers (2021-02-18T03:19:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.