LTR-ICD: A Learning-to-Rank Approach for Automatic ICD Coding
- URL: http://arxiv.org/abs/2510.13922v1
- Date: Wed, 15 Oct 2025 11:46:42 GMT
- Title: LTR-ICD: A Learning-to-Rank Approach for Automatic ICD Coding
- Authors: Mohammad Mansoori, Amira Soliman, Farzaneh Etminani,
- Abstract summary: Clinical notes contain unstructured text provided by clinicians during patient encounters.<n>These notes are usually accompanied by a sequence of diagnostic codes following the International Classification of Diseases (ICD)<n>State-of-the-art methods treated this problem as a classification task, leading to ignoring the order of ICD codes that is essential for different purposes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clinical notes contain unstructured text provided by clinicians during patient encounters. These notes are usually accompanied by a sequence of diagnostic codes following the International Classification of Diseases (ICD). Correctly assigning and ordering ICD codes are essential for medical diagnosis and reimbursement. However, automating this task remains challenging. State-of-the-art methods treated this problem as a classification task, leading to ignoring the order of ICD codes that is essential for different purposes. In this work, as a first attempt, we approach this task from a retrieval system perspective to consider the order of codes, thus formulating this problem as a classification and ranking task. Our results and analysis show that the proposed framework has a superior ability to identify high-priority codes compared to other methods. For instance, our model accuracy in correctly ranking primary diagnosis codes is 47%, compared to 20% for the state-of-the-art classifier. Additionally, in terms of classification metrics, the proposed model achieves a micro- and macro-F1 scores of 0.6065 and 0.2904, respectively, surpassing the previous best model with scores of 0.597 and 0.2660.
Related papers
- Pathological Prior-Guided Multiple Instance Learning For Mitigating Catastrophic Forgetting in Breast Cancer Whole Slide Image Classification [50.899861205016265]
We propose a new framework PaGMIL to mitigate catastrophic forgetting in breast cancer WSI classification.<n>Our framework introduces two key components into the common MIL model architecture.<n>We evaluate the continual learning performance of PaGMIL across several public breast cancer datasets.
arXiv Detail & Related papers (2025-03-08T04:51:58Z) - Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation [22.323705343864336]
International Classification of Diseases (ICD) serves as a definitive medical classification system.
The primary objective of ICD indexing is to allocate a subset of ICD codes to a medical record.
Most existing approaches have suffered from selecting the proper label subsets from an extremely large ICD collection.
arXiv Detail & Related papers (2024-05-29T13:54:30Z) - Auxiliary Knowledge-Induced Learning for Automatic Multi-Label Medical Document Classification [22.323705343864336]
We propose a novel approach for ICD indexing that adopts three ideas.
We use a multi-level deep dilated residual convolution encoder to aggregate the information from the clinical notes.
We formalize the task of ICD classification with auxiliary knowledge of the medical records.
arXiv Detail & Related papers (2024-05-29T13:44:07Z) - CoRelation: Boosting Automatic ICD Coding Through Contextualized Code
Relation Learning [56.782963838838036]
We propose a novel approach, a contextualized and flexible framework, to enhance the learning of ICD code representations.
Our approach employs a dependent learning paradigm that considers the context of clinical notes in modeling all possible code relations.
arXiv Detail & Related papers (2024-02-24T03:25:28Z) - Automated clinical coding using off-the-shelf large language models [10.365958121087305]
The task of assigning diagnostic ICD codes to patient hospital admissions is typically performed by expert human coders.
Efforts towards automated ICD coding are dominated by supervised deep learning models.
In this work, we leverage off-the-shelf pre-trained generative large language models to develop a practical solution.
arXiv Detail & Related papers (2023-10-10T11:56:48Z) - Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review
and Replicability Study [60.56194508762205]
We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
arXiv Detail & Related papers (2023-04-21T11:54:44Z) - ICDBigBird: A Contextual Embedding Model for ICD Code Classification [71.58299917476195]
Contextual word embedding models have achieved state-of-the-art results in multiple NLP tasks.
ICDBigBird is a BigBird-based model which can integrate a Graph Convolutional Network (GCN)
Our experiments on a real-world clinical dataset demonstrate the effectiveness of our BigBird-based model on the ICD classification task.
arXiv Detail & Related papers (2022-04-21T20:59:56Z) - Medical Code Prediction from Discharge Summary: Document to Sequence
BERT using Sequence Attention [0.0]
We propose a model based on bidirectional encoder representations from transformer (BERT) using the sequence attention method for automatic ICD code assignment.
We evaluate our ap-proach on the MIMIC-III benchmark dataset.
arXiv Detail & Related papers (2021-06-15T07:35:50Z) - TransICD: Transformer Based Code-wise Attention Model for Explainable
ICD Coding [5.273190477622007]
International Classification of Disease (ICD) coding procedure has been shown to be effective and crucial to the billing system in medical sector.
Currently, ICD codes are assigned to a clinical note manually which is likely to cause many errors.
In this project, we apply a transformer-based architecture to capture the interdependence among the tokens of a document and then use a code-wise attention mechanism to learn code-specific representations of the entire document.
arXiv Detail & Related papers (2021-03-28T05:34:32Z) - A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) are the de facto codes used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z) - From Extreme Multi-label to Multi-class: A Hierarchical Approach for
Automated ICD-10 Coding Using Phrase-level Attention [4.387302129801651]
Clinical coding is the task of assigning a set of alphanumeric codes, referred to as ICD (International Classification of Diseases), to a medical event based on the context captured in a clinical narrative.
We propose a novel approach for automatic ICD coding by reformulating the extreme multi-label problem into a simpler multi-class problem using a hierarchical solution.
arXiv Detail & Related papers (2021-02-18T03:19:14Z) - Inheritance-guided Hierarchical Assignment for Clinical Automatic
Diagnosis [50.15205065710629]
Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making.
We propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis.
arXiv Detail & Related papers (2021-01-27T13:16:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.