Auxiliary Knowledge-Induced Learning for Automatic Multi-Label Medical Document Classification
- URL: http://arxiv.org/abs/2405.19084v1
- Date: Wed, 29 May 2024 13:44:07 GMT
- Title: Auxiliary Knowledge-Induced Learning for Automatic Multi-Label Medical Document Classification
- Authors: Xindi Wang, Robert E. Mercer, Frank Rudzicz,
- Abstract summary: We propose a novel approach for ICD indexing that adopts three ideas.
We use a multi-level deep dilated residual convolution encoder to aggregate the information from the clinical notes.
We formalize the task of ICD classification with auxiliary knowledge of the medical records.
- Score: 22.323705343864336
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The International Classification of Diseases (ICD) is an authoritative medical classification system of different diseases and conditions for clinical and management purposes. ICD indexing assigns a subset of ICD codes to a medical record. Since human coding is labour-intensive and error-prone, many studies employ machine learning to automate the coding process. ICD coding is a challenging task, as it needs to assign multiple codes to each medical document from an extremely large hierarchically organized collection. In this paper, we propose a novel approach for ICD indexing that adopts three ideas: (1) we use a multi-level deep dilated residual convolution encoder to aggregate the information from the clinical notes and learn document representations across different lengths of the texts; (2) we formalize the task of ICD classification with auxiliary knowledge of the medical records, which incorporates not only the clinical texts but also different clinical code terminologies and drug prescriptions for better inferring the ICD codes; and (3) we introduce a graph convolutional network to leverage the co-occurrence patterns among ICD codes, aiming to enhance the quality of label representations. Experimental results show the proposed method achieves state-of-the-art performance on a number of measures.
Related papers
- Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation [22.323705343864336]
International Classification of Diseases (ICD) serves as a definitive medical classification system.
The primary objective of ICD indexing is to allocate a subset of ICD codes to a medical record.
Most existing approaches have suffered from selecting the proper label subsets from an extremely large ICD collection.
arXiv Detail & Related papers (2024-05-29T13:54:30Z) - A Novel ICD Coding Method Based on Associated and Hierarchical Code Description Distillation [6.524062529847299]
ICD coding is a challenging multilabel text classification problem due to noisy medical document inputs.
Recent advancements in automated ICD coding have enhanced performance by integrating additional data and knowledge bases with the encoding of medical notes and codes.
We propose a novel framework based on associated and hierarchical code description distillation (AHDD) for better code representation learning and avoidance of improper code assignment.
arXiv Detail & Related papers (2024-04-17T07:26:23Z) - CoRelation: Boosting Automatic ICD Coding Through Contextualized Code
Relation Learning [56.782963838838036]
We propose a novel approach, a contextualized and flexible framework, to enhance the learning of ICD code representations.
Our approach employs a dependent learning paradigm that considers the context of clinical notes in modeling all possible code relations.
arXiv Detail & Related papers (2024-02-24T03:25:28Z) - A Two-Stage Decoder for Efficient ICD Coding [10.634394331433322]
We propose a two-stage decoding mechanism to predict ICD codes.
At first, we predict the parent code and then predict the child code based on the previous prediction.
Experiments on the public MIMIC-III data set show that our model performs well in single-model settings.
arXiv Detail & Related papers (2023-05-27T17:25:13Z) - An Automatic ICD Coding Network Using Partition-Based Label Attention [2.371982686172067]
We propose a novel neural network architecture composed of two parts of encoders and two kinds of label attention layers.
The input text is segmentally encoded in the former encoder and integrated by the follower.
Our results show that our network improves the ICD coding performance based on the partition-based mechanism.
arXiv Detail & Related papers (2022-11-15T07:11:01Z) - ICDBigBird: A Contextual Embedding Model for ICD Code Classification [71.58299917476195]
Contextual word embedding models have achieved state-of-the-art results in multiple NLP tasks.
ICDBigBird is a BigBird-based model which can integrate a Graph Convolutional Network (GCN)
Our experiments on a real-world clinical dataset demonstrate the effectiveness of our BigBird-based model on the ICD classification task.
arXiv Detail & Related papers (2022-04-21T20:59:56Z) - Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z) - A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) are the de facto codes used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z) - Inheritance-guided Hierarchical Assignment for Clinical Automatic
Diagnosis [50.15205065710629]
Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making.
We propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis.
arXiv Detail & Related papers (2021-01-27T13:16:51Z) - A Label Attention Model for ICD Coding from Clinical Text [14.910833190248319]
We propose a new label attention model for automatic ICD coding.
It can handle both the various lengths and the interdependence of the ICD code related text fragments.
Our model achieves new state-of-the-art results on three benchmark MIMIC datasets.
arXiv Detail & Related papers (2020-07-13T12:42:43Z) - Learning Contextualized Document Representations for Healthcare Answer
Retrieval [68.02029435111193]
Contextual Discourse Vectors (CDV) is a distributed document representation for efficient answer retrieval from long documents.
Our model leverages a dual encoder architecture with hierarchical LSTM layers and multi-task training to encode the position of clinical entities and aspects alongside the document discourse.
We show that our generalized model significantly outperforms several state-of-the-art baselines for healthcare passage ranking.
arXiv Detail & Related papers (2020-02-03T15:47:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.