Seeing The Whole Patient: Using Multi-Label Medical Text Classification
Techniques to Enhance Predictions of Medical Codes
- URL: http://arxiv.org/abs/2004.00430v1
- Date: Sun, 29 Mar 2020 02:19:30 GMT
- Title: Seeing The Whole Patient: Using Multi-Label Medical Text Classification
Techniques to Enhance Predictions of Medical Codes
- Authors: Vithya Yogarajan, Jacob Montiel, Tony Smith, Bernhard Pfahringer
- Abstract summary: We present results of multi-label medical text classification problems with 18, 50 and 155 labels.
For imbalanced data we show that labels that occur infrequently benefit the most from additional features incorporated in embeddings.
High dimensional embeddings from this research are made available for public use.
- Score: 2.158285012874102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning-based multi-label medical text classification can be used
to enhance understanding of the human body and to support patient care. We
present a broad study of clinical natural language processing techniques for
maximising the value of text-derived features when predicting medical codes for
patients with multi-morbidity. We present results for multi-label medical text
classification problems with 18, 50 and 155 labels. We compare several
variations of embeddings, text tagging, and pre-processing. For imbalanced
data, we show that labels that occur infrequently benefit the most from
additional features incorporated in the embeddings. We also show that
high-dimensional embeddings pre-trained on health-related data yield a
significant improvement in the multi-label setting, similarly to the way they
improve performance for binary classification. The high-dimensional embeddings
from this research are made available for public use.
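The abstract's core setup, predicting many medical codes per document, is commonly implemented as binary relevance: one classifier per label over a shared text representation. A minimal sketch follows; the tiny corpus, the label codes, and the TF-IDF features (standing in for the paper's high-dimensional pre-trained embeddings) are all invented for illustration:

```python
# Minimal binary-relevance sketch of multi-label medical text classification.
# TF-IDF stands in for the pre-trained health-domain embeddings described in
# the paper; documents and label codes are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

docs = [
    "patient with hypertension and type 2 diabetes",
    "chronic kidney disease stage 3, hypertension",
    "type 2 diabetes with neuropathy",
    "hypertension, follow-up visit",
]
labels = [
    {"HTN", "DM2"},
    {"CKD", "HTN"},
    {"DM2"},
    {"HTN"},
]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)       # one binary column per medical code

vec = TfidfVectorizer()
X = vec.fit_transform(docs)         # document feature vectors

# One logistic regression per label; infrequent labels (here CKD) are the
# ones the paper finds benefit most from richer input features.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, Y)

pred = clf.predict(vec.transform(["diabetes and hypertension noted"]))
print(mlb.inverse_transform(pred))
```

The same structure scales to the 18-, 50-, and 155-label settings in the paper by widening `Y`; only the feature representation `X` changes between the embedding variants being compared.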
Related papers
- DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction [27.778160315671776]
Predicting high-dimensional or extreme multi-label outputs, such as in medical coding, requires both accuracy and interpretability.
We propose a mechanistic interpretability module that disentangles uninterpretable dense embeddings into a sparse embedding space.
We show that our sparse embeddings are more human-understandable than their dense counterparts by at least 50 percent.
arXiv Detail & Related papers (2024-09-16T17:45:40Z)
- MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning [48.97640824497327]
We propose a novel framework leveraging domain-specific medical knowledge as guiding signals to integrate language information into the visual domain through image-text contrastive learning.
Our model includes global contrastive learning with our designed divergence encoder, local token-knowledge-patch alignment contrastive learning, and knowledge-guided category-level contrastive learning with expert knowledge.
Notably, MLIP surpasses state-of-the-art methods even with limited annotated data, highlighting the potential of multimodal pre-training in advancing medical representation learning.
arXiv Detail & Related papers (2024-02-03T05:48:50Z)
- Making the Most Out of the Limited Context Length: Predictive Power Varies with Clinical Note Type and Note Section [70.37720062263176]
We propose a framework to analyze the sections with high predictive power.
Using MIMIC-III, we show that: 1) the distribution of predictive power differs between nursing notes and discharge notes, and 2) combining different types of notes can improve performance when the context length is large.
arXiv Detail & Related papers (2023-07-13T20:04:05Z)
- Effective Medical Code Prediction via Label Internal Alignment [2.538209532048867]
We propose a multi-view attention-based neural network to predict medical codes from clinical texts.
Our method is verified to be effective on an open-source dataset.
arXiv Detail & Related papers (2023-05-09T04:14:20Z)
- Towards more patient friendly clinical notes through language models and ontologies [57.51898902864543]
We present a novel approach to automated medical text simplification based on word simplification and language modelling.
We use a new dataset of pairs of publicly available medical sentences and versions of them simplified by clinicians.
Our method, based on a language model trained on medical forum data, generates simpler sentences while preserving both grammar and the original meaning.
arXiv Detail & Related papers (2021-12-23T16:11:19Z)
- Improving Predictions of Tail-end Labels using Concatenated BioMed-Transformers for Long Medical Documents [3.0625089376654664]
This research aims to improve F1 scores for infrequent labels across multi-label problems, especially long-tail labels.
New state-of-the-art (SOTA) results are obtained using TransformerXL for predicting medical codes.
arXiv Detail & Related papers (2021-12-03T05:06:43Z)
- Word-level Text Highlighting of Medical Texts for Telehealth Services [0.0]
This paper aims to show how different text highlighting techniques can capture relevant medical context.
Three different word-level text highlighting methodologies are implemented and evaluated.
The results of our experiments show that the neural network approach is successful in highlighting medically-relevant terms.
arXiv Detail & Related papers (2021-05-21T15:13:54Z)
- Does the Magic of BERT Apply to Medical Code Assignment? A Quantitative Study [2.871614744079523]
It is not clear whether pretrained models are useful for medical code prediction without further architecture engineering.
We propose a hierarchical fine-tuning architecture to capture interactions between distant words and adopt label-wise attention to exploit label information.
Contrary to current trends, we demonstrate that a carefully trained classical CNN outperforms attention-based models on a MIMIC-III subset with frequent codes.
arXiv Detail & Related papers (2021-03-11T07:23:45Z)
- A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto standard used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z)
- Interaction Matching for Long-Tail Multi-Label Classification [57.262792333593644]
We present an elegant and effective approach for addressing limitations in existing multi-label classification models.
By performing soft n-gram interaction matching, we match labels with natural language descriptions.
arXiv Detail & Related papers (2020-05-18T15:27:55Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits unlabeled data by encouraging prediction consistency for a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
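Several of the related papers above (e.g. the label internal alignment work and the BERT quantitative study) rely on label-wise attention, where each medical code learns its own query over the encoder's token representations. A minimal numpy sketch; all sizes and values are invented for illustration:

```python
# Label-wise attention sketch: each label attends over token representations
# and pools its own summary vector. Shapes and random values are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
T, d, L = 6, 8, 3              # tokens, hidden size, number of labels

H = rng.normal(size=(T, d))    # token representations from some encoder
U = rng.normal(size=(L, d))    # one learned query vector per label

scores = U @ H.T                                            # (L, T) label-token affinities
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # softmax over tokens
label_repr = weights @ H                                    # (L, d): one pooled vector per label

print(label_repr.shape)
```

Each row of `label_repr` would then feed a per-label binary classifier, which is what lets infrequent codes focus on the few tokens that are evidence for them.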
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.