KDH-MLTC: Knowledge Distillation for Healthcare Multi-Label Text Classification
- URL: http://arxiv.org/abs/2505.07162v1
- Date: Mon, 12 May 2025 00:58:25 GMT
- Title: KDH-MLTC: Knowledge Distillation for Healthcare Multi-Label Text Classification
- Authors: Hajar Sakai, Sarah S. Lam
- Abstract summary: This research presents Knowledge Distillation for Healthcare Multi-Label Text Classification (KDH-MLTC). The proposed approach addresses conventional healthcare Multi-Label Text Classification challenges by integrating knowledge distillation and sequential fine-tuning. Experiments conducted on three medical literature datasets demonstrate that KDH-MLTC achieves superior performance compared to existing approaches.
- Score: 4.8342038441006805
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The increasing volume of healthcare textual data requires computationally efficient, yet highly accurate classification approaches able to handle the nuanced and complex nature of medical terminology. This research presents Knowledge Distillation for Healthcare Multi-Label Text Classification (KDH-MLTC), a framework leveraging model compression and Large Language Models (LLMs). The proposed approach addresses conventional healthcare Multi-Label Text Classification (MLTC) challenges by integrating knowledge distillation and sequential fine-tuning, subsequently optimized through Particle Swarm Optimization (PSO) for hyperparameter tuning. KDH-MLTC transfers knowledge from a more complex teacher LLM (i.e., BERT) to a lighter student LLM (i.e., DistilBERT) through sequential training adapted to MLTC that preserves the teacher's learned information while significantly reducing computational requirements. As a result, the classification is enabled to be conducted locally, making it suitable for healthcare textual data characterized by sensitivity and, therefore, ensuring HIPAA compliance. The experiments conducted on three medical literature datasets of different sizes, sampled from the Hallmark of Cancer (HoC) dataset, demonstrate that KDH-MLTC achieves superior performance compared to existing approaches, particularly for the largest dataset, reaching an F1 score of 82.70%. Additionally, statistical validation and an ablation study are carried out, proving the robustness of KDH-MLTC. Furthermore, the PSO-based hyperparameter optimization process allowed the identification of optimal configurations. The proposed approach contributes to healthcare text classification research, balancing efficiency requirements in resource-constrained healthcare settings with satisfactory accuracy demands.
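The abstract describes transferring knowledge from a teacher (BERT) to a student (DistilBERT) for multi-label classification but does not state the loss function. The following is a minimal sketch of one common formulation, assuming per-label sigmoid outputs, a temperature-softened teacher target, and a weighted sum of hard-label and soft-label binary cross-entropy. The function name `distillation_loss` and the parameters `temperature` and `alpha` are illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce(p: float, target: float, eps: float = 1e-12) -> float:
    # Binary cross-entropy for one label; target may be hard (0/1) or soft.
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Hypothetical multi-label distillation loss (assumed, not from the paper).

    alpha weights the hard-label term; (1 - alpha) weights the term that
    pulls the student's softened probabilities toward the teacher's.
    """
    if not (len(student_logits) == len(teacher_logits) == len(labels)):
        raise ValueError("all inputs must have one entry per label")
    n = len(labels)
    # Hard-label term: student predictions against ground-truth labels.
    hard = sum(bce(sigmoid(s), y)
               for s, y in zip(student_logits, labels)) / n
    # Soft-label term: student against temperature-softened teacher outputs.
    soft = sum(bce(sigmoid(s / temperature), sigmoid(t / temperature))
               for s, t in zip(student_logits, teacher_logits)) / n
    return alpha * hard + (1.0 - alpha) * soft


if __name__ == "__main__":
    teacher = [3.0, -3.0, 0.5]   # assumed teacher logits for three labels
    student = [2.0, -1.0, 0.0]   # assumed student logits
    labels = [1.0, 0.0, 1.0]     # multi-hot ground truth
    print(distillation_loss(student, teacher, labels))
```

Because each label is scored independently with a sigmoid rather than a softmax, the same loss applies regardless of how many labels are active per document, which is what makes it a natural fit for the multi-label setting.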
Related papers
- Clinical NLP with Attention-Based Deep Learning for Multi-Disease Prediction [44.0876796031468]
This paper addresses the challenges posed by the unstructured nature and high-dimensional semantic complexity of electronic health record texts. A deep learning method based on attention mechanisms is proposed to achieve unified modeling for information extraction and multi-label disease prediction.
arXiv Detail & Related papers (2025-07-02T07:45:22Z)
- Lightweight Clinical Decision Support System using QLoRA-Fine-Tuned LLMs and Retrieval-Augmented Generation [0.0]
This research paper investigates the application of Large Language Models (LLMs) in healthcare. We focus on enhancing medical decision support through Retrieval-Augmented Generation (RAG) integrated with hospital-specific data and fine-tuning using Quantized Low-Rank Adaptation (QLoRA). We touch on the ethical considerations (patient privacy, data security, and the need for rigorous clinical validation) as well as the practical challenges of integrating such systems into real-world healthcare.
arXiv Detail & Related papers (2025-05-06T10:31:54Z)
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning. Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z)
- Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study [2.0884301753594334]
This study performs a comparative analysis of various natural language models for medical text classification.
BERT outperforms Bi-LSTM models by up to 28% and the baseline BERT model by up to 16% for recall of the minority classes.
arXiv Detail & Related papers (2024-08-30T10:28:49Z)
- Communication-Efficient Hybrid Federated Learning for E-health with Horizontal and Vertical Data Partitioning [67.49221252724229]
E-health allows smart devices and medical institutions to collaboratively collect patients' data, which is trained by Artificial Intelligence (AI) technologies to help doctors make diagnoses.
Applying federated learning in e-health faces many challenges.
Medical data is both horizontally and vertically partitioned.
A naive combination of HFL and VFL has limitations including low training efficiency, unsound convergence analysis, and lack of parameter tuning strategies.
arXiv Detail & Related papers (2024-04-15T19:45:07Z)
- Developing Healthcare Language Model Embedding Spaces [0.20971479389679337]
Pre-trained Large Language Models (LLMs) often struggle on out-of-domain datasets like healthcare focused text.
Three methods are assessed: traditional masked language modeling, Deep Contrastive Learning for Unsupervised Textual Representations (DeCLUTR) and a novel pre-training objective utilizing metadata categories from the healthcare settings.
Contrastively trained models outperform other approaches on the classification tasks, delivering strong performance from limited labeled data and with fewer model parameter updates required.
arXiv Detail & Related papers (2024-03-28T19:31:32Z)
- Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
- Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
- Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes [0.0]
We propose an end-to-end machine learning solution based on large language models (LLMs).
The entities used in the KG construction process are diseases, factors, treatments, as well as manifestations that coexist with the patient while experiencing the disease.
The application of the proposed methodology is demonstrated on age-related macular degeneration.
arXiv Detail & Related papers (2023-01-29T15:52:33Z)
- Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation [48.821062916381685]
Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing.
In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL.
The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset and two real-world medical image segmentation datasets.
arXiv Detail & Related papers (2022-03-12T04:11:42Z)
- BoostMIS: Boosting Medical Image Semi-supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation [39.9910035951912]
We propose a novel semi-supervised learning (SSL) framework named BoostMIS.
It combines adaptive pseudo labeling and informative active annotation to unleash the potential of medical image SSL models.
arXiv Detail & Related papers (2022-03-04T19:19:41Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.