Related papers: Hybrid Student-Teacher Large Language Model Refinement for Cancer Toxicity Symptom Extraction

URL: http://arxiv.org/abs/2408.04775v1
Date: Thu, 8 Aug 2024 22:18:01 GMT
Title: Hybrid Student-Teacher Large Language Model Refinement for Cancer Toxicity Symptom Extraction
Authors: Reza Khanmohammadi, Ahmed I. Ghanem, Kyle Verdecchia, Ryan Hall, Mohamed Elshaikh, Benjamin Movsas, Hassan Bagher-Ebadian, Bing Luo, Indrin J. Chetty, Tuka Alhanai, Kundan Thind, Mohammad M. Ghassemi
Abstract summary: Large Language Models (LLMs) offer significant potential for clinical symptom extraction, but their deployment in healthcare settings is constrained by privacy concerns, computational limitations, and operational costs. This study investigates the optimization of compact LLMs for cancer toxicity symptom extraction using a novel iterative refinement approach.
Score: 3.564938069395287
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) offer significant potential for clinical symptom extraction, but their deployment in healthcare settings is constrained by privacy concerns, computational limitations, and operational costs. This study investigates the optimization of compact LLMs for cancer toxicity symptom extraction using a novel iterative refinement approach. We employ a student-teacher architecture, utilizing Zephyr-7b-beta and Phi3-mini-128 as student models and GPT-4o as the teacher, to dynamically select between prompt refinement, Retrieval-Augmented Generation (RAG), and fine-tuning strategies. Our experiments on 294 clinical notes covering 12 post-radiotherapy toxicity symptoms demonstrate the effectiveness of this approach. The RAG method proved most efficient, improving average accuracy scores from 0.32 to 0.73 for Zephyr-7b-beta and from 0.40 to 0.87 for Phi3-mini-128 during refinement. In the test set, both models showed an approximate 0.20 increase in accuracy across symptoms. Notably, this improvement was achieved at a cost 45 times lower than GPT-4o for Zephyr and 79 times lower for Phi-3. These results highlight the potential of iterative refinement techniques in enhancing the capabilities of compact LLMs for clinical applications, offering a balance between performance, cost-effectiveness, and privacy preservation in healthcare settings.

Related papers

Hybrid Ensemble of Segmentation-Assisted Classification and GBDT for Skin Cancer Detection with Engineered Metadata and Synthetic Lesions from ISIC 2024 Non-Dermoscopic 3D-TBP Images [0.0]
This work presents a hybrid machine and deep learning-based approach for classifying skin lesions. It comprises 401,059 cropped lesion images extracted from 3D Total Body Photography (TBP), emulating non-dermoscopic, smartphone-like conditions. Predictions are fused with a gradient-boosted decision tree (GBDT) ensemble enriched by engineered features and patient-specific relational metrics.
arXiv Detail & Related papers (2025-06-03T22:00:03Z)
MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks [47.486705282473984]
Large language models (LLMs) achieve near-perfect scores on medical exams. These evaluations inadequately reflect the complexity and diversity of real-world clinical practice. We introduce MedHELM, an evaluation framework for assessing LLM performance for medical tasks.
arXiv Detail & Related papers (2025-05-26T22:55:49Z)
Predicting Length of Stay in Neurological ICU Patients Using Classical Machine Learning and Neural Network Models: A Benchmark Study on MIMIC-IV [49.1574468325115]
This study explores multiple ML approaches for predicting LOS in the ICU specifically for patients with neurological diseases, based on the MIMIC-IV dataset. The evaluated models include classic ML algorithms (K-Nearest Neighbors, Random Forest, XGBoost and CatBoost) and Neural Networks (LSTM, BERT and Temporal Fusion Transformer).
arXiv Detail & Related papers (2025-05-23T14:06:42Z)
A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment [46.776978552161395]
Small language models (SLMs) offer a cost-effective alternative to large language models such as GPT-4, but their limited capacity requires biomedical domain adaptation. We propose a novel framework for adapting SLMs into high-performing clinical models.
arXiv Detail & Related papers (2025-05-15T21:40:21Z)
Diagnosis of Pulmonary Hypertension by Integrating Multimodal Data with a Hybrid Graph Convolutional and Transformer Network [32.50971951245164]
This study develops and validates a deep learning-based diagnostic model for pulmonary hypertension (PH). It is designed to classify patients as non-PH, pre-capillary PH, or post-capillary PH. It has the potential to support clinical decision-making by effectively integrating multimodal data.
arXiv Detail & Related papers (2025-03-28T01:14:17Z)
ARIES: Stimulating Self-Refinement of Large Language Models by Iterative Preference Optimization [34.77238246296517]
A truly intelligent Large Language Model (LLM) should be capable of correcting errors in its responses through external interactions. We introduce a novel post-training and inference framework, called ARIES: Adaptive Refinement and Iterative Enhancement Structure. ARIES iteratively performs preference training and self-refinement-based data collection.
arXiv Detail & Related papers (2025-02-08T15:21:55Z)
Brain Tumor Classification on MRI in Light of Molecular Markers [61.77272414423481]
Co-deletion of the 1p/19q gene is associated with clinical outcomes in low-grade gliomas. This study aims to utilize a specially designed MRI-based convolutional neural network for brain cancer detection.
arXiv Detail & Related papers (2024-09-29T07:04:26Z)
Enhanced Prediction of Ventilator-Associated Pneumonia in Patients with Traumatic Brain Injury Using Advanced Machine Learning Techniques [0.0]
Ventilator-associated pneumonia (VAP) in traumatic brain injury (TBI) patients poses a significant mortality risk. Timely detection and prognostication of VAP in TBI patients are crucial to improve patient outcomes and alleviate the strain on healthcare resources. We implemented six machine learning models using the MIMIC-III database.
arXiv Detail & Related papers (2024-08-02T09:44:18Z)
Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options. The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
Leveraging Knowledge Distillation for Lightweight Skin Cancer Classification: Balancing Accuracy and Computational Efficiency [0.0]
Skin cancer is a major concern to public health, accounting for one-third of the reported cancers. We present a knowledge distillation based approach for creating a lightweight yet high-performing classifier. With its high accuracy and compact size, our model appears to be a potential choice for accurate skin cancer classification, particularly in resource-constrained settings.
arXiv Detail & Related papers (2024-06-24T18:13:09Z)
Validation of a new, minimally-invasive, software smartphone device to predict sleep apnea and its severity: transversal study [3.798946451618375]
Obstructive sleep apnea (OSA) is frequent and responsible for cardiovascular complications and excessive daytime sleepiness. Alternative methods using smartphone sensors could be useful to increase diagnosis. This article shows that manual scoring of smartphone-based signals is possible and accurate compared to PSG-based scorings.
arXiv Detail & Related papers (2024-06-20T14:36:15Z)
Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models [1.3137489010086167]
Mixtral, the student model, initially extracts symptoms, followed by GPT-4, the teacher model, which refines prompts based on Mixtral's performance. Results showed significant improvements in extracting symptoms from both single and multi-symptom notes.
arXiv Detail & Related papers (2024-02-06T15:25:09Z)
RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro [46.773794687622825]
We employ a sequential model optimization search applied to a deep learning model to quickly discover highly synergistic drug combinations active against a cancer cell line. We find that the set of combinations queried by our model is enriched for highly synergistic combinations. Remarkably, we rediscovered a synergistic drug combination that was later confirmed to be under study within clinical trials.
arXiv Detail & Related papers (2022-02-07T02:54:29Z)
A Generic Deep Learning Based Cough Analysis System from Clinically Validated Samples for Point-of-Need Covid-19 Test and Severity Levels [85.41238731489939]
We seek to evaluate the detection performance of a rapid primary screening tool of Covid-19 based on the cough sound from 8,380 clinically validated samples. Our proposed generic method is an algorithm based on Empirical Mode Decomposition (EMD) with subsequent classification based on a tensor of audio features. Two different versions of DeepCough based on the number of tensor dimensions, i.e. DeepCough2D and DeepCough3D, have been investigated.
arXiv Detail & Related papers (2021-11-10T19:39:26Z)
Comparison of Machine Learning Classifiers to Predict Patient Survival and Genetics of GBM: Towards a Standardized Model for Clinical Implementation [44.02622933605018]
Radiomic models have been shown to outperform clinical data for outcome prediction in glioblastoma (GBM). We aimed to compare nine machine learning classifiers to predict overall survival (OS), isocitrate dehydrogenase (IDH) mutation, O-6-methylguanine-DNA-methyltransferase (MGMT) promoter methylation, epidermal growth factor receptor (EGFR) VII amplification and Ki-67 expression in GBM patients. xGB obtained maximum accuracy for OS (74.5%), AB for IDH mutation (88%), MGMT methylation (71.7%), Ki-67 expression (86.6%), and EGFR amplification (81,
arXiv Detail & Related papers (2021-02-10T15:10:37Z)
Attention-Based LSTM Network for COVID-19 Clinical Trial Parsing [0.0]
We train attention-based bidirectional Long Short-Term Memory (Att-BiLSTM) models and use the optimal model to extract entities from the eligibility criteria of COVID-19 trials. We compare the performance of Att-BiLSTM with a traditional ontology-based method. Our analyses demonstrate that Att-BiLSTM is an effective approach for characterizing patient populations in COVID-19 clinical trials.
arXiv Detail & Related papers (2020-12-18T05:55:52Z)
COVID-MTL: Multitask Learning with Shift3D and Random-weighted Loss for Automated Diagnosis and Severity Assessment of COVID-19 [39.57518533765393]
There is an urgent need for automated methods to assist accurate and effective assessment of COVID-19. We present an end-to-end multitask learning framework (COVID-MTL) that is capable of automated and simultaneous detection (against both radiology and NAT) and severity assessment of COVID-19.
arXiv Detail & Related papers (2020-12-10T08:30:46Z)
CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks [51.589769497681175]
The novel coronavirus (SARS-CoV-2) has led to a pandemic. The current testing regime based on Reverse Transcription-Polymerase Chain Reaction for SARS-CoV-2 has been unable to keep up with testing demands. We propose a framework called CovidDeep that combines efficient DNNs with commercially available WMSs for pervasive testing of the virus.
arXiv Detail & Related papers (2020-07-20T21:47:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.