The Power of Combining Data and Knowledge: GPT-4o is an Effective Interpreter of Machine Learning Models in Predicting Lymph Node Metastasis of Lung Cancer
- URL: http://arxiv.org/abs/2407.17900v5
- Date: Thu, 15 Aug 2024 02:33:22 GMT
- Title: The Power of Combining Data and Knowledge: GPT-4o is an Effective Interpreter of Machine Learning Models in Predicting Lymph Node Metastasis of Lung Cancer
- Authors: Danqing Hu, Bing Liu, Xiaofeng Zhu, Nan Wu,
- Abstract summary: Lymph node metastasis (LNM) is a crucial factor in determining the initial treatment for patients with lung cancer.
Recently, large language models (LLMs) have garnered significant attention due to their remarkable text generation capabilities.
We propose a novel ensemble method that combines the medical knowledge acquired by LLMs with the latent patterns identified by machine learning models.
- Score: 18.32753287825974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lymph node metastasis (LNM) is a crucial factor in determining the initial treatment for patients with lung cancer, yet accurate preoperative diagnosis of LNM remains challenging. Recently, large language models (LLMs) have garnered significant attention due to their remarkable text generation capabilities. Leveraging the extensive medical knowledge learned from vast corpora, LLMs can estimate probabilities for clinical problems, though their performance has historically been inferior to data-driven machine learning models. In this paper, we propose a novel ensemble method that combines the medical knowledge acquired by LLMs with the latent patterns identified by machine learning models to enhance LNM prediction performance. Initially, we developed machine learning models using patient data. We then designed a prompt template to integrate the patient data with the predicted probability from the machine learning model. Subsequently, we instructed GPT-4o, the most advanced LLM developed by OpenAI, to estimate the likelihood of LNM based on patient data and then adjust the estimate using the machine learning output. Finally, we collected three outputs from the GPT-4o using the same prompt and ensembled these results as the final prediction. Using the proposed method, our models achieved an AUC value of 0.778 and an AP value of 0.426 for LNM prediction, significantly improving predictive performance compared to baseline machine learning models. The experimental results indicate that GPT-4o can effectively leverage its medical knowledge and the probabilities predicted by machine learning models to achieve more accurate LNM predictions. These findings demonstrate that LLMs can perform well in clinical risk prediction tasks, offering a new paradigm for integrating medical knowledge and patient data in clinical predictions.
Related papers
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z) - Predicting Lung Cancer Patient Prognosis with Large Language Models [20.97970447748789]
Large language models (LLMs) have gained attention for their ability to process and generate text based on extensive learned knowledge.
We evaluate the potential of GPT-4o mini and GPT-3.5 in predicting the prognosis of lung cancer patients.
arXiv Detail & Related papers (2024-08-15T06:36:27Z) - Large Language Model Distilling Medication Recommendation Model [61.89754499292561]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs)
Our research aims to transform existing medication recommendation methodologies using LLMs.
To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
arXiv Detail & Related papers (2024-02-05T08:25:22Z) - CPLLM: Clinical Prediction with Large Language Models [0.07083082555458872]
We present a method that involves fine-tuning a pre-trained Large Language Model (LLM) for clinical disease and readmission prediction.
For diagnosis prediction, we predict whether patients will be diagnosed with a target disease during their next visit or in the subsequent diagnosis, leveraging their historical diagnosis records.
Our experiments have shown that our proposed method, CPLLM, surpasses all the tested models in terms of PR-AUC and ROC-AUC metrics.
arXiv Detail & Related papers (2023-09-20T13:24:12Z) - Textual Data Augmentation for Patient Outcomes Prediction [67.72545656557858]
We propose a novel data augmentation method to generate artificial clinical notes in patients' Electronic Health Records.
We fine-tune the generative language model GPT-2 to synthesize labeled text with the original training data.
We evaluate our method on the most common patient outcome, i.e., the 30-day readmission rate.
arXiv Detail & Related papers (2022-11-13T01:07:23Z) - Multi-task fusion for improving mammography screening data
classification [3.7683182861690843]
We propose a pipeline approach, where we first train a set of individual, task-specific models.
We then investigate the fusion thereof, which is in contrast to the standard model ensembling strategy.
Our fusion approaches improve AUC scores significantly by up to 0.04 compared to standard model ensembling.
arXiv Detail & Related papers (2021-12-01T13:56:27Z) - Survival Prediction of Heart Failure Patients using Stacked Ensemble
Machine Learning Algorithm [0.0]
Heart failure is one of the major health hazard issues of our time and is a leading cause of death worldwide.
Data mining is the process of converting massive volumes of raw data created by the healthcare institutions into meaningful information.
Our study shows that only certain attributes collected from the patients are imperative to successfully predict the surviving possibility post heart failure.
arXiv Detail & Related papers (2021-08-30T16:42:27Z) - A multi-stage machine learning model on diagnosis of esophageal
manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD)
UNITE achieves up to 0.841 in F1 score for AD detection, up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines by up to $19%$ over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z) - Transfer Learning without Knowing: Reprogramming Black-box Machine
Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z) - Language Models Are An Effective Patient Representation Learning
Technique For Electronic Health Record Data [7.260199064831896]
We show that patient representation schemes inspired from techniques in natural language processing can increase the accuracy of clinical prediction models.
Such patient representation schemes enable a 3.5% mean improvement in AUROC on five prediction tasks compared to standard baselines.
arXiv Detail & Related papers (2020-01-06T22:24:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.