Narrative Feature or Structured Feature? A Study of Large Language Models to Identify Cancer Patients at Risk of Heart Failure
- URL: http://arxiv.org/abs/2403.11425v1
- Date: Mon, 18 Mar 2024 02:42:01 GMT
- Title: Narrative Feature or Structured Feature? A Study of Large Language Models to Identify Cancer Patients at Risk of Heart Failure
- Authors: Ziyi Chen, Mengyuan Zhang, Mustafa Mohammed Ahmed, Yi Guo, Thomas J. George, Jiang Bian, Yonghui Wu,
- Abstract summary: This study examined machine learning models to identify cancer patients at risk of heart failure.
We identified a cancer cohort of 12,806 patients from the University of Florida Health, diagnosed with lung, breast, and colorectal cancers.
The proposed narrative features remarkably increased feature density and improved performance.
- Score: 21.660602700862714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cancer treatments are known to introduce cardiotoxicity, negatively impacting outcomes and survivorship. Identifying cancer patients at risk of heart failure (HF) is critical to improving cancer treatment outcomes and safety. This study examined machine learning (ML) models to identify cancer patients at risk of HF using electronic health records (EHRs), including traditional ML, Time-Aware long short-term memory (T-LSTM), and large language models (LLMs) using novel narrative features derived from the structured medical codes. We identified a cancer cohort of 12,806 patients from the University of Florida Health, diagnosed with lung, breast, and colorectal cancers, among which 1,602 individuals developed HF after cancer. The LLM, GatorTron-3.9B, achieved the best F1 scores, outperforming the traditional support vector machines by 39%, the T-LSTM deep learning model by 7%, and a widely used transformer model, BERT, by 5.6%. The analysis shows that the proposed narrative features remarkably increased feature density and improved performance.
Related papers
- CancerLLM: A Large Language Model in Cancer Domain [19.384643526294127]
CancerLLM is a model with 7 billion parameters and a Mistral-style architecture, pre-trained on 2,676,642 clinical notes and 515,524 pathology reports covering 17 cancer types.
Our evaluation demonstrated that CancerLLM achieves state-of-the-art results compared to other existing LLMs, with an average F1 score improvement of 8.1%.
arXiv Detail & Related papers (2024-06-15T01:02:48Z) - A Large Language Model Pipeline for Breast Cancer Oncology [0.0]
State-of-the-art OpenAI models were fine-tuned on a clinical dataset and clinical guidelines text corpus for two important cancer treatment factors.
A high accuracy (0.85+) was achieved in the classification of adjuvant radiation therapy and chemotherapy for breast cancer patients.
arXiv Detail & Related papers (2024-06-10T16:44:48Z) - Boosting Medical Image-based Cancer Detection via Text-guided Supervision from Reports [68.39938936308023]
We propose a novel text-guided learning method to achieve highly accurate cancer detection results.
Our approach can leverage clinical knowledge by large-scale pre-trained VLM to enhance generalization ability.
arXiv Detail & Related papers (2024-05-23T07:03:38Z) - Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging [71.91773485443125]
Grading plays a vital role in breast cancer treatment planning.
The current tumor grading method involves extracting tissue from patients, leading to stress, discomfort, and high medical costs.
This paper examines using optimized CDI$s$ to improve breast cancer grade prediction.
arXiv Detail & Related papers (2024-05-13T15:48:26Z) - EXACT-Net:EHR-guided lung tumor auto-segmentation for non-small cell
lung cancer radiotherapy [8.380581495879433]
Over 60% of non-small cell lung cancer (NSCLC) patients require radiation therapy.
Our approach resulted in a 250% boost in successful nodule detection using the data from ten NSCLC patients treated in our institution.
arXiv Detail & Related papers (2024-02-21T19:49:12Z) - Pulmonologists-Level lung cancer detection based on standard blood test
results and smoking status using an explainable machine learning approach [2.545682175108217]
Lung cancer (LC) remains the primary cause of cancer-related mortality, largely due to late-stage diagnoses.
In recent years, machine learning has demonstrated considerable potential in healthcare by facilitating the detection of various diseases.
We developed an ML model based on dynamic ensemble selection (DES) for LC detection.
arXiv Detail & Related papers (2024-02-14T22:00:57Z) - CIMIL-CRC: a clinically-informed multiple instance learning framework
for patient-level colorectal cancer molecular subtypes classification from
H\&E stained images [45.32169712547367]
We introduce CIMIL-CRC', a framework that solves the MSI/MSS MIL problem by efficiently combining a pre-trained feature extraction model with principal component analysis (PCA) to aggregate information from all patches.
We assessed our CIMIL-CRC method using the average area under the curve (AUC) from a 5-fold cross-validation experimental setup for model development on the TCGA-CRC-DX cohort.
arXiv Detail & Related papers (2024-01-29T12:56:11Z) - Cancer-Net BCa-S: Breast Cancer Grade Prediction using Volumetric Deep
Radiomic Features from Synthetic Correlated Diffusion Imaging [82.74877848011798]
The prevalence of breast cancer continues to grow, affecting about 300,000 females in the United States in 2023.
The gold-standard Scarff-Bloom-Richardson (SBR) grade has been shown to consistently indicate a patient's response to chemotherapy.
In this paper, we study the efficacy of deep learning for breast cancer grading based on synthetic correlated diffusion (CDI$s$) imaging.
arXiv Detail & Related papers (2023-04-12T15:08:34Z) - CancerUniT: Towards a Single Unified Model for Effective Detection,
Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection
of CT Scans [45.83431075462771]
Human readers or radiologists routinely perform full-body multi-organ multi-disease detection and diagnosis in clinical practice.
Most medical AI systems are built to focus on single organs with a narrow list of a few diseases.
CancerUniT is a query-based Mask Transformer model with the output of multi-tumor prediction.
arXiv Detail & Related papers (2023-01-28T20:09:34Z) - Enhancing Clinical Support for Breast Cancer with Deep Learning Models
using Synthetic Correlated Diffusion Imaging [66.63200823918429]
We investigate enhancing clinical support for breast cancer with deep learning models.
We leverage a volumetric convolutional neural network to learn deep radiomic features from a pre-treatment cohort.
We find that the proposed approach can achieve better performance for both grade and post-treatment response prediction.
arXiv Detail & Related papers (2022-11-10T03:02:12Z) - Machine Learning-based Lung and Colon Cancer Detection using Deep
Feature Extraction and Ensemble Learning [0.9786690381850355]
We introduce a hybrid ensemble feature extraction model to efficiently identify lung and colon cancer.
It integrates deep feature extraction and ensemble learning with high-performance filtering for cancer image datasets.
Our model can detect lung, colon, and (lung and colon) cancer with accuracy rates of 99.05%, 100%, and 99.30%, respectively.
arXiv Detail & Related papers (2022-06-02T15:14:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.