Narrative Feature or Structured Feature? A Study of Large Language Models to Identify Cancer Patients at Risk of Heart Failure
- URL: http://arxiv.org/abs/2403.11425v3
- Date: Sat, 02 Nov 2024 21:12:03 GMT
- Title: Narrative Feature or Structured Feature? A Study of Large Language Models to Identify Cancer Patients at Risk of Heart Failure
- Authors: Ziyi Chen, Mengyuan Zhang, Mustafa Mohammed Ahmed, Yi Guo, Thomas J. George, Jiang Bian, Yonghui Wu
- Abstract summary: This study examined machine learning models to identify cancer patients at risk of heart failure.
We identified a cancer cohort of 12,806 patients from the University of Florida Health, diagnosed with lung, breast, and colorectal cancers.
The proposed narrative features remarkably increased feature density and improved performance.
- Score: 21.660602700862714
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cancer treatments are known to introduce cardiotoxicity, negatively impacting outcomes and survivorship. Identifying cancer patients at risk of heart failure (HF) is critical to improving cancer treatment outcomes and safety. This study examined machine learning (ML) models to identify cancer patients at risk of HF using electronic health records (EHRs), including traditional ML, Time-Aware long short-term memory (T-LSTM), and large language models (LLMs) using novel narrative features derived from the structured medical codes. We identified a cancer cohort of 12,806 patients from the University of Florida Health, diagnosed with lung, breast, and colorectal cancers, among which 1,602 individuals developed HF after cancer. The LLM, GatorTron-3.9B, achieved the best F1 scores, outperforming the traditional support vector machines by 39%, the T-LSTM deep learning model by 7%, and a widely used transformer model, BERT, by 5.6%. The analysis shows that the proposed narrative features remarkably increased feature density and improved performance.
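The abstract mentions narrative features derived from structured medical codes but does not say how they are built. The sketch below is a minimal illustration of the general idea only, not the authors' method: it assumes each record is a (visit date, code, description) triple and renders the records as chronological sentences that a language model could read. The `CodedEvent` class, the `to_narrative` function, the sentence template, and the toy record are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CodedEvent:
    """One structured EHR record (hypothetical schema, not from the paper)."""
    visit_date: date
    code: str
    description: str


def to_narrative(events: list[CodedEvent]) -> str:
    """Render coded events as chronological sentences, one per visit date."""
    by_date: dict[date, list[str]] = {}
    # Sorting is stable, so events on the same date keep their input order.
    for event in sorted(events, key=lambda e: e.visit_date):
        by_date.setdefault(event.visit_date, []).append(event.description)
    sentences = [
        f"On {day.isoformat()}, the record lists: {'; '.join(descriptions)}."
        for day, descriptions in by_date.items()
    ]
    return " ".join(sentences)


# Toy, made-up record for illustration (not data from the study).
events = [
    CodedEvent(date(2020, 3, 1), "I10", "essential (primary) hypertension"),
    CodedEvent(date(2020, 1, 15), "I25.10", "atherosclerotic heart disease"),
    CodedEvent(date(2020, 3, 1), "E11.9", "type 2 diabetes mellitus without complications"),
]
narrative = to_narrative(events)
print(narrative)
```

Grouping by visit date keeps the temporal order of the record in the text itself, which is one plausible way a sparse code vector could become a denser textual input.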
Related papers
- Advanced Lung Nodule Segmentation and Classification for Early Detection of Lung Cancer using SAM and Transfer Learning [0.0]
This study introduces an innovative approach to lung nodule segmentation by utilizing the Segment Anything Model (SAM) combined with transfer learning techniques.
The proposed method leverages Bounding Box prompts and a vision transformer model to enhance segmentation performance, achieving high accuracy, Dice Similarity Coefficient (DSC) and Intersection over Union (IoU) metrics.
The findings demonstrate the proposed model's effectiveness in precisely segmenting lung nodules from CT scans, underscoring its potential to advance early detection and improve patient care outcomes in lung cancer diagnosis.
arXiv Detail & Related papers (2024-12-31T18:21:57Z) - CancerLLM: A Large Language Model in Cancer Domain [17.696798724373934]
CancerLLM is a model with 7 billion parameters and a Mistral-style architecture, pre-trained on 2,676,642 clinical notes and 515,524 pathology reports covering 17 cancer types.
Our evaluation demonstrated that CancerLLM achieves state-of-the-art results compared to other existing LLMs, with an average F1 score improvement of 7.61 %.
arXiv Detail & Related papers (2024-06-15T01:02:48Z) - A Large Language Model Pipeline for Breast Cancer Oncology [0.0]
State-of-the-art OpenAI models were fine-tuned on a clinical dataset and clinical guidelines text corpus for two important cancer treatment factors.
A high accuracy (0.85+) was achieved in the classification of adjuvant radiation therapy and chemotherapy for breast cancer patients.
arXiv Detail & Related papers (2024-06-10T16:44:48Z) - Boosting Medical Image-based Cancer Detection via Text-guided Supervision from Reports [68.39938936308023]
We propose a novel text-guided learning method to achieve highly accurate cancer detection results.
Our approach can leverage clinical knowledge by large-scale pre-trained VLM to enhance generalization ability.
arXiv Detail & Related papers (2024-05-23T07:03:38Z) - Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging [71.91773485443125]
Grading plays a vital role in breast cancer treatment planning.
The current tumor grading method involves extracting tissue from patients, leading to stress, discomfort, and high medical costs.
This paper examines using optimized CDI$^s$ to improve breast cancer grade prediction.
arXiv Detail & Related papers (2024-05-13T15:48:26Z) - EXACT-Net: EHR-guided lung tumor auto-segmentation for non-small cell lung cancer radiotherapy [7.531407604292937]
Over 60% of non-small cell lung cancer (NSCLC) patients require radiation therapy.
Our approach resulted in a 250% boost in successful nodule detection using the data from ten NSCLC patients treated in our institution.
arXiv Detail & Related papers (2024-02-21T19:49:12Z) - Pulmonologists-Level lung cancer detection based on standard blood test results and smoking status using an explainable machine learning approach [2.545682175108217]
Lung cancer (LC) remains the primary cause of cancer-related mortality, largely due to late-stage diagnoses.
In recent years, machine learning has demonstrated considerable potential in healthcare by facilitating the detection of various diseases.
We developed an ML model based on dynamic ensemble selection (DES) for LC detection.
arXiv Detail & Related papers (2024-02-14T22:00:57Z) - Cancer-Net BCa-S: Breast Cancer Grade Prediction using Volumetric Deep Radiomic Features from Synthetic Correlated Diffusion Imaging [82.74877848011798]
The prevalence of breast cancer continues to grow, affecting about 300,000 females in the United States in 2023.
The gold-standard Scarff-Bloom-Richardson (SBR) grade has been shown to consistently indicate a patient's response to chemotherapy.
In this paper, we study the efficacy of deep learning for breast cancer grading based on synthetic correlated diffusion imaging (CDI$^s$).
arXiv Detail & Related papers (2023-04-12T15:08:34Z) - CancerUniT: Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans [45.83431075462771]
Human readers or radiologists routinely perform full-body multi-organ multi-disease detection and diagnosis in clinical practice.
Most medical AI systems are built to focus on single organs with a narrow list of a few diseases.
CancerUniT is a query-based Mask Transformer model with the output of multi-tumor prediction.
arXiv Detail & Related papers (2023-01-28T20:09:34Z) - Improving Precancerous Case Characterization via Transformer-based Ensemble Learning [31.891340667123124]
The application of natural language processing to cancer pathology reports has been focused on detecting cancer cases.
Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention.
Our results demonstrated the potential of using NLP to leverage real-world health record data to facilitate the development of diagnostic tests for early cancer prevention.
arXiv Detail & Related papers (2022-12-10T00:06:28Z) - Enhancing Clinical Support for Breast Cancer with Deep Learning Models using Synthetic Correlated Diffusion Imaging [66.63200823918429]
We investigate enhancing clinical support for breast cancer with deep learning models.
We leverage a volumetric convolutional neural network to learn deep radiomic features from a pre-treatment cohort.
We find that the proposed approach can achieve better performance for both grade and post-treatment response prediction.
arXiv Detail & Related papers (2022-11-10T03:02:12Z) - Machine Learning-based Lung and Colon Cancer Detection using Deep Feature Extraction and Ensemble Learning [0.9786690381850355]
We introduce a hybrid ensemble feature extraction model to efficiently identify lung and colon cancer.
It integrates deep feature extraction and ensemble learning with high-performance filtering for cancer image datasets.
Our model can detect lung, colon, and (lung and colon) cancer with accuracy rates of 99.05%, 100%, and 99.30%, respectively.
arXiv Detail & Related papers (2022-06-02T15:14:41Z) - CAE-Transformer: Transformer-based Model to Predict Invasiveness of Lung Adenocarcinoma Subsolid Nodules from Non-thin Section 3D CT Scans [36.093580055848186]
Lung Adenocarcinoma (LAUC) has recently been the most prevalent histologic type of lung cancer.
Timely and accurate knowledge of the invasiveness of lung nodules leads to a proper treatment plan and reduces the risk of unnecessary or late surgeries.
The primary imaging modality to assess and predict the invasiveness of LAUCs is the chest CT.
In this paper, a predictive transformer-based framework, referred to as the "CAE-Transformer", is developed to classify LAUCs.
arXiv Detail & Related papers (2021-10-17T04:37:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.