Large Language Models for Granularized Barrett's Esophagus Diagnosis
Classification
- URL: http://arxiv.org/abs/2308.08660v1
- Date: Wed, 16 Aug 2023 20:17:46 GMT
- Title: Large Language Models for Granularized Barrett's Esophagus Diagnosis
Classification
- Authors: Jenna Kefeli, Ali Soroush, Courtney J. Diamond, Haley M. Zylberberg,
Benjamin May, Julian A. Abrams, Chunhua Weng, Nicholas Tatonetti
- Abstract summary: Laborious manual chart review is required to extract key diagnostic phenotypes from pathology reports.
We developed a generalizable transformer-based method to automate data extraction.
Binary dysplasia extraction achieves 0.964 F1-score, while the multi-class model achieves 0.911 F1-score.
- Score: 4.144525746262123
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Diagnostic codes for Barrett's esophagus (BE), a precursor to esophageal
cancer, lack granularity and precision for many research or clinical use cases.
Laborious manual chart review is required to extract key diagnostic phenotypes
from BE pathology reports. We developed a generalizable transformer-based
method to automate data extraction. Using pathology reports from Columbia
University Irving Medical Center with gastroenterologist-annotated targets, we
performed binary dysplasia classification as well as granularized multi-class
BE-related diagnosis classification. We utilized two clinically pre-trained
large language models, with best model performance comparable to a highly
tailored rule-based system developed using the same data. Binary dysplasia
extraction achieves 0.964 F1-score, while the multi-class model achieves 0.911
F1-score. Our method is generalizable and faster to implement than a
tailored rule-based approach.
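The F1-scores quoted in the abstract can be illustrated with a minimal sketch of how such a metric is typically computed for binary and multi-class labels. This is not the authors' evaluation code: the abstract does not say how the multi-class F1 was averaged, so the macro average used here is an assumption, and the label names and sample predictions are hypothetical.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, scoring each label one-vs-rest."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical annotations and predictions for five reports.
y_true = ["dysplasia", "dysplasia", "none", "none", "dysplasia"]
y_pred = ["dysplasia", "none", "none", "none", "dysplasia"]

binary_f1 = f1_per_class(y_true, y_pred)["dysplasia"]
overall_f1 = macro_f1(y_true, y_pred)
print(f"dysplasia F1 = {binary_f1:.3f}, macro F1 = {overall_f1:.3f}")
```

The same functions apply unchanged to a multi-class label set; only the set of distinct labels grows.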
Related papers
- Large Language Models-Enabled Digital Twins for Precision Medicine in Rare Gynecological Tumors [0.7550821077310732]
Rare gynecological tumors (RGTs) present major clinical challenges.
The lack of clear guidelines leads to suboptimal management and poor prognosis.
This study explores the use of large language models (LLMs) to construct digital twins for precision medicine in RGTs.
arXiv Detail & Related papers (2024-08-31T21:14:09Z)
- DDxT: Deep Generative Transformer Models for Differential Diagnosis [51.25660111437394]
We show that a generative approach trained with simpler supervised and self-supervised learning signals can achieve superior results on the current benchmark.
The proposed Transformer-based generative network, named DDxT, autoregressively produces a set of possible pathologies, i.e., DDx, and predicts the actual pathology using a neural network.
arXiv Detail & Related papers (2023-12-02T22:57:25Z) - Hierarchical Classification System for Breast Cancer Specimen Report
(HCSBC) -- an end-to-end model for characterizing severity and diagnosis [3.4454444815042735]
We develop a hierarchical hybrid transformer-based pipeline (59 labels), the Hierarchical Classification System for Breast Cancer Specimen Report (HCSBC).
We trained the model on the EUH data and evaluated our model's performance on two external datasets - MGH and Mayo Clinic.
arXiv Detail & Related papers (2023-11-02T18:37:45Z) - Graph-Ensemble Learning Model for Multi-label Skin Lesion Classification
using Dermoscopy and Clinical Images [7.159532626507458]
This study introduces a Graph Convolution Network (GCN) that incorporates the prior co-occurrence between categories, expressed as a correlation matrix, into the deep learning model for multi-label classification.
We propose a Graph-Ensemble Learning Model (GELN) that views the prediction from GCN as complementary information of the predictions from the fusion model.
arXiv Detail & Related papers (2023-07-04T13:19:57Z) - Improving Classification Model Performance on Chest X-Rays through Lung
Segmentation [63.45024974079371]
We propose a deep learning approach to enhance abnormal chest x-ray (CXR) identification performance through segmentations.
Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing lung region in CXR images and a CXR classification model with a backbone of a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z) - A multi-stage machine learning model on diagnosis of esophageal
manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z) - Detecting ulcerative colitis from colon samples using efficient feature
selection and machine learning [1.5484595752241122]
Ulcerative colitis (UC) is one of the most common forms of inflammatory bowel disease (IBD) characterized by inflammation of the mucosal layer of the colon.
We created a model to discriminate between healthy subjects and subjects with UC based on the expression values of 32 genes in colon samples.
Our model perfectly detected all active cases and had an average precision of 0.62 in the inactive cases.
arXiv Detail & Related papers (2020-08-04T14:56:45Z) - Semi-supervised Medical Image Classification with Relation-driven
Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)