Evaluating GPT's Capability in Identifying Stages of Cognitive Impairment from Electronic Health Data
- URL: http://arxiv.org/abs/2502.09715v1
- Date: Thu, 13 Feb 2025 19:04:47 GMT
- Title: Evaluating GPT's Capability in Identifying Stages of Cognitive Impairment from Electronic Health Data
- Authors: Yu Leng, Yingnan He, Colin Magdamo, Ana-Maria Vranceanu, Christine S. Ritchie, Shibani S. Mukerji, Lidia M. V. R. Moura, John R. Dickson, Deborah Blacker, Sudeshna Das
- Abstract summary: This study evaluates an automated approach using zero-shot GPT-4o to determine the stage of cognitive impairment in two different tasks.
First, we evaluated GPT-4o's ability to determine the global Clinical Dementia Rating (CDR) on specialist notes from 769 patients.
Second, we assessed GPT-4o's ability to differentiate between normal cognition, mild cognitive impairment (MCI), and dementia on all notes in a 3-year window from 860 Medicare patients.
- Score: 0.8777457069049611
- Abstract: Identifying cognitive impairment within electronic health records (EHRs) is crucial not only for timely diagnosis but also for facilitating research. Information about cognitive impairment often exists within unstructured clinician notes in EHRs, but manual chart reviews are both time-consuming and error-prone. To address this issue, our study evaluates an automated approach using zero-shot GPT-4o to determine the stage of cognitive impairment in two different tasks. First, we evaluated GPT-4o's ability to determine the global Clinical Dementia Rating (CDR) on specialist notes from 769 patients who visited the memory clinic at Massachusetts General Hospital (MGH); the model achieved a weighted kappa score of 0.83. Second, we assessed GPT-4o's ability to differentiate between normal cognition, mild cognitive impairment (MCI), and dementia on all notes in a 3-year window from 860 Medicare patients. GPT-4o attained a weighted kappa score of 0.91 in comparison to specialist chart reviews and 0.96 on cases that the clinical adjudicators rated with high confidence. Our findings demonstrate GPT-4o's potential as a scalable chart-review tool for creating research datasets and assisting diagnosis in clinical settings.
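The agreement metric reported throughout the abstract is the weighted kappa score. As a rough illustration of what such a score measures, here is a minimal pure-Python sketch of quadratic weighted Cohen's kappa; this is a generic implementation, not the authors' code, and it assumes labels are encoded as integer stages 0..N-1 (e.g. normal = 0, MCI = 1, dementia = 2).

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Quadratic weighted Cohen's kappa between two integer label lists."""
    n = len(rater_a)
    # Observed agreement matrix: counts of (a, b) label pairs.
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    # Marginal label histograms for each rater.
    hist_a = [sum(observed[i]) for i in range(n_classes)]
    hist_b = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    numerator = 0.0
    denominator = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Quadratic disagreement weight: 0 on the diagonal, largest for
            # labels at opposite ends of the severity scale.
            w = (i - j) ** 2 / (n_classes - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n  # chance-level count
            numerator += w * observed[i][j]
            denominator += w * expected
    return 1.0 - numerator / denominator

# Perfect agreement yields kappa = 1.0.
print(quadratic_weighted_kappa([0, 1, 2], [0, 1, 2], 3))        # 1.0
# One adjacent-stage disagreement out of four cases.
print(quadratic_weighted_kappa([0, 1, 2, 0], [0, 1, 2, 1], 3))  # 0.8
```

Because the weights grow quadratically with the distance between labels, confusing MCI with dementia is penalized far less than confusing normal cognition with dementia, which is why weighted kappa suits ordinal staging tasks like CDR.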
Related papers
- Explainable and externally validated machine learning for neuropsychiatric diagnosis via electrocardiograms [0.8108972030676012]
Electrocardiogram (ECG) analysis has emerged as a promising tool for identifying physiological changes associated with neuropsychiatric conditions.
The potential of the ECG to accurately distinguish neuropsychiatric conditions, particularly among diverse patient populations, remains underexplored.
This study utilized ECG markers and basic demographic data to predict neuropsychiatric conditions using machine learning models.
arXiv Detail & Related papers (2025-02-07T13:37:13Z) - GPT-4 on Clinic Depression Assessment: An LLM-Based Pilot Study [0.6999740786886538]
We explore the use of GPT-4 for clinical depression assessment based on transcript analysis.
We examine the model's ability to classify patient interviews into binary categories: depressed and not depressed.
Results indicate that GPT-4 exhibits considerable variability in accuracy and F1-Score across configurations.
arXiv Detail & Related papers (2024-12-31T00:32:43Z) - Advancing Mental Health Pre-Screening: A New Custom GPT for Psychological Distress Assessment [0.8287206589886881]
'Psycho Analyst' is a custom GPT model based on OpenAI's GPT-4, optimized for pre-screening mental health disorders.
The model adeptly decodes nuanced linguistic indicators of mental health disorders.
arXiv Detail & Related papers (2024-08-03T00:38:30Z) - A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis [87.25494411021066]
GPT-4V's multimodal capability for medical image analysis is evaluated.
The study finds that GPT-4V excels at understanding medical images and generates high-quality radiology reports, but that its performance on medical visual grounding needs substantial improvement.
arXiv Detail & Related papers (2023-10-31T11:39:09Z) - Is GPT-4 a reliable rater? Evaluating Consistency in GPT-4 Text Ratings [63.35165397320137]
This study investigates the consistency of feedback ratings generated by OpenAI's GPT-4.
The model rated responses to tasks within the Higher Education subject domain of macroeconomics in terms of their content and style.
arXiv Detail & Related papers (2023-08-03T12:47:17Z) - The Potential and Pitfalls of using a Large Language Model such as ChatGPT or GPT-4 as a Clinical Assistant [12.017491902296836]
ChatGPT and GPT-4 have demonstrated promising performance on several medical domain tasks.
We performed two analyses using ChatGPT and GPT-4: one to identify patients with specific medical diagnoses in a large real-world electronic health record database, and one to assess individual patients.
In the patient-assessment analysis, GPT-4 diagnosed correctly in three out of four cases.
arXiv Detail & Related papers (2023-07-16T21:19:47Z) - Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists to predict the histological score available in a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
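Balanced accuracy, one of the two metrics quoted above, is simply the mean of per-class recall, which prevents a majority class from inflating the score. A generic sketch follows; this is an illustration of the metric, not the authors' evaluation code.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Class 0 recall = 1/2, class 1 recall = 1: balanced accuracy = 0.75.
print(balanced_accuracy([0, 0, 1, 1], [0, 1, 1, 1]))  # 0.75
```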
arXiv Detail & Related papers (2023-02-16T17:06:23Z) - Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation [56.25869366777579]
In recent years, machine learning models have rapidly become better at generating clinical consultation notes.
We present an extensive human evaluation study where 5 clinicians listen to 57 mock consultations, write their own notes, post-edit a number of automatically generated notes, and extract all the errors.
We find that a simple, character-based Levenshtein distance metric performs on par with, if not better than, common model-based metrics like BERTScore.
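For reference, the character-based metric mentioned above is the classic edit distance: the minimum number of insertions, deletions, and substitutions needed to turn one string into another. A minimal dynamic-programming sketch (not the paper's implementation):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between strings a and b via row-by-row DP."""
    # prev[j] holds the distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[len(b)]

print(levenshtein("kitten", "sitting"))  # 3
```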
arXiv Detail & Related papers (2022-04-01T14:04:16Z) - Using Deep Learning to Identify Patients with Cognitive Impairment in Electronic Health Records [0.0]
Only one in four people with dementia receive a diagnosis; the condition is widely under-diagnosed by healthcare professionals.
Deep learning NLP can successfully identify dementia patients without dementia-related ICD codes or medications.
arXiv Detail & Related papers (2021-11-13T01:44:10Z) - Identification of Ischemic Heart Disease by using machine learning technique based on parameters measuring Heart Rate Variability [50.591267188664666]
In this study, 18 non-invasive features (age, gender, left ventricular ejection fraction, and 15 derived from HRV) of 243 subjects were used to train and validate a series of artificial neural networks (ANNs).
The best result was obtained using 7 input parameters and 7 hidden nodes, with accuracies of 98.9% on the training set and 82% on the validation set.
arXiv Detail & Related papers (2020-10-29T19:14:41Z) - Multilabel 12-Lead Electrocardiogram Classification Using Gradient Boosting Tree Ensemble [64.29529357862955]
We build an algorithm using gradient boosted tree ensembles fitted on morphology and signal-processing features to classify ECG diagnoses.
For each lead, we derive features from heart rate variability, PQRST template shape, and the full signal waveform.
We join the features of all 12 leads to fit an ensemble of gradient boosting decision trees to predict probabilities of ECG instances belonging to each class.
arXiv Detail & Related papers (2020-10-21T18:11:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.