ALPHA: AnomaLous Physiological Health Assessment Using Large Language
Models
- URL: http://arxiv.org/abs/2311.12524v1
- Date: Tue, 21 Nov 2023 11:09:57 GMT
- Title: ALPHA: AnomaLous Physiological Health Assessment Using Large Language
Models
- Authors: Jiankai Tang, Kegang Wang, Hongming Hu, Xiyuxing Zhang, Peiyu Wang,
Xin Liu, Yuntao Wang
- Abstract summary: Large Language Models (LLMs) exhibit exceptional performance in determining medical indicators.
Our specially adapted GPT models demonstrated remarkable proficiency, achieving less than 1 bpm error in cycle count.
This study highlights LLMs' dual role as health data analysis tools and pivotal elements in advanced AI health assistants.
- Score: 4.247764575421617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study concentrates on evaluating the efficacy of Large Language Models
(LLMs) in healthcare, with a specific focus on their application in personal
anomalous health monitoring. Our research primarily investigates the
capabilities of LLMs in interpreting and analyzing physiological data obtained
from FDA-approved devices. We conducted an extensive analysis using anomalous
physiological data gathered in a simulated low-air-pressure plateau
environment. This allowed us to assess the precision and reliability of LLMs in
understanding and evaluating users' health status with notable specificity. Our
findings reveal that LLMs exhibit exceptional performance in determining
medical indicators, including a Mean Absolute Error (MAE) of less than 1 beat
per minute for heart rate and less than 1% for oxygen saturation (SpO2).
Furthermore, the Mean Absolute Percentage Error (MAPE) for these evaluations
remained below 1%, with the overall accuracy of health assessments surpassing
85%. In image analysis tasks, such as interpreting photoplethysmography (PPG)
data, our specially adapted GPT models demonstrated remarkable proficiency,
achieving less than 1 bpm error in cycle count and 7.28 MAE for heart rate
estimation. This study highlights LLMs' dual role as health data analysis tools
and pivotal elements in advanced AI health assistants, offering personalized
health insights and recommendations within the future health assistant
framework.
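The headline metrics above, Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE), are standard error measures for point estimates. As an illustrative sketch only (the heart-rate readings below are hypothetical examples, not data from the paper), they can be computed as:

```python
# Illustrative computation of the error metrics reported above (MAE, MAPE).
# The readings are hypothetical, chosen only to demonstrate the formulas.

def mean_absolute_error(true_vals, pred_vals):
    """MAE: average absolute difference between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(true_vals, pred_vals)) / len(true_vals)

def mean_absolute_percentage_error(true_vals, pred_vals):
    """MAPE: average absolute error expressed as a percentage of the reference value."""
    return 100 * sum(abs(t - p) / t for t, p in zip(true_vals, pred_vals)) / len(true_vals)

# Hypothetical heart-rate readings (bpm): device reference vs. model estimate.
hr_true = [72, 75, 80, 78]
hr_pred = [71, 75, 81, 78]

mae = mean_absolute_error(hr_true, hr_pred)               # 0.5 bpm
mape = mean_absolute_percentage_error(hr_true, hr_pred)   # ~0.66 %
```

An MAE below 1 bpm and a MAPE below 1%, as reported in the abstract, mean the model's heart-rate estimates deviate from the FDA-approved device readings by less than one beat per minute on average.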
Related papers
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z)
- SemioLLM: Assessing Large Language Models for Semiological Analysis in Epilepsy Research [45.2233252981348]
Large Language Models have shown promising results in their ability to encode general medical knowledge.
We test the ability of state-of-the-art LLMs to leverage their internal knowledge and reasoning for epilepsy diagnosis.
arXiv Detail & Related papers (2024-07-03T11:02:12Z)
- Large Language Models for Cuffless Blood Pressure Measurement From Wearable Biosignals [14.216163316714285]
There is a notable gap in the utilization of large language models (LLMs) for analyzing wearable biosignals to achieve cuffless blood pressure (BP) measurement.
This paper presents the first work to explore the capacity of LLMs to perform cuffless BP estimation based on wearable biosignals.
To evaluate the proposed approach, we conducted assessments of ten advanced LLMs using a comprehensive public dataset of wearable biosignals from 1,272 participants.
arXiv Detail & Related papers (2024-06-26T04:54:45Z)
- Evaluating the Fairness of the MIMIC-IV Dataset and a Baseline Algorithm: Application to the ICU Length of Stay Prediction [65.268245109828]
This paper uses the MIMIC-IV dataset to examine the fairness and bias in an XGBoost binary classification model predicting the ICU length of stay.
The research reveals class imbalances in the dataset across demographic attributes and employs data preprocessing and feature extraction.
The paper concludes with recommendations for fairness-aware machine learning techniques for mitigating biases and the need for collaborative efforts among healthcare professionals and data scientists.
arXiv Detail & Related papers (2023-12-31T16:01:48Z)
- Mixed-Integer Projections for Automated Data Correction of EMRs Improve Predictions of Sepsis among Hospitalized Patients [7.639610349097473]
We introduce an innovative projections-based method that seamlessly integrates clinical expertise as domain constraints.
We measure the distance of corrected data from the constraints defining a healthy range of patient data, resulting in a unique predictive metric we term as "trust-scores"
We show an AUROC of 0.865 and a precision of 0.922, that surpasses conventional ML models without such projections.
arXiv Detail & Related papers (2023-08-21T15:14:49Z)
- Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
- Evaluating the Performance of Large Language Models on GAOKAO Benchmark [53.663757126289795]
This paper introduces GAOKAO-Bench, an intuitive benchmark that employs questions from the Chinese GAOKAO examination as test samples.
With human evaluation, we obtain the converted total score of LLMs, including GPT-4, ChatGPT and ERNIE-Bot.
We also use LLMs to grade the subjective questions, and find that model scores achieve a moderate level of consistency with human scores.
arXiv Detail & Related papers (2023-05-21T14:39:28Z)
- FineEHR: Refine Clinical Note Representations to Improve Mortality Prediction [3.9026461169566673]
Large-scale electronic health records provide machine learning models with an abundance of clinical text and vital sign data.
Despite the emergence of advanced Natural Language Processing (NLP) algorithms for clinical note analysis, the complex textual structure and noise present in raw clinical data have posed significant challenges.
We propose FINEEHR, a system that utilizes two representation learning techniques, namely metric learning and fine-tuning, to refine clinical note embeddings.
arXiv Detail & Related papers (2023-04-24T02:42:52Z)
- Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
- Auditing Algorithmic Fairness in Machine Learning for Health with Severity-Based LOGAN [70.76142503046782]
We propose supplementing machine learning-based (ML) healthcare tools with SLOGAN, an automatic tool for capturing local biases in a clinical prediction task.
SLOGAN adapts an existing tool, LOcal Group biAs detectioN (LOGAN), by contextualizing group bias detection in patient illness severity and past medical history.
On average, SLOGAN identifies larger fairness disparities in over 75% of patient groups than LOGAN while maintaining clustering quality.
arXiv Detail & Related papers (2022-11-16T08:04:12Z)
- Machine Learning and Glioblastoma: Treatment Response Monitoring Biomarkers in 2021 [0.3266995794795542]
The aim of this systematic review was to assess recently published studies on the diagnostic test accuracy of glioblastoma treatment response monitoring biomarkers in adults.
There is likely good diagnostic performance of machine learning models that use MRI features to distinguish between progression and mimics.
The diagnostic performance of ML using implicit features did not appear to be superior to ML using explicit features.
arXiv Detail & Related papers (2021-04-15T10:49:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.