PHEONA: An Evaluation Framework for Large Language Model-based Approaches to Computational Phenotyping
- URL: http://arxiv.org/abs/2503.19265v2
- Date: Mon, 07 Apr 2025 17:43:00 GMT
- Title: PHEONA: An Evaluation Framework for Large Language Model-based Approaches to Computational Phenotyping
- Authors: Sarah Pungitore, Shashank Yadav, Vignesh Subbian
- Abstract summary: Computational phenotyping is essential for biomedical research but often requires significant time and resources. We developed an evaluation framework, Evaluation of PHEnotyping for Observational Health Data, that outlines context-specific considerations. From the sample concepts tested, we achieved high classification accuracy, suggesting the potential for LLM-based methods to improve computational phenotyping processes.
- Score: 1.1363669527515645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational phenotyping is essential for biomedical research but often requires significant time and resources, especially since traditional methods typically involve extensive manual data review. While machine learning and natural language processing advancements have helped, further improvements are needed. Few studies have explored using Large Language Models (LLMs) for these tasks despite known advantages of LLMs for text-based tasks. To facilitate further research in this area, we developed an evaluation framework, Evaluation of PHEnotyping for Observational Health Data (PHEONA), that outlines context-specific considerations. We applied and demonstrated PHEONA on concept classification, a specific task within a broader phenotyping process for Acute Respiratory Failure (ARF) respiratory support therapies. From the sample concepts tested, we achieved high classification accuracy, suggesting the potential for LLM-based methods to improve computational phenotyping processes.
Related papers
- Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique [66.94905631175209]
We propose a novel inference-time scaling approach: stepwise natural language self-critique (PANEL).
It employs self-generated natural language critiques as feedback to guide the step-level search process.
This approach bypasses the need for task-specific verifiers and the associated training overhead.
arXiv Detail & Related papers (2025-03-21T17:59:55Z) - Diagnostic Reasoning in Natural Language: Computational Model and Application [68.47402386668846]
We investigate diagnostic abductive reasoning (DAR) in the context of language-grounded tasks (NL-DAR).
We propose a novel modeling framework for NL-DAR based on Pearl's structural causal models.
We use the resulting dataset to investigate the human decision-making process in NL-DAR.
arXiv Detail & Related papers (2024-09-09T06:55:37Z) - A Large Language Model Outperforms Other Computational Approaches to the High-Throughput Phenotyping of Physician Notes [0.0]
This study compares three computational approaches to high-throughput phenotyping: a Large Language Model (LLM) incorporating generative AI, a Natural Language Processing (NLP) approach utilizing deep learning for span categorization, and a hybrid approach combining word vectors with machine learning.
The approach that implemented GPT-4 (a Large Language Model) demonstrated superior performance.
arXiv Detail & Related papers (2024-06-20T22:05:34Z) - BEACON: Benchmark for Comprehensive RNA Tasks and Language Models [60.02663015002029]
We introduce the first comprehensive RNA benchmark BEACON (BEnchmArk for COmprehensive RNA Task and Language Models). First, BEACON comprises 13 distinct tasks derived from extensive previous work covering structural analysis, functional studies, and engineering applications. Second, we examine a range of models, including traditional approaches like CNNs, as well as advanced RNA foundation models based on language models, offering valuable insights into the task-specific performances of these models. Third, we investigate the vital RNA language model components
arXiv Detail & Related papers (2024-06-14T19:39:19Z) - Clinical information extraction for Low-resource languages with Few-shot learning using Pre-trained language models and Prompting [12.166472806042592]
Automatic extraction of medical information from clinical documents poses several challenges.
Recent advances in domain-adaptation and prompting methods showed promising results with minimal training data.
We demonstrate that a lightweight, domain-adapted pretrained model, prompted with just 20 shots, outperforms a traditional classification model by 30.5% accuracy.
arXiv Detail & Related papers (2024-03-20T08:01:33Z) - An evaluation of GPT models for phenotype concept recognition [0.4715973318447338]
We examine the performance of the latest Generative Pre-trained Transformer (GPT) models for clinical phenotyping and phenotype annotation.
Our results show that, with an appropriate setup, these models can achieve state-of-the-art performance.
arXiv Detail & Related papers (2023-09-29T12:06:55Z) - Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z) - Time Associated Meta Learning for Clinical Prediction [78.99422473394029]
We propose a novel time associated meta learning (TAML) method to make effective predictions at multiple future time points.
To address the sparsity problem after task splitting, TAML employs a temporal information sharing strategy to augment the number of positive samples.
We demonstrate the effectiveness of TAML on multiple clinical datasets, where it consistently outperforms a range of strong baselines.
arXiv Detail & Related papers (2023-03-05T03:54:54Z) - Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes [0.0]
We propose an end-to-end machine learning solution based on large language models (LLMs).
The entities used in the KG construction process are diseases, factors, treatments, as well as manifestations that coexist with the patient while experiencing the disease.
The application of the proposed methodology is demonstrated on age-related macular degeneration.
arXiv Detail & Related papers (2023-01-29T15:52:33Z) - Patient Cohort Retrieval using Transformer Language Models [7.784753717089568]
We propose a framework for retrieving patient cohorts using neural language models without the need for explicit feature engineering and domain expertise.
We find that a majority of our models outperform the BM25 baseline method on various evaluation metrics.
arXiv Detail & Related papers (2020-09-10T19:40:41Z) - Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.