High-Throughput Phenotyping of Clinical Text Using Large Language Models
- URL: http://arxiv.org/abs/2408.01214v1
- Date: Fri, 2 Aug 2024 12:00:00 GMT
- Title: High-Throughput Phenotyping of Clinical Text Using Large Language Models
- Authors: Daniel B. Hier, S. Ilyas Munzir, Anne Stahlfeld, Tayo Obafemi-Ajayi, Michael D. Carrithers
- Abstract summary: GPT-4 surpasses GPT-3.5-Turbo in identifying, categorizing, and normalizing signs.
GPT-4 achieves high performance and generalizability across several phenotyping tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High-throughput phenotyping automates the mapping of patient signs to standardized ontology concepts and is essential for precision medicine. This study evaluates the automation of phenotyping of clinical summaries from the Online Mendelian Inheritance in Man (OMIM) database using large language models. Due to their rich phenotype data, these summaries can be surrogates for physician notes. We conduct a performance comparison of GPT-4 and GPT-3.5-Turbo. Our results indicate that GPT-4 surpasses GPT-3.5-Turbo in identifying, categorizing, and normalizing signs, achieving concordance with manual annotators comparable to inter-rater agreement. Despite some limitations in sign normalization, the extensive pre-training of GPT-4 results in high performance and generalizability across several phenotyping tasks while obviating the need for manually annotated training data. Large language models are expected to be the dominant method for automating high-throughput phenotyping of clinical text.
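As a concrete illustration of the workflow the abstract describes, here is a minimal sketch of prompting GPT-4 to identify signs and normalize them to ontology concepts. It assumes the OpenAI Python SDK; the prompt, example sentence, and output format are illustrative, not the authors' actual protocol or evaluation pipeline.

```python
# Minimal sketch of LLM-based phenotyping, assuming the OpenAI Python SDK
# (pip install openai) and an OPENAI_API_KEY in the environment. The prompt,
# output format, and example text are illustrative, not the paper's protocol.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "Identify each clinical sign in the text, then normalize it to a Human "
    "Phenotype Ontology (HPO) concept. Return one 'sign -> HPO label (HPO ID)' "
    "pair per line."
)

summary = "The patient exhibited progressive gait ataxia, dysarthria, and nystagmus."

response = client.chat.completions.create(
    model="gpt-4",
    temperature=0,  # deterministic output helps when comparing against manual annotators
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": summary},
    ],
)
print(response.choices[0].message.content)
```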
Related papers
- A Large Language Model Outperforms Other Computational Approaches to the High-Throughput Phenotyping of Physician Notes [0.0]
This study compares three computational approaches to high-throughput phenotyping: a Large Language Model (LLM) incorporating generative AI, a Natural Language Processing (NLP) approach utilizing deep learning for span categorization, and a hybrid approach combining word vectors with machine learning.
The approach that implemented GPT-4 (a Large Language Model) demonstrated superior performance.
arXiv Detail & Related papers (2024-06-20T22:05:34Z)
- High Throughput Phenotyping of Physician Notes with Large Language and Hybrid NLP Models [0.0]
Deep phenotyping is the detailed description of patient signs and symptoms using concepts from an ontology.
In this study, we demonstrate that a large language model and a hybrid NLP model can perform high throughput phenotyping on physician notes with high accuracy.
arXiv Detail & Related papers (2024-03-09T14:02:59Z)
- ExtractGPT: Exploring the Potential of Large Language Models for Product Attribute Value Extraction [52.14681890859275]
E-commerce platforms require structured product data in the form of attribute-value pairs.
BERT-based extraction methods require large amounts of task-specific training data.
This paper explores using large language models (LLMs) as a more training-data efficient and robust alternative.
arXiv Detail & Related papers (2023-10-19T07:39:00Z)
- An evaluation of GPT models for phenotype concept recognition [0.4715973318447338]
We examine the performance of the latest Generative Pre-trained Transformer (GPT) models for clinical phenotyping and phenotype annotation.
Our results show that, with an appropriate setup, these models can achieve state-of-the-art performance.
arXiv Detail & Related papers (2023-09-29T12:06:55Z)
- Enhancing Phenotype Recognition in Clinical Notes Using Large Language Models: PhenoBCBERT and PhenoGPT [11.20254354103518]
We developed two types of models: PhenoBCBERT, a BERT-based model, and PhenoGPT, a GPT-based model.
We found that our methods can extract more phenotype concepts, including novel ones not characterized by HPO.
arXiv Detail & Related papers (2023-08-11T03:40:22Z)
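The PhenoBCBERT/PhenoGPT entry above frames phenotype extraction as span recognition in clinical text. A minimal sketch of the BERT-style variant as token classification follows; the model path is a hypothetical placeholder for a phenotype-NER checkpoint, and the output fields are those of the standard transformers pipeline, not the authors' released models.

```python
# Sketch of BERT-style phenotype extraction as token classification with the
# Hugging Face transformers pipeline. "path/to/phenotype-ner-model" is a
# hypothetical placeholder and must be replaced with a checkpoint actually
# fine-tuned on phenotype spans; PhenoBCBERT is not assumed to be available.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="path/to/phenotype-ner-model",
    aggregation_strategy="simple",  # merge word pieces into whole-entity spans
)

note = "Exam notable for macrocephaly, hypotonia, and delayed speech."
for span in ner(note):
    # Each span carries the predicted label, surface text, and confidence score.
    print(span["entity_group"], span["word"], round(span["score"], 3))
```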
- Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task [49.50140712943701]
We evaluate the performance of ChatGPT/GPT-4 on a radiology NLI task and compare it to other models fine-tuned specifically on task-related data samples.
We also conduct a comprehensive investigation on ChatGPT/GPT-4's reasoning ability by introducing varying levels of inference difficulty.
arXiv Detail & Related papers (2023-04-18T17:21:48Z)
- GPT-4 Technical Report [116.90398195245983]
GPT-4 is a large-scale, multimodal model which can accept image and text inputs and produce text outputs.
It exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers.
arXiv Detail & Related papers (2023-03-15T17:15:04Z)
- Textual Data Augmentation for Patient Outcomes Prediction [67.72545656557858]
We propose a novel data augmentation method to generate artificial clinical notes in patients' Electronic Health Records.
We fine-tune the generative language model GPT-2 to synthesize labeled text with the original training data.
We evaluate our method on the most common patient outcome, i.e., the 30-day readmission rate.
arXiv Detail & Related papers (2022-11-13T01:07:23Z)
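The data-augmentation entry above fine-tunes GPT-2 to synthesize labeled clinical notes. Below is a minimal sketch of that idea using Hugging Face transformers and datasets: notes are prefixed with their outcome label as plain text, the model is fine-tuned on them, and new notes are generated conditioned on a label prefix. The toy notes, prefixes, and hyperparameters are illustrative, not the paper's configuration.

```python
# Minimal sketch of label-conditioned GPT-2 fine-tuning for synthetic clinical notes.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Each note is prefixed with its outcome label as plain text, so the fine-tuned
# model can later generate notes conditioned on a chosen label.
notes = [
    "[READMITTED] Discharged after CHF exacerbation; poor medication adherence noted.",
    "[NOT READMITTED] Uncomplicated laparoscopic appendectomy with routine recovery.",
]

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

ds = Dataset.from_dict({"text": notes}).map(
    lambda batch: tok(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-synthetic-notes",
                           num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()

# Generate a synthetic note conditioned on the positive (readmitted) label.
prompt = tok("[READMITTED]", return_tensors="pt").to(model.device)
sample = model.generate(**prompt, max_new_tokens=80, do_sample=True,
                        pad_token_id=tok.eos_token_id)
print(tok.decode(sample[0], skip_special_tokens=True))
```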
- News Summarization and Evaluation in the Era of GPT-3 [73.48220043216087]
We study how GPT-3 compares against fine-tuned models trained on large summarization datasets.
We show that not only do humans overwhelmingly prefer GPT-3 summaries, prompted using only a task description, but these also do not suffer from common dataset-specific issues such as poor factuality.
arXiv Detail & Related papers (2022-09-26T01:04:52Z)
- Hybrid deep learning methods for phenotype prediction from clinical notes [4.866431869728018]
This paper proposes a novel hybrid model for automatically extracting patient phenotypes using natural language processing and deep learning models.
The proposed hybrid model is based on a neural bidirectional sequence model (BiLSTM or BiGRU) and a Convolutional Neural Network (CNN) for identifying patients' phenotypes in discharge reports.
arXiv Detail & Related papers (2021-08-16T05:57:28Z)
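The hybrid-model entry above combines a bidirectional recurrent encoder with a CNN. A rough PyTorch sketch of one such architecture is shown below; the layer sizes, pooling, and multi-label output head are illustrative choices, not the paper's exact configuration.

```python
# Rough sketch of a BiLSTM + CNN phenotype classifier (dimensions are illustrative).
import torch
import torch.nn as nn

class BiLSTMCNN(nn.Module):
    def __init__(self, vocab_size=30000, emb_dim=128, hidden=64, n_phenotypes=10):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hidden, 128, kernel_size=3, padding=1)
        self.fc = nn.Linear(128, n_phenotypes)

    def forward(self, token_ids):
        x = self.emb(token_ids)              # (batch, tokens, emb_dim)
        x, _ = self.bilstm(x)                # (batch, tokens, 2 * hidden)
        x = self.conv(x.transpose(1, 2))     # (batch, channels, tokens)
        x = torch.relu(x).max(dim=2).values  # global max pooling over time
        return self.fc(x)                    # one logit per phenotype (multi-label)

# Two dummy discharge reports of 200 token ids each.
logits = BiLSTMCNN()(torch.randint(1, 30000, (2, 200)))
```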
- An Interpretable End-to-end Fine-tuning Approach for Long Clinical Text [72.62848911347466]
Unstructured clinical text in EHRs contains crucial information for applications including decision support, trial matching, and retrospective research.
Recent work has applied BERT-based models to clinical information extraction and text classification, given these models' state-of-the-art performance in other NLP domains.
In this work, we propose a novel fine-tuning approach called SnipBERT. Instead of using entire notes, SnipBERT identifies crucial snippets and feeds them into a truncated BERT-based model in a hierarchical manner.
arXiv Detail & Related papers (2020-11-12T17:14:32Z)
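The SnipBERT entry above encodes selected snippets of a long note rather than the full text. The sketch below shows the general hierarchical idea, not the authors' SnipBERT: the note is split into fixed-length snippets (a naive stand-in for learned snippet identification), each snippet is encoded with a pretrained BERT encoder, and the pooled snippet embeddings feed a small classification head.

```python
# Generic hierarchical encoding sketch for long clinical notes; snippet
# selection here is a naive fixed-length split, unlike the paper's approach.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
head = nn.Linear(encoder.config.hidden_size, 2)  # e.g. a binary outcome

def classify_long_note(note: str, snippet_len: int = 128) -> torch.Tensor:
    words = note.split()
    snippets = [" ".join(words[i:i + snippet_len])
                for i in range(0, len(words), snippet_len)]
    batch = tok(snippets, padding=True, truncation=True,
                max_length=snippet_len, return_tensors="pt")
    with torch.no_grad():
        cls = encoder(**batch).last_hidden_state[:, 0]  # [CLS] vector per snippet
    return head(cls.mean(dim=0))  # average snippet embeddings, then classify

print(classify_long_note("Long discharge summary text ... " * 200))
```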