Speaking the Same Language: Leveraging LLMs in Standardizing Clinical Data for AI
- URL: http://arxiv.org/abs/2408.11861v1
- Date: Fri, 16 Aug 2024 20:51:21 GMT
- Title: Speaking the Same Language: Leveraging LLMs in Standardizing Clinical Data for AI
- Authors: Arindam Sett, Somaye Hashemifar, Mrunal Yadav, Yogesh Pandit, Mohsen Hejrati,
- Abstract summary: This study examines the use of large language models to address a specific challenge: the standardization of healthcare data.
Our results show that employing large language models significantly reduces the need for manual data curation.
The proposed methodology can accelerate the integration of AI in healthcare, improve the quality of patient care, and reduce the time and financial resources needed to prepare data for AI.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The implementation of Artificial Intelligence (AI) in the healthcare industry has garnered considerable attention, attributable to its prospective enhancement of clinical outcomes, expansion of access to superior healthcare, cost reduction, and elevation of patient satisfaction. Nevertheless, the primary hurdle that persists is related to the quality of accessible multi-modal healthcare data in conjunction with the evolution of AI methodologies. This study delves into the adoption of large language models to address specific challenges, specifically, the standardization of healthcare data. We advocate the use of these models to identify and map clinical data schemas to established data standard attributes, such as the Fast Healthcare Interoperability Resources. Our results illustrate that employing large language models significantly diminishes the necessity for manual data curation and elevates the efficacy of the data standardization process. Consequently, the proposed methodology has the propensity to expedite the integration of AI in healthcare, ameliorate the quality of patient care, whilst minimizing the time and financial resources necessary for the preparation of data for AI.
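The methodology the abstract describes, prompting a large language model to map fields of a clinical source schema onto FHIR attributes, can be pictured with a minimal sketch. The snippet below is an illustration only, not the authors' implementation: `call_llm` is a placeholder for whatever model endpoint is used, and the candidate `Patient` attributes, the `map_column_to_fhir` helper, and the example column `pt_dob` are all hypothetical.

```python
import json

# Hypothetical stand-in for an LLM call; the paper does not specify a client,
# so plug in whichever chat/completions endpoint you use.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("replace with a call to your model provider")

# A small subset of FHIR Patient attributes used as candidate mapping targets
# (illustrative, not exhaustive).
FHIR_PATIENT_ATTRIBUTES = [
    "Patient.birthDate",
    "Patient.gender",
    "Patient.name.family",
    "Patient.name.given",
    "Patient.address.postalCode",
]

def map_column_to_fhir(column_name: str, sample_values: list[str]) -> dict:
    """Ask the LLM which candidate FHIR attribute a source column maps to."""
    prompt = (
        "You map columns from a clinical source schema to FHIR attributes.\n"
        f"Candidate FHIR attributes: {', '.join(FHIR_PATIENT_ATTRIBUTES)}\n"
        f"Source column name: {column_name}\n"
        f"Sample values: {sample_values}\n"
        'Answer with JSON only: {"fhir_attribute": "<candidate>", "confidence": <0..1>}'
    )
    return json.loads(call_llm(prompt))

# Hypothetical usage on a column from an EHR extract:
# map_column_to_fhir("pt_dob", ["1984-02-29", "1990-11-03"])
# -> {"fhir_attribute": "Patient.birthDate", "confidence": 0.97}
```

In such a setup, mappings returned with low confidence would presumably still be routed to a human reviewer, which is consistent with the abstract's claim that large language models reduce, rather than eliminate, manual data curation.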
Related papers
- Health AI Developer Foundations [18.690656891269686]
Health AI Developer Foundations (HAI-DEF) is a suite of pre-trained, domain-specific foundation models, tools, and recipes to accelerate building Machine Learning for health applications.
Models cover various modalities and domains, including radiology (X-rays and computed tomography), histopathology, dermatological imaging, and audio.
These models provide domain-specific embeddings that facilitate AI development with less labeled data, shorter training times, and reduced computational costs.
arXiv Detail & Related papers (2024-11-22T18:51:51Z)
- The Role of Language Models in Modern Healthcare: A Comprehensive Review [2.048226951354646]
The application of large language models (LLMs) in healthcare has gained significant attention.
This review examines the trajectory of language models from their early stages to the current state-of-the-art LLMs.
arXiv Detail & Related papers (2024-09-25T12:15:15Z)
- TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z)
- STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Question-Answering [58.79671189792399]
STLLaVA-Med is designed to train a policy model capable of auto-generating medical visual instruction data.
We validate the efficacy and data efficiency of STLLaVA-Med across three major medical Visual Question Answering (VQA) benchmarks.
arXiv Detail & Related papers (2024-06-28T15:01:23Z)
- Data-Centric Foundation Models in Computational Healthcare: A Survey [21.53211505568379]
Foundation models (FMs), an emerging suite of AI techniques, have opened up a wave of opportunities in computational healthcare.
We discuss key perspectives in AI security, assessment, and alignment with human values.
We offer a promising outlook for FM-based analytics to improve patient outcomes and clinical workflows.
arXiv Detail & Related papers (2024-01-04T08:00:32Z)
- README: Bridging Medical Jargon and Lay Understanding for Patient Education through Data-Centric NLP [9.432205523734707]
We introduce a new task of automatically generating lay definitions, aiming to simplify medical terms into patient-friendly lay language.
We first created the dataset, an extensive collection of over 50,000 unique (medical term, lay definition) pairs and 300,000 mentions.
We have also engineered a data-centric Human-AI pipeline that synergizes data filtering, augmentation, and selection to improve data quality.
arXiv Detail & Related papers (2023-12-24T23:01:00Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
- Robust and Efficient Medical Imaging with Self-Supervision [80.62711706785834]
We present REMEDIS, a unified representation learning strategy to improve robustness and data-efficiency of medical imaging AI.
We study a diverse range of medical imaging tasks and simulate three realistic application scenarios using retrospective data.
arXiv Detail & Related papers (2022-05-19T17:34:18Z)
- The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation [81.72197368690031]
We present a new benchmarking suite designed specifically for medical sequential decision making.
The Medkit-Learn(ing) Environment is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.
arXiv Detail & Related papers (2021-06-08T10:38:09Z)
This list is automatically generated from the titles and abstracts of the papers indexed on this site.