Clinical Trials Ontology Engineering with Large Language Models
- URL: http://arxiv.org/abs/2412.14387v1
- Date: Wed, 18 Dec 2024 22:40:52 GMT
- Title: Clinical Trials Ontology Engineering with Large Language Models
- Authors: Berkan Çakır,
- Abstract summary: This paper proposes a simple yet effective methodology to extract and integrate clinical trial data in a cost-effective manner. Findings suggest that large language models (LLMs) are a viable option to automate this process from a cost and time perspective. This study underscores significant implications for medical research where real-time data integration from clinical trials could become the norm.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Managing clinical trial information is currently a significant challenge for the medical industry, as traditional methods are both time-consuming and costly. This paper proposes a simple yet effective methodology to extract and integrate clinical trial data in a cost-effective and time-efficient manner, allowing the medical industry to stay up to date with medical developments. It compares the time, cost, and quality of the ontologies created by humans, GPT3.5, GPT4, and Llama3 (8b & 70b). Findings suggest that large language models (LLMs) are a viable option to automate this process from both a cost and a time perspective. This study underscores significant implications for medical research where real-time data integration from clinical trials could become the norm.
Related papers
- Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models [51.91760712805404]
We introduce VivaBench, a benchmark for evaluating sequential clinical reasoning in large language models (LLMs). Our dataset consists of 1762 physician-curated clinical vignettes structured as interactive scenarios that simulate an (oral) examination in medical training. Our analysis identified several failure modes that mirror common cognitive errors in clinical practice.
arXiv Detail & Related papers (2025-10-11T16:24:35Z)
- Leveraging AI to Accelerate Medical Data Cleaning: A Comparative Study of AI-Assisted vs. Traditional Methods [3.2666593942117688]
Octozi is an artificial intelligence-assisted platform that combines large language models with domain-specifics to transform medical data review. Economic analysis of a representative Phase III oncology trial reveals potential cost savings of $5.1 million.
arXiv Detail & Related papers (2025-08-07T15:49:32Z)
- AUTOCT: Automating Interpretable Clinical Trial Prediction with LLM Agents [47.640779069547534]
AutoCT is a novel framework that combines the reasoning capabilities of large language models with the explainability of classical machine learning. We show that AutoCT performs on par with or better than SOTA methods on clinical trial prediction tasks within only a limited number of self-refinement iterations.
arXiv Detail & Related papers (2025-06-04T11:50:55Z)
- TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z)
- STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Question-Answering [58.79671189792399]
STLLaVA-Med is designed to train a policy model capable of auto-generating medical visual instruction data.
We validate the efficacy and data efficiency of STLLaVA-Med across three major medical Visual Question Answering (VQA) benchmarks.
arXiv Detail & Related papers (2024-06-28T15:01:23Z)
- PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models [4.438101430231511]
We present the first, end-to-end large-scale empirical evaluation of clinical trial matching using real-world EHRs.
Our study showcases the capability of LLMs to accurately match patients with appropriate clinical trials.
arXiv Detail & Related papers (2024-04-23T22:33:19Z)
- TrialDura: Hierarchical Attention Transformer for Interpretable Clinical Trial Duration Prediction [19.084936647082632]
We propose TrialDura, a machine learning-based method that estimates the duration of clinical trials using multimodal data.
We encode them into Bio-BERT embeddings specifically tuned for biomedical contexts to provide a deeper and more relevant semantic understanding.
Our proposed model demonstrated superior performance with a mean absolute error (MAE) of 1.04 years and a root mean square error (RMSE) of 1.39 years compared to the other models.
arXiv Detail & Related papers (2024-04-20T02:12:59Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
This work explores training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- SoftTiger: A Clinical Foundation Model for Healthcare Workflows [5.181665205189493]
We introduce SoftTiger, a clinical large language model (CLaM) designed as a foundation model for healthcare.
We collect and annotate data for three subtasks, namely, international patient summary, clinical impression and medical encounter.
We performed supervised fine-tuning of a state-of-the-art LLM using public and credentialed clinical data.
arXiv Detail & Related papers (2024-03-01T04:39:16Z)
- Scaling Clinical Trial Matching Using Large Language Models: A Case Study in Oncology [17.214664001970526]
We conduct a systematic study on scaling clinical trial matching using large language models (LLMs).
Our study is grounded in a clinical trial matching system currently in test deployment at a large U.S. health network.
arXiv Detail & Related papers (2023-08-04T07:51:15Z)
- Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
- Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
- How to Leverage Multimodal EHR Data for Better Medical Predictions? [13.401754962583771]
The complexity of electronic health record (EHR) data is a challenge for the application of deep learning.
In this paper, we first extract the accompanying clinical notes from EHR and propose a method to integrate these data.
The results on two medical prediction tasks show that our fused model with different data outperforms the state-of-the-art method.
arXiv Detail & Related papers (2021-10-29T13:26:05Z)
- MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response [58.0291320452122]
This paper aims at a unified deep learning approach to predict patient prognosis and therapy response.
We formalize the prognosis modeling as a multi-modal asynchronous time series classification task.
Our predictive model could further stratify low-risk and high-risk patients in terms of long-term survival.
arXiv Detail & Related papers (2020-10-08T15:30:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.