Related papers: Automating Adjudication of Cardiovascular Events Using Large Language Models

URL: http://arxiv.org/abs/2503.17222v2
Date: Sun, 29 Jun 2025 20:51:41 GMT
Title: Automating Adjudication of Cardiovascular Events Using Large Language Models
Authors: Sonish Sivarajkumar, Kimia Ameri, Chuqin Li, Yanshan Wang, Min Jiang,
Abstract summary: We present a novel framework for automating the adjudication of cardiovascular events in clinical trials using Large Language Models (LLMs). Using cardiovascular event-specific clinical trial data, the framework achieved an F1-score of 0.82 for event extraction and an accuracy of 0.68 for adjudication. This approach demonstrates significant potential for substantially reducing adjudication time and costs while maintaining high-quality, consistent, and auditable outcomes in clinical trials.
Score: 3.7312896556790855
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Cardiovascular events, such as heart attacks and strokes, remain a leading cause of mortality globally, necessitating meticulous monitoring and adjudication in clinical trials. This process, traditionally performed manually by clinical experts, is time-consuming, resource-intensive, and prone to inter-reviewer variability, potentially introducing bias and hindering trial progress. This study addresses these critical limitations by presenting a novel framework for automating the adjudication of cardiovascular events in clinical trials using Large Language Models (LLMs). We developed a two-stage approach: first, employing an LLM-based pipeline for event information extraction from unstructured clinical data and second, using an LLM-based adjudication process guided by a Tree of Thoughts approach and clinical endpoint committee (CEC) guidelines. Using cardiovascular event-specific clinical trial data, the framework achieved an F1-score of 0.82 for event extraction and an accuracy of 0.68 for adjudication. Furthermore, we introduce the CLEART score, a novel, automated metric specifically designed for evaluating the quality of AI-generated clinical reasoning in adjudicating cardiovascular events. This approach demonstrates significant potential for substantially reducing adjudication time and costs while maintaining high-quality, consistent, and auditable outcomes in clinical trials. The reduced variability and enhanced standardization also allow for faster identification and mitigation of risks associated with cardiovascular therapies.

Related papers

LLM-Augmented Symptom Analysis for Cardiovascular Disease Risk Prediction: A Clinical NLP [2.2615384250361004]
This study introduces a novel LLM-augmented clinical NLP pipeline that employs domain-adapted large language models for symptom extraction, contextual reasoning, and correlation from free-text reports. Evaluations on MIMIC-III and CARDIO-NLP datasets demonstrate improved performance in precision, recall, F1-score, and AUROC, with high clinical relevance.
arXiv Detail & Related papers (2025-07-15T07:32:16Z)
AUTOCT: Automating Interpretable Clinical Trial Prediction with LLM Agents [47.640779069547534]
AutoCT is a novel framework that combines the reasoning capabilities of large language models with the explainability of classical machine learning. We show that AutoCT performs on par with or better than SOTA methods on clinical trial prediction tasks within only a limited number of self-refinement iterations.
arXiv Detail & Related papers (2025-06-04T11:50:55Z)
CardioCoT: Hierarchical Reasoning for Multimodal Survival Analysis [2.668073128790639]
We propose CardioCoT, a novel two-stage hierarchical reasoning-enhanced survival analysis framework. In the first stage, we employ an evidence-augmented self-refinement mechanism to guide LLM/VLMs in generating robust hierarchical reasoning trajectories. In the second stage, we integrate the reasoning trajectories with imaging data for risk model training and prediction.
arXiv Detail & Related papers (2025-05-25T15:41:18Z)
FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data [52.55123685248105]
Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment. Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial data with high quality. This paper presents the first real-world FL benchmark for cardiovascular disease detection, named FedCVD.
arXiv Detail & Related papers (2024-10-28T02:24:01Z)
Classification of Heart Sounds Using Multi-Branch Deep Convolutional Network and LSTM-CNN [2.7699831151653305]
This study develops and evaluates novel deep learning architectures that offer fast, accurate, and cost-effective methods for automatic diagnosis of cardiac diseases. We propose two innovative methodologies: first, a Multi-Branch Deep Convolutional Neural Network (MBDCN) that emulates human auditory processing by utilizing diverse convolutional filter sizes and power spectrum input for enhanced feature extraction. Second, a Long Short-Term Memory-Convolutional Neural (LSCN) model that integrates LSTM blocks with MBDCN to improve time-domain feature extraction.
arXiv Detail & Related papers (2024-07-15T13:02:54Z)
Establishing Rigorous and Cost-effective Clinical Trials for Artificial Intelligence Models [18.240773244542474]
A profound gap persists between artificial intelligence (AI) and clinical practice in medicine. State-of-the-art and state-of-the-practice AI model evaluations are limited to laboratory studies on medical datasets or direct clinical trials with no or solely patient-centered controls. For the first time, we emphasize the critical necessity for rigorous and cost-effective evaluation methodologies for AI models in clinical practice.
arXiv Detail & Related papers (2024-07-11T14:37:08Z)
TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design. We provide basic validation methods for each task to ensure the datasets' usability and reliability. We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z)
Language Interaction Network for Clinical Trial Approval Estimation [37.60098683485169]
We introduce the Language Interaction Network (LINT), a novel approach that predicts trial outcomes using only the free-text descriptions of the trials. We have rigorously tested LINT across three phases of clinical trials, where it achieved ROC-AUC scores of 0.770, 0.740, and 0.748.
arXiv Detail & Related papers (2024-04-26T14:50:59Z)
TrialDura: Hierarchical Attention Transformer for Interpretable Clinical Trial Duration Prediction [19.084936647082632]
We propose TrialDura, a machine learning-based method that estimates the duration of clinical trials using multimodal data. We encode them into Bio-BERT embeddings specifically tuned for biomedical contexts to provide a deeper and more relevant semantic understanding. Our proposed model demonstrated superior performance with a mean absolute error (MAE) of 1.04 years and a root mean square error (RMSE) of 1.39 years compared to the other models.
arXiv Detail & Related papers (2024-04-20T02:12:59Z)
AutoTrial: Prompting Language Models for Clinical Trial Design [53.630479619856516]
We present a method named AutoTrial to aid the design of clinical eligibility criteria using language models. Experiments on over 70K clinical trials verify that AutoTrial generates high-quality criteria texts.
arXiv Detail & Related papers (2023-05-19T01:04:16Z)
Machine Learning to Support Triage of Children at Risk for Epileptic Seizures in the Pediatric Intensive Care Unit [5.708335717084799]
Epileptic seizures are relatively common in critically-ill children admitted to the pediatric intensive care unit (PICU). Children deemed at risk for seizures within the PICU are monitored using continuous-electroencephalogram (cEEG). This research aims to develop a computer aided tool to improve seizures risk assessment in critically-ill children.
arXiv Detail & Related papers (2022-05-11T10:24:58Z)
Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference [50.029659413650194]
Existing methods either require expensive feature engineering or are incapable of modeling the global dependencies among the events. In this paper, we propose a novel method, Clinical Temporal ReLation Extraction with Probabilistic Soft Logic Regularization and Global Inference.
arXiv Detail & Related papers (2020-12-16T08:23:03Z)
MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response [58.0291320452122]
This paper aims at a unified deep learning approach to predict patient prognosis and therapy response. We formalize the prognosis modeling as a multi-modal asynchronous time series classification task. Our predictive model could further stratify low-risk and high-risk patients in terms of long-term survival.
arXiv Detail & Related papers (2020-10-08T15:30:17Z)
Prediction of the onset of cardiovascular diseases from electronic health records using multi-task gated recurrent units [51.14334174570822]
We propose a multi-task recurrent neural network with attention mechanism for predicting cardiovascular events from electronic health records. The proposed approach is compared to a standard clinical risk predictor (QRISK) and machine learning alternatives using 5-year data from a NHS Foundation Trust.
arXiv Detail & Related papers (2020-07-16T17:43:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.