Related papers: TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of Clinical Trials

TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of Clinical Trials

URL: http://arxiv.org/abs/2112.08211v1
Date: Wed, 15 Dec 2021 15:36:57 GMT
Title: TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of Clinical Trials
Authors: Christopher Yacoumatos, Stefano Bragaglia, Anshul Kanakia, Nils Svang{\aa}rd, Jonathan Mangion, Claire Donoghue, Jim Weatherall, Faisal M. Khan, Khader Shameer
Abstract summary: We introduce a curated clinical trial data set compiled from the CT.gov, AACT and TrialTrove databases (n=1191 trials; representing one million patients) We then detail the mathematical basis and implementation of a selection of graph machine learning algorithms. We trained these models to predict side effect information for a clinical trial given information on the disease, existing medical conditions, and treatment.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A major impediment to successful drug development is the complexity, cost, and scale of clinical trials. The detailed internal structure of clinical trial data can make conventional optimization difficult to achieve. Recent advances in machine learning, specifically graph-structured data analysis, have the potential to enable significant progress in improving the clinical trial design. TrialGraph seeks to apply these methodologies to produce a proof-of-concept framework for developing models which can aid drug development and benefit patients. In this work, we first introduce a curated clinical trial data set compiled from the CT.gov, AACT and TrialTrove databases (n=1191 trials; representing one million patients) and describe the conversion of this data to graph-structured formats. We then detail the mathematical basis and implementation of a selection of graph machine learning algorithms, which typically use standard machine classifiers on graph data embedded in a low-dimensional feature space. We trained these models to predict side effect information for a clinical trial given information on the disease, existing medical conditions, and treatment. The MetaPath2Vec algorithm performed exceptionally well, with standard Logistic Regression, Decision Tree, Random Forest, Support Vector, and Neural Network classifiers exhibiting typical ROC-AUC scores of 0.85, 0.68, 0.86, 0.80, and 0.77, respectively. Remarkably, the best performing classifiers could only produce typical ROC-AUC scores of 0.70 when trained on equivalent array-structured data. Our work demonstrates that graph modelling can significantly improve prediction accuracy on appropriate datasets. Successive versions of the project that refine modelling assumptions and incorporate more data types can produce excellent predictors with real-world applications in drug development.

Related papers

Improving Cardiac Risk Prediction Using Data Generation Techniques [37.94487163156369]
This work proposes an architecture for the synthesis of realistic clinical records that are coherent with real-world observations.<n>The primary objective is to increase the size and diversity of the available datasets in order to enhance the performance of cardiac risk prediction models.
arXiv Detail & Related papers (2025-12-19T10:17:00Z)
A Comprehensive Machine Learning Framework for Heart Disease Prediction: Performance Evaluation and Future Perspectives [0.0]
This study presents a machine learning-based framework for heart disease prediction using the heart-disease dataset.<n>The proposed model demonstrates strong potential for aiding clinical decision-making by effectively predicting heart disease.
arXiv Detail & Related papers (2025-05-15T05:13:38Z)
Zero-shot Medical Event Prediction Using a Generative Pre-trained Transformer on Electronic Health Records [8.575985305475355]
We show that a foundation model trained on EHRs can perform predictive tasks in a zero-shot manner. Unlike supervised approaches requiring extensive labeled data, our method enables the model to forecast a next medical event purely from a pretraining knowledge.
arXiv Detail & Related papers (2025-03-07T19:26:47Z)
Advancing clinical trial outcomes using deep learning and predictive modelling: bridging precision medicine and patient-centered care [0.0]
Deep learning and predictive modelling have emerged as transformative tools for optimizing clinical trial design, patient recruitment, and real-time monitoring. This study explores the application of deep learning techniques, such as convolutional neural networks [CNNs] and transformerbased models, to stratify patients. Predictive modelling approaches, including survival analysis and time-series forecasting, are employed to predict trial outcomes, enhancing efficiency and reducing trial failure rates.
arXiv Detail & Related papers (2024-12-09T23:20:08Z)
Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective [32.93871326428446]
Recent advances in artificial intelligence (AI) are revolutionizing medical imaging and computational pathology. A constant challenge in the analysis of digital Whole Slide Images (WSIs) is the problem of aggregating tens of thousands of tile-level image embeddings to a slide-level representation. This study conducts a benchmarking analysis of ten slide-level aggregation techniques across nine clinically relevant tasks.
arXiv Detail & Related papers (2024-07-10T17:00:57Z)
TRIALSCOPE: A Unifying Causal Framework for Scaling Real-World Evidence Generation with Biomedical Language Models [22.046231408373522]
We present TRIALSCOPE, a unifying framework for distilling real-world evidence from observational data. We show that TRIALSCOPE can produce high-quality structuring of real-world data and generates comparable results to marquee cancer trials.
arXiv Detail & Related papers (2023-11-02T15:15:47Z)
TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment. In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials. We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
Foresight -- Deep Generative Modelling of Patient Timelines using Electronic Health Records [46.024501445093755]
Temporal modelling of medical history can be used to forecast and simulate future events, estimate risk, suggest alternative diagnoses or forecast complications. We present Foresight, a novel GPT3-based pipeline that uses NER+L tools (i.e. MedCAT) to convert document text into structured, coded concepts.
arXiv Detail & Related papers (2022-12-13T19:06:00Z)
Unsupervised pre-training of graph transformers on patient population graphs [48.02011627390706]
We propose a graph-transformer-based network to handle heterogeneous clinical data. We show the benefit of our pre-training method in a self-supervised and a transfer learning setting.
arXiv Detail & Related papers (2022-07-21T16:59:09Z)
Deep Learning with Heterogeneous Graph Embeddings for Mortality Prediction from Electronic Health Records [2.2859570135269625]
We train a Heterogeneous Graph Model (HGM) on Electronic Health Record data and use the resulting embedding vector as additional information added to a Conal Neural Network (CNN) model for predicting in-hospital mortality. We find that adding HGM to a CNN model increases the mortality prediction accuracy up to 4%.
arXiv Detail & Related papers (2020-12-28T02:27:09Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
"Name that manufacturer". Relating image acquisition bias with task complexity when training deep learning models: experiments on head CT [0.0]
We analyze how the distribution of scanner manufacturers in a dataset can contribute to the overall bias of deep learning models. We demonstrate that CNNs can learn to distinguish the imaging scanner manufacturer and that this bias can substantially impact model performance.
arXiv Detail & Related papers (2020-08-19T16:05:58Z)
Trajectories, bifurcations and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values. The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)
Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
COVID-19 pandemics has challenged emergency response systems worldwide, with widespread reports of essential services breakdown and collapse of health care structure. This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients. Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios. Our results show that using 85% lesser labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.