TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of
Clinical Trials
- URL: http://arxiv.org/abs/2112.08211v1
- Date: Wed, 15 Dec 2021 15:36:57 GMT
- Title: TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of
Clinical Trials
- Authors: Christopher Yacoumatos, Stefano Bragaglia, Anshul Kanakia, Nils
Svang{\aa}rd, Jonathan Mangion, Claire Donoghue, Jim Weatherall, Faisal M.
Khan, Khader Shameer
- Abstract summary: We introduce a curated clinical trial data set compiled from the CT.gov, AACT and TrialTrove databases (n=1191 trials; representing one million patients)
We then detail the mathematical basis and implementation of a selection of graph machine learning algorithms.
We trained these models to predict side effect information for a clinical trial given information on the disease, existing medical conditions, and treatment.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A major impediment to successful drug development is the complexity, cost,
and scale of clinical trials. The detailed internal structure of clinical trial
data can make conventional optimization difficult to achieve. Recent advances
in machine learning, specifically graph-structured data analysis, have the
potential to enable significant progress in improving the clinical trial
design. TrialGraph seeks to apply these methodologies to produce a
proof-of-concept framework for developing models which can aid drug development
and benefit patients. In this work, we first introduce a curated clinical trial
data set compiled from the CT.gov, AACT and TrialTrove databases (n=1191
trials; representing one million patients) and describe the conversion of this
data to graph-structured formats. We then detail the mathematical basis and
implementation of a selection of graph machine learning algorithms, which
typically use standard machine classifiers on graph data embedded in a
low-dimensional feature space. We trained these models to predict side effect
information for a clinical trial given information on the disease, existing
medical conditions, and treatment. The MetaPath2Vec algorithm performed
exceptionally well, with standard Logistic Regression, Decision Tree, Random
Forest, Support Vector, and Neural Network classifiers exhibiting typical
ROC-AUC scores of 0.85, 0.68, 0.86, 0.80, and 0.77, respectively. Remarkably,
the best performing classifiers could only produce typical ROC-AUC scores of
0.70 when trained on equivalent array-structured data. Our work demonstrates
that graph modelling can significantly improve prediction accuracy on
appropriate datasets. Successive versions of the project that refine modelling
assumptions and incorporate more data types can produce excellent predictors
with real-world applications in drug development.
Related papers
- Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective [32.93871326428446]
Recent advances in artificial intelligence (AI) are revolutionizing medical imaging and computational pathology.
A constant challenge in the analysis of digital Whole Slide Images (WSIs) is the problem of aggregating tens of thousands of tile-level image embeddings to a slide-level representation.
This study conducts a benchmarking analysis of ten slide-level aggregation techniques across nine clinically relevant tasks.
arXiv Detail & Related papers (2024-07-10T17:00:57Z) - TRIALSCOPE: A Unifying Causal Framework for Scaling Real-World Evidence
Generation with Biomedical Language Models [22.046231408373522]
We present TRIALSCOPE, a unifying framework for distilling real-world evidence from observational data.
We show that TRIALSCOPE can produce high-quality structuring of real-world data and generates comparable results to marquee cancer trials.
arXiv Detail & Related papers (2023-11-02T15:15:47Z) - TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic
Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z) - Foresight -- Deep Generative Modelling of Patient Timelines using
Electronic Health Records [46.024501445093755]
Temporal modelling of medical history can be used to forecast and simulate future events, estimate risk, suggest alternative diagnoses or forecast complications.
We present Foresight, a novel GPT3-based pipeline that uses NER+L tools (i.e. MedCAT) to convert document text into structured, coded concepts.
arXiv Detail & Related papers (2022-12-13T19:06:00Z) - Unsupervised pre-training of graph transformers on patient population
graphs [48.02011627390706]
We propose a graph-transformer-based network to handle heterogeneous clinical data.
We show the benefit of our pre-training method in a self-supervised and a transfer learning setting.
arXiv Detail & Related papers (2022-07-21T16:59:09Z) - Deep Learning with Heterogeneous Graph Embeddings for Mortality
Prediction from Electronic Health Records [2.2859570135269625]
We train a Heterogeneous Graph Model (HGM) on Electronic Health Record data and use the resulting embedding vector as additional information added to a Conal Neural Network (CNN) model for predicting in-hospital mortality.
We find that adding HGM to a CNN model increases the mortality prediction accuracy up to 4%.
arXiv Detail & Related papers (2020-12-28T02:27:09Z) - Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z) - "Name that manufacturer". Relating image acquisition bias with task
complexity when training deep learning models: experiments on head CT [0.0]
We analyze how the distribution of scanner manufacturers in a dataset can contribute to the overall bias of deep learning models.
We demonstrate that CNNs can learn to distinguish the imaging scanner manufacturer and that this bias can substantially impact model performance.
arXiv Detail & Related papers (2020-08-19T16:05:58Z) - Trajectories, bifurcations and pseudotime in large clinical datasets:
applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values.
The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z) - Hemogram Data as a Tool for Decision-making in COVID-19 Management:
Applications to Resource Scarcity Scenarios [62.997667081978825]
COVID-19 pandemics has challenged emergency response systems worldwide, with widespread reports of essential services breakdown and collapse of health care structure.
This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients.
Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z) - Self-Training with Improved Regularization for Sample-Efficient Chest
X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that using 85% lesser labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.