Machine Learning to Promote Translational Research: Predicting Patent
and Clinical Trial Inclusion in Dementia Research
- URL: http://arxiv.org/abs/2401.05145v1
- Date: Wed, 10 Jan 2024 13:25:49 GMT
- Title: Machine Learning to Promote Translational Research: Predicting Patent
and Clinical Trial Inclusion in Dementia Research
- Authors: Matilda Beinat, Julian Beinat, Mohammed Shoaib, Jorge Gomez Magenti
- Abstract summary: Projected to impact 1.6 million people in the UK by 2040 and costing £25 billion annually, dementia presents a growing challenge to society.
We used the Dimensions database to extract data from 43,091 UK dementia research publications between the years 1990-2023.
For patent predictions, the best model achieved an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.84 and 77.17% accuracy; for clinical trial predictions, it achieved an AUROC of 0.81 and 75.11% accuracy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Projected to impact 1.6 million people in the UK by 2040 and costing
£25 billion annually, dementia presents a growing challenge to society.
This study, a pioneering effort to predict the translational potential of
dementia research using machine learning, hopes to address the slow translation
of fundamental discoveries into practical applications despite dementia's
significant societal and economic impact. We used the Dimensions database to
extract data from 43,091 UK dementia research publications between the years
1990-2023, specifically metadata (authors, publication year etc.), concepts
mentioned in the paper, and the paper abstract. To prepare the data for machine
learning, we applied methods such as one-hot encoding and/or word embeddings. We
trained a CatBoost Classifier to predict if a publication will be cited in a
future patent or clinical trial. We trained several model variations. The model
combining metadata, concept, and abstract embeddings yielded the highest
performance: for patent predictions, an Area Under the Receiver Operating
Characteristic Curve (AUROC) of 0.84 and 77.17% accuracy; for clinical trial
predictions, an AUROC of 0.81 and 75.11% accuracy. The results demonstrate that
integrating machine learning within current research methodologies can uncover
overlooked publications, expediting the identification of promising research
and potentially transforming dementia research by predicting real-world impact
and guiding translational strategies.
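The abstract names one-hot encoding as a preprocessing step and reports AUROC and accuracy as evaluation metrics. The sketch below illustrates those three ideas in plain Python. It is not the authors' pipeline: it does not train a CatBoost model, the function names (`one_hot`, `auroc`, `accuracy`) are hypothetical, and the labels and scores are made-up toy values rather than results from the paper.

```python
from __future__ import annotations

from typing import Sequence


def one_hot(values: Sequence[str]) -> tuple[list[str], list[list[int]]]:
    """One-hot encode a categorical column; returns (categories, rows)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive example is scored
    above a random negative one (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (hypothetical): 1 = publication later cited in a patent.
toy_labels = [1, 0, 1, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]

# Hypothetical metadata column to encode.
categories, encoded = one_hot(["article", "review", "article"])

toy_auroc = auroc(toy_labels, toy_scores)
toy_accuracy = accuracy(toy_labels, toy_scores)
```

The pairwise form of AUROC is quadratic in the number of examples, so it suits small illustrations only; a rank-based implementation would be the usual choice for a dataset the size of the one described above.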
Related papers
- Systematic Review: Text Processing Algorithms in Machine Learning and Deep Learning for Mental Health Detection on Social Media [0.037693031068634524]
This systematic review evaluates machine learning models for depression detection on social media.
Significant biases impacting model reliability and generalizability were found.
Only 23% of studies explicitly addressed linguistic nuances like negations, crucial for accurate sentiment analysis.
arXiv Detail & Related papers (2024-10-21T17:05:50Z)
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z)
- TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
- Machine Learning Techniques for Predicting the Short-Term Outcome of Resective Surgery in Lesional-Drug Resistance Epilepsy [1.759008116536278]
Seven different categorization algorithms were used to analyze the data.
The support vector machine (SVM) with the linear kernel yielded 76.1% in terms of accuracy.
arXiv Detail & Related papers (2023-02-10T13:04:47Z)
- Research Trends and Applications of Data Augmentation Algorithms [77.34726150561087]
We identify the main areas of application of data augmentation algorithms, the types of algorithms used, significant research trends, their progression over time and research gaps in data augmentation literature.
We expect readers to understand the potential of data augmentation, as well as identify future research directions and open questions within data augmentation research.
arXiv Detail & Related papers (2022-07-18T11:38:32Z)
- Deep forecasting of translational impact in medical research [1.8130872753848115]
We develop a suite of representational and discriminative mathematical models of multi-scale publication data.
We show that citations are only moderately predictive of translational impact as judged by inclusion in patents, guidelines, or policy documents.
We argue that content-based models of impact are superior in performance to conventional, citation-based measures.
arXiv Detail & Related papers (2021-10-17T19:29:41Z)
- Machine learning for modeling the progression of Alzheimer disease dementia using clinical data: a systematic literature review [2.8136734847819773]
Alzheimer disease (AD) is the most common cause of dementia, a syndrome characterized by cognitive impairment severe enough to interfere with activities of daily life.
We searched for articles published between January 1, 2010, and May 31, 2020, in PubMed, Scopus, ScienceDirect, IEEE Xplore Digital Library, Association for Computing Machinery Digital Library, and arXiv.
We used predefined criteria to select relevant articles and summarized them according to key components of ML analysis such as data characteristics, computational algorithms, and research focus.
arXiv Detail & Related papers (2021-08-05T04:38:47Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best of various state-of-the-art baselines by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Artificial Intelligence, speech and language processing approaches to monitoring Alzheimer's Disease: a systematic review [5.635607414700482]
This paper summarises current findings on the use of artificial intelligence, speech and language processing to predict cognitive decline in Alzheimer's Disease.
We conducted a systematic review of original research between 2000 and 2019 registered in PROSPERO.
arXiv Detail & Related papers (2020-10-12T21:43:04Z)