MuCoMiD: A Multitask Convolutional Learning Framework for miRNA-Disease
Association Prediction
- URL: http://arxiv.org/abs/2108.04820v1
- Date: Sun, 8 Aug 2021 10:01:46 GMT
- Title: MuCoMiD: A Multitask Convolutional Learning Framework for miRNA-Disease
Association Prediction
- Authors: Thi Ngan Dong and Megha Khosla
- Abstract summary: We propose a novel multi-tasking convolution-based approach, which we refer to as MuCoMiD.
MuCoMiD allows automatic feature extraction while incorporating knowledge from 4 heterogeneous biological information sources.
We construct large-scale experiments on standard benchmark datasets as well as our proposed larger independent test sets and case studies.
MuCoMiD shows an improvement of at least 5% in 5-fold CV evaluation on HMDDv2.0 and HMDDv3.0 datasets and at least 49% on larger independent test sets with unseen miRNA and diseases over state-of-the-art approaches.
- Score: 0.4061135251278187
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Growing evidence from recent studies implies that microRNA or miRNA could
serve as biomarkers in various complex human diseases. Since wet-lab
experiments are expensive and time-consuming, computational techniques for
miRNA-disease association prediction have attracted a lot of attention in
recent years. Data scarcity is one of the major challenges in building reliable
machine learning models. Data scarcity combined with the use of pre-calculated
hand-crafted input features has led to problems of overfitting and data
leakage.
We overcome the limitations of existing works by proposing a novel
multi-tasking convolution-based approach, which we refer to as MuCoMiD. MuCoMiD
allows automatic feature extraction while incorporating knowledge from 4
heterogeneous biological information sources (interactions between
miRNA/diseases and protein-coding genes (PCG), miRNA family information, and
disease ontology) in a multi-task setting which is a novel perspective and has
not been studied before. The use of multi-channel convolutions allows us to
extract expressive representations while keeping the model linear and,
therefore, simple. To effectively test the generalization capability of our
model, we construct large-scale experiments on standard benchmark datasets as
well as our proposed larger independent test sets and case studies. MuCoMiD
shows an improvement of at least 5% in 5-fold CV evaluation on HMDDv2.0 and
HMDDv3.0 datasets and at least 49% on larger independent test sets with unseen
miRNA and diseases over state-of-the-art approaches. We share our code for
reproducibility and future research at
https://git.l3s.uni-hannover.de/dong/cmtt.
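The two evaluation settings named in the abstract (5-fold CV and independent test sets with unseen miRNA and diseases) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy identifiers (`mir-0`, `dis-0`), and the triple layout `(mirna, disease, label)` are all assumptions made for illustration only.

```python
import random


def k_fold_pairs(pairs, k=5, seed=0):
    """Yield (train, test) lists for k-fold CV over (mirna, disease, label) triples."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled pairs round-robin into k folds.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test


def unseen_entity_split(pairs, held_out_mirnas, held_out_diseases):
    """Split so that every test pair uses a miRNA and a disease absent from training."""
    test = [p for p in pairs
            if p[0] in held_out_mirnas and p[1] in held_out_diseases]
    train = [p for p in pairs
             if p[0] not in held_out_mirnas and p[1] not in held_out_diseases]
    return train, test


# Toy data with hypothetical identifiers: 5 miRNAs x 4 diseases = 20 labeled pairs.
pairs = [(f"mir-{i}", f"dis-{j}", (i + j) % 2) for i in range(5) for j in range(4)]

for fold_id, (train, test) in enumerate(k_fold_pairs(pairs, k=5)):
    print(fold_id, len(train), len(test))

train, test = unseen_entity_split(pairs, {"mir-0"}, {"dis-0"})
print(len(train), len(test))
```

In plain k-fold CV over pairs, the same miRNA or disease can appear in both training and test folds. The second split removes every pair that touches a held-out entity from training, which is one way to probe generalization to unseen miRNA and diseases.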
Related papers
- Character-level Tokenizations as Powerful Inductive Biases for RNA Foundational Models [0.0]
Understanding and predicting RNA behavior is a challenge due to the complexity of RNA structures and interactions.
Current RNA models have yet to match the performance observed in the protein domain.
ChaRNABERT is able to reach state-of-the-art performance on several tasks in established benchmarks.
arXiv Detail & Related papers (2024-11-05T21:56:16Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- DOCTOR: A Multi-Disease Detection Continual Learning Framework Based on Wearable Medical Sensors [3.088223994180069]
We propose DOCTOR, a multi-disease detection continual learning framework based on wearable medical sensors (WMSs).
It employs a multi-headed deep neural network (DNN) and a replay-style CL algorithm.
It achieves 1.43 times better average test accuracy, 1.25 times better F1-score, and 0.41 higher backward transfer than the naive fine-tuning framework.
arXiv Detail & Related papers (2023-05-09T19:33:17Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- Coupling Deep Imputation with Multitask Learning for Downstream Tasks on Genomics Data [0.0]
In this paper we investigate how imputing data with missing values using deep learning and multitask learning can help to reach state-of-the-art performance results.
We propose a generalised deep imputation method to impute values where a patient has all modalities present except one.
In contrast, when using all modalities for survival prediction we observe that multitask learning alone outperforms deep imputation alone with statistical significance.
arXiv Detail & Related papers (2022-04-28T09:48:15Z)
- Deep neural networks approach to microbial colony detection -- a comparative analysis [52.77024349608834]
This study investigates the performance of three deep learning approaches for object detection on the AGAR dataset.
The achieved results may serve as a benchmark for future experiments.
arXiv Detail & Related papers (2021-08-23T12:06:00Z)
- Deep Learning in current Neuroimaging: a multivariate approach with power and type I error control but arguable generalization ability [0.158310730488265]
A non-parametric framework is proposed that estimates the statistical significance of classifications using deep learning architectures.
A label permutation test is proposed in both studies using cross-validation (CV) and resubstitution with upper bound correction (RUB) as validation methods.
We found in the permutation test that CV and RUB methods offer a false positive rate close to the significance level and an acceptable statistical power.
arXiv Detail & Related papers (2021-03-30T21:15:39Z)
- Federated Deep AUC Maximization for Heterogeneous Data with a Constant Communication Complexity [77.78624443410216]
We propose improved FDAM algorithms for detecting heterogeneous chest data.
A result of this paper is that the communication of the proposed algorithm is strongly independent of the number of machines and also independent of the accuracy level.
Experiments have demonstrated the effectiveness of our FDAM algorithm on benchmark datasets and on medical chest Xray images from different organizations.
arXiv Detail & Related papers (2021-02-09T04:05:19Z)
- Graph Convolution Networks Using Message Passing and Multi-Source Similarity Features for Predicting circRNA-Disease Association [5.423563861462909]
We propose a graph convolution network framework to learn features from a graph built with multi-source similarity information to predict circRNA-disease associations.
The proposed framework, evaluated with five-fold cross-validation across various experiments, shows promising results in predicting circRNA-disease associations.
arXiv Detail & Related papers (2020-09-15T15:22:42Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)