Related papers: MuCoMiD: A Multitask Convolutional Learning Framework for miRNA-Disease Association Prediction

URL: http://arxiv.org/abs/2108.04820v1
Date: Sun, 8 Aug 2021 10:01:46 GMT
Title: MuCoMiD: A Multitask Convolutional Learning Framework for miRNA-Disease Association Prediction
Authors: Thi Ngan Dong and Megha Khosla
Abstract summary: We propose a novel multi-tasking convolution-based approach, which we refer to as MuCoMiD. MuCoMiD allows automatic feature extraction while incorporating knowledge from 4 heterogeneous biological information sources. We construct large-scale experiments on standard benchmark datasets as well as our proposed larger independent test sets and case studies. MuCoMiD shows an improvement of at least 5% in 5-fold CV evaluation on HMDDv2.0 and HMDDv3.0 datasets and at least 49% on larger independent test sets with unseen miRNA and diseases over state-of-the-art approaches.
Score: 0.4061135251278187
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Growing evidence from recent studies implies that microRNA or miRNA could serve as biomarkers in various complex human diseases. Since wet-lab experiments are expensive and time-consuming, computational techniques for miRNA-disease association prediction have attracted a lot of attention in recent years. Data scarcity is one of the major challenges in building reliable machine learning models. Data scarcity combined with the use of pre-calculated hand-crafted input features has led to problems of overfitting and data leakage. We overcome the limitations of existing works by proposing a novel multi-tasking convolution-based approach, which we refer to as MuCoMiD. MuCoMiD allows automatic feature extraction while incorporating knowledge from 4 heterogeneous biological information sources (interactions between miRNA/diseases and protein-coding genes (PCG), miRNA family information, and disease ontology) in a multi-task setting which is a novel perspective and has not been studied before. The use of multi-channel convolutions allows us to extract expressive representations while keeping the model linear and, therefore, simple. To effectively test the generalization capability of our model, we construct large-scale experiments on standard benchmark datasets as well as our proposed larger independent test sets and case studies. MuCoMiD shows an improvement of at least 5% in 5-fold CV evaluation on HMDDv2.0 and HMDDv3.0 datasets and at least 49% on larger independent test sets with unseen miRNA and diseases over state-of-the-art approaches. We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/cmtt.

Related papers

GBDTSVM: Combined Support Vector Machine and Gradient Boosting Decision Tree Framework for efficient snoRNA-disease association prediction [0.0]
This paper proposes a model called 'GBDTSVM', representing a novel and efficient machine learning approach for predicting snoRNA-disease associations. 'GBDTSVM' effectively extracts integrated snoRNA-disease feature representations utilizing GBDT and SVM. Experimental evaluation of the GBDTSVM model demonstrated superior performance compared to state-of-the-art methods in the field.
arXiv Detail & Related papers (2025-05-10T06:46:29Z)
scMamba: A Pre-Trained Model for Single-Nucleus RNA Sequencing Analysis in Neurodegenerative Disorders [43.24785083027205]
scMamba is a pre-trained model designed to improve the quality and utility of snRNA-seq analysis. Inspired by the recent Mamba model, scMamba introduces a novel architecture that incorporates a linear adapter layer, gene embeddings, and bidirectional Mamba blocks. We demonstrate that scMamba outperforms benchmark methods in various downstream tasks, including cell type annotation, doublet detection, imputation, and the identification of differentially expressed genes.
arXiv Detail & Related papers (2025-02-12T11:48:22Z)
scBIT: Integrating Single-cell Transcriptomic Data into fMRI-based Prediction for Alzheimer's Disease Diagnosis [24.268703526039367]
scBIT is a novel method for enhancing Alzheimer's disease (AD) prediction by combining fMRI with single-nucleus RNA (snRNA). It employs a sampling strategy to segment snRNA data into cell-type-specific gene networks and utilizes a self-explainable graph neural network to extract critical subgraphs. Extensive experiments validate scBIT's effectiveness in revealing intricate brain region-gene associations.
arXiv Detail & Related papers (2025-02-04T18:37:46Z)
Character-level Tokenizations as Powerful Inductive Biases for RNA Foundational Models [0.0]
Understanding and predicting RNA behavior is a challenge due to the complexity of RNA structures and interactions. Current RNA models have yet to match the performance observed in the protein domain. ChaRNABERT is able to reach state-of-the-art performance on several tasks in established benchmarks.
arXiv Detail & Related papers (2024-11-05T21:56:16Z)
The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation. We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare. Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
DOCTOR: A Multi-Disease Detection Continual Learning Framework Based on Wearable Medical Sensors [3.088223994180069]
We propose DOCTOR, a multi-disease detection continual learning framework based on wearable medical sensors (WMSs). It employs a multi-headed deep neural network (DNN) and a replay-style CL algorithm. It achieves 1.43 times better average test accuracy, 1.25 times better F1-score, and 0.41 higher backward transfer than the naive fine-tuning framework.
arXiv Detail & Related papers (2023-05-09T19:33:17Z)
Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation. Deep learning models have emerged as an efficient way to discover synergistic combinations. Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
Coupling Deep Imputation with Multitask Learning for Downstream Tasks on Genomics Data [0.0]
In this paper we investigate how imputing data with missing values using deep learning and multitask learning can help to reach state-of-the-art performance results. We propose a generalised deep imputation method to impute values where a patient has all modalities present except one. In contrast, when using all modalities for survival prediction we observe that multitask learning alone outperforms deep imputation alone with statistical significance.
arXiv Detail & Related papers (2022-04-28T09:48:15Z)
Deep neural networks approach to microbial colony detection -- a comparative analysis [52.77024349608834]
This study investigates the performance of three deep learning approaches for object detection on the AGAR dataset. The achieved results may serve as a benchmark for future experiments.
arXiv Detail & Related papers (2021-08-23T12:06:00Z)
Deep Learning in current Neuroimaging: a multivariate approach with power and type I error control but arguable generalization ability [0.158310730488265]
A non-parametric framework is proposed that estimates the statistical significance of classifications using deep learning architectures. A label permutation test is proposed in both studies using cross-validation (CV) and resubstitution with upper bound correction (RUB) as validation methods. We found in the permutation test that CV and RUB methods offer a false positive rate close to the significance level and an acceptable statistical power.
arXiv Detail & Related papers (2021-03-30T21:15:39Z)
Federated Deep AUC Maximization for Heterogeneous Data with a Constant Communication Complexity [77.78624443410216]
We propose improved FDAM algorithms for detecting heterogeneous chest data. A result of this paper is that the communication of the proposed algorithm is strongly independent of the number of machines and also independent of the accuracy level. Experiments have demonstrated the effectiveness of our FDAM algorithm on benchmark datasets and on medical chest X-ray images from different organizations.
arXiv Detail & Related papers (2021-02-09T04:05:19Z)
Graph Convolution Networks Using Message Passing and Multi-Source Similarity Features for Predicting circRNA-Disease Association [5.423563861462909]
We propose a graph convolution network framework to learn features from a graph built with multi-source similarity information to predict circRNA-disease associations. The proposed framework, evaluated with five-fold cross-validation across various experiments, shows promising results in predicting circRNA-disease associations.
arXiv Detail & Related papers (2020-09-15T15:22:42Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta-learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques. We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.