A robust kernel machine regression towards biomarker selection in
multi-omics datasets of osteoporosis for drug discovery
- URL: http://arxiv.org/abs/2201.05060v1
- Date: Thu, 13 Jan 2022 16:39:46 GMT
- Title: A robust kernel machine regression towards biomarker selection in
multi-omics datasets of osteoporosis for drug discovery
- Authors: Md Ashad Alam and Hui Shen and Hong-Wen Deng
- Abstract summary: We propose "robust kernel machine regression (RobKMR)" to improve the robustness of statistical machine regression and the diversity of fictional data.
Experiments demonstrate that the proposed approach effectively identifies the inter-related risk factors of osteoporosis.
The proposed approach can be applied to any disease model where multi-omics datasets are available.
- Score: 2.2897244874280043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many statistical machine approaches could ultimately highlight novel features
of the etiology of complex diseases by analyzing multi-omics data. However,
they are sensitive to some deviations in distribution when the observed samples
are potentially contaminated with adversarially corrupted outliers (e.g., a
fictional data distribution). Likewise, statistical advances lag in supporting
comprehensive data-driven analyses of complex multi-omics data integration. We
propose a novel non-linear M-estimator-based approach, "robust kernel machine
regression (RobKMR)," to improve the robustness of statistical machine
regression and the diversity of fictional data to examine the higher-order
composite effect of multi-omics datasets. We address a robust kernel-centered
Gram matrix to estimate the model parameters accurately. We also propose a
robust score test to assess the marginal and joint Hadamard product of features
from multi-omics data. We apply our proposed approach to a multi-omics dataset
of osteoporosis (OP) from Caucasian females. Experiments demonstrate that the
proposed approach effectively identifies the inter-related risk factors of OP.
With solid evidence (p-value = 0.00001), biological validations, network-based
analysis, causal inference, and drug repurposing, the selected three triplets
((DKK1, SMTN, DRGX), (MTND5, FASTKD2, CSMD3), (MTND5, COG3, CSMD3)) are
significant biomarkers and directly relate to BMD. Overall, the top three
selected genes (DKK1, MTND5, FASTKD2) and one gene (SIDT1 at p-value = 0.001)
significantly bond with four drugs (Tacrolimus, Ibandronate, Alendronate, and
Bazedoxifene) out of 30 candidates for drug repurposing in OP. Further, the
proposed approach can be applied to any disease model where multi-omics
datasets are available.
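The abstract names three building blocks: an M-estimator, a robust kernel-centered Gram matrix, and a Hadamard product of features across omics views. The following is a minimal illustrative sketch of how such pieces can fit together, not the authors' RobKMR implementation. It assumes a Gaussian kernel, a Huber-type weight function with a median-based threshold, and synthetic data; all function names and the two "views" are hypothetical.

```python
import numpy as np

def gaussian_gram(X, sigma=2.0):
    """Gaussian Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def robust_mean_weights(K, n_iter=100, tol=1e-8):
    """Huber-type M-estimate of the kernel mean element via iteratively
    reweighted least squares. Returns weights w (summing to 1) such that the
    robust mean is sum_i w[i] * k(., x_i). The threshold is an assumption:
    the median distance to the ordinary (uniform-weight) kernel mean."""
    n = K.shape[0]
    w = np.full(n, 1.0 / n)
    thr = None
    for _ in range(n_iter):
        # RKHS distance of each sample to the current weighted mean
        d2 = np.diag(K) - 2.0 * K @ w + w @ K @ w
        d = np.sqrt(np.maximum(d2, 1e-12))
        if thr is None:
            thr = np.median(d)
        # Huber weight psi(d) / d: 1 inside the threshold, thr / d outside
        u = np.where(d <= thr, 1.0, thr / d)
        w_new = u / u.sum()
        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break
    return w

def robust_centered_gram(K, w):
    """Center K around the weighted mean: (I - 1 w^T) K (I - 1 w^T)^T."""
    n = K.shape[0]
    H = np.eye(n) - np.outer(np.ones(n), w)
    return H @ K @ H.T

# Synthetic stand-ins for two omics views measured on the same 50 samples;
# the first 5 samples of view 1 are shifted to act as outliers.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(50, 3))
X1[:5] += 8.0
X2 = rng.normal(size=(50, 3))

K1, K2 = gaussian_gram(X1), gaussian_gram(X2)
w = robust_mean_weights(K1)
K_c = robust_centered_gram(K1, w)

# Pairwise interaction term: Hadamard (elementwise) product of the
# robustly centered Gram matrices of the two views.
K_joint = K_c * robust_centered_gram(K2, robust_mean_weights(K2))

print("mean weight, outliers:", w[:5].mean())
print("mean weight, inliers: ", w[5:].mean())
```

In this sketch the contaminated samples receive smaller weights than the rest, so they contribute less to the centering step. Because the elementwise product of two positive semidefinite matrices is again positive semidefinite, `K_joint` can itself be used as a kernel matrix for an interaction effect.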
Related papers
- Self-Normalizing Foundation Model for Enhanced Multi-Omics Data Analysis in Oncology [0.0]
SeNMo is a foundation model that has been trained on multi-omics data across 33 cancer types.
We trained SeNMo for the task of overall survival of patients using pan-cancer multi-omics data involving 33 cancer sites.
SeNMo was validated on two independent cohorts: Moffitt Cancer Center and CPTAC lung squamous cell carcinoma.
arXiv Detail & Related papers (2024-05-13T22:45:44Z)
- DF-DM: A foundational process model for multimodal data fusion in the artificial intelligence era [3.2549142515720044]
This paper introduces a new process model for multimodal Data Fusion for Data Mining.
Our model aims to decrease computational costs, complexity, and bias while improving efficiency and reliability.
We demonstrate its efficacy through three use cases: predicting diabetic retinopathy using retinal images and patient metadata, domestic violence prediction employing satellite imagery, internet, and census data, and identifying clinical and demographic features from radiography images and clinical notes.
arXiv Detail & Related papers (2024-04-18T15:52:42Z)
- XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
However, acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z)
- Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework [89.8609061423685]
We propose an information-theoretic approach to quantify the degree of redundancy, uniqueness, and synergy relating input modalities with an output task.
To validate PID estimation, we conduct extensive experiments on both synthetic datasets where the PID is known and on large-scale multimodal benchmarks.
We demonstrate their usefulness in (1) quantifying interactions within multimodal datasets, (2) quantifying interactions captured by multimodal models, (3) principled approaches for model selection, and (4) three real-world case studies.
arXiv Detail & Related papers (2023-02-23T18:59:05Z)
- Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data [0.8029049649310213]
We propose a framework called Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data (fiBAG).
fiBAG allows simultaneous identification of upstream functional evidence of proteogenomic biomarkers.
We demonstrate the profitability of fiBAG via a pan-cancer analysis of 14 cancer types.
arXiv Detail & Related papers (2022-12-29T03:31:45Z)
- Robust Hierarchical Patterns for identifying MDD patients: A Multisite Study [3.4561220135252264]
We look at hierarchical Sparse Connectivity Patterns (hSCPs) as biomarkers for major depressive disorder (MDD).
We propose a novel model based on hSCPs to predict MDD patients from functional connectivity matrices extracted from resting-state fMRI data.
Our results show the impact of diversity on prediction performance. Our model can reduce diversity and improve the predictive and generalizing capability of the components.
arXiv Detail & Related papers (2022-02-22T19:40:32Z)
- DIVERSE: bayesian Data IntegratiVE learning for precise drug ResponSE prediction [27.531532648298768]
DIVERSE is a framework to predict drug responses from data of cell lines, drugs, and gene interactions.
It integrates data sources systematically, in a step-wise manner, examining the importance of each added data set in turn.
It clearly outperformed five other methods including three state-of-the-art approaches.
arXiv Detail & Related papers (2021-03-31T12:40:00Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome [62.997667081978825]
We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately.
Our approach should be preferred if the goal is to select as many relevant predictors as possible.
Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
arXiv Detail & Related papers (2020-08-01T10:36:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.