Related papers: Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data

Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data

URL: http://arxiv.org/abs/2212.14165v1
Date: Thu, 29 Dec 2022 03:31:45 GMT
Title: Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data
Authors: Rupam Bhattacharyya and Nicholas Henderson and Veerabhadran Baladandayuthapani
Abstract summary: We propose a framework called Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data (fiBAG) fiBAG allows simultaneous identification of upstream functional evidence of proteogenomic biomarkers. We demonstrate the profitability of fiBAG via a pan-cancer analysis of 14 cancer types.
Score: 0.8029049649310213
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Rapid advancements in collection and dissemination of multi-platform molecular and genomics data has resulted in enormous opportunities to aggregate such data in order to understand, prevent, and treat human diseases. While significant improvements have been made in multi-omic data integration methods to discover biological markers and mechanisms underlying both prognosis and treatment, the precise cellular functions governing these complex mechanisms still need detailed and data-driven de-novo evaluations. We propose a framework called Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data (fiBAG), that allows simultaneous identification of upstream functional evidence of proteogenomic biomarkers and the incorporation of such knowledge in Bayesian variable selection models to improve signal detection. fiBAG employs a conflation of Gaussian process models to quantify (possibly non-linear) functional evidence via Bayes factors, which are then mapped to a novel calibrated spike-and-slab prior, thus guiding selection and providing functional relevance to the associations with patient outcomes. Using simulations, we illustrate how integrative methods with functional calibration have higher power to detect disease related markers than non-integrative approaches. We demonstrate the profitability of fiBAG via a pan-cancer analysis of 14 cancer types to identify and assess the cellular mechanisms of proteogenomic markers associated with cancer stemness and patient survival.

Related papers

Graph Kolmogorov-Arnold Networks for Multi-Cancer Classification and Biomarker Identification, An Interpretable Multi-Omics Approach [36.92842246372894]
Multi-Omics Graph Kolmogorov-Arnold Network (MOGKAN) is a deep learning framework that utilizes messenger-RNA, micro-RNA sequences, and DNA methylation samples. By integrating multi-omics data with graph-based deep learning, our proposed approach demonstrates robust predictive performance and interpretability.
arXiv Detail & Related papers (2025-03-29T02:14:05Z)
GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters. Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks. It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
LASSO-MOGAT: A Multi-Omics Graph Attention Framework for Cancer Classification [41.94295877935867]
This paper introduces LASSO-MOGAT, a graph-based deep learning framework that integrates messenger RNA, microRNA, and DNA methylation data to classify 31 cancer types.
arXiv Detail & Related papers (2024-08-30T16:26:04Z)
MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease. We introduce Mixture Modeling for Multiple Learning Instance (MMIL), an expectation method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z)
Unlocking the Power of Multi-institutional Data: Integrating and Harmonizing Genomic Data Across Institutions [3.5489676012585236]
We introduce the Bridge model to derive integrated features to preserve information beyond common genes. The model consistently excels in predicting patient survival across six cancer types in GENIE BPC data.
arXiv Detail & Related papers (2024-01-30T23:25:05Z)
Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells. During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation. This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
Causal machine learning for single-cell genomics [94.28105176231739]
We discuss the application of machine learning techniques to single-cell genomics and their challenges. We first present the model that underlies most of current causal approaches to single-cell biology. We then identify open problems in the application of causal approaches to single-cell data.
arXiv Detail & Related papers (2023-10-23T13:35:24Z)
Optimal transport for automatic alignment of untargeted metabolomic data [8.692678207022084]
We introduce GromovMatcher, a flexible and user-friendly algorithm that automatically combines LC-MS datasets using optimal transport. By capitalizing on feature intensity correlation structures, GromovMatcher delivers superior alignment accuracy and robustness. We show how GromovMatcher facilitates the search for biomarkers associated with lifestyle risk factors linked to several cancer types.
arXiv Detail & Related papers (2023-06-05T20:08:19Z)
Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation. GradABM can quickly simulate million-size populations in few seconds on commodity hardware, integrate with deep neural networks and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
Data-Driven Logistic Regression Ensembles With Applications in Genomics [0.0]
We propose a new approach for dealing with high-dimensional binary classification problems that combines ideas from regularization and ensembling. We demonstrate the good performance of our method in terms of prediction accuracy and identification of key biomarkers using several medical datasets involving common diseases such as cancer, multiple sclerosis and psoriasis.
arXiv Detail & Related papers (2021-02-17T05:57:26Z)
Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach. We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients. We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks. Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
Interpretable multimodal fusion networks reveal mechanisms of brain cognition [26.954460880062506]
We develop an interpretable multimodal fusion model, gCAM-CCL, which can perform automated diagnosis and result interpretation simultaneously. We validate the gCAM-CCL model on a brain imaging-genetic study, and show gCAM-CCL's performed well for both classification and mechanism analysis.
arXiv Detail & Related papers (2020-06-16T18:52:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.