Bayesian Reconstruction and Differential Testing of Excised mRNA
- URL: http://arxiv.org/abs/2211.07105v1
- Date: Mon, 14 Nov 2022 04:46:33 GMT
- Title: Bayesian Reconstruction and Differential Testing of Excised mRNA
- Authors: Marjan Hosseini, Devin McConnell, Derek Aguiar
- Abstract summary: We develop the first probabilistic model that reconciles the transcript and local splicing perspectives.
We present a novel hierarchical Bayesian admixture model for the Reconstruction of Excised mRNA (BREM).
BREM interpolates between local splicing events and full-length transcripts and thus focuses only on SMEs that have high posterior probability.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Characterizing the differential excision of mRNA is critical for
understanding the functional complexity of a cell or tissue, from normal
developmental processes to disease pathogenesis. Most transcript reconstruction
methods infer full-length transcripts from high-throughput sequencing data.
However, this is a challenging task due to incomplete annotations and the
differential expression of transcripts across cell-types, tissues, and
experimental conditions. Several recent methods circumvent these difficulties
by considering local splicing events, but these methods lose transcript-level
splicing information and may conflate transcripts. We develop the first
probabilistic model that reconciles the transcript and local splicing
perspectives. First, we formalize the sequence of mRNA excisions (SME)
reconstruction problem, which aims to assemble variable-length sequences of
mRNA excisions from RNA-sequencing data. We then present a novel hierarchical
Bayesian admixture model for the Reconstruction of Excised mRNA (BREM). BREM
interpolates between local splicing events and full-length transcripts and thus
focuses only on SMEs that have high posterior probability. We develop posterior
inference algorithms based on Gibbs sampling and local search of independent
sets and characterize differential SME usage using generalized linear models
based on converged BREM model parameters. We show that BREM achieves higher F1
score for reconstruction tasks and improved accuracy and sensitivity in
differential splicing when compared with four state-of-the-art transcript and
local splicing methods on simulated data. Lastly, we evaluate BREM on both bulk
and scRNA sequencing data based on transcript reconstruction, novelty of
transcripts produced, model sensitivity to hyperparameters, and a functional
analysis of differentially expressed SMEs, demonstrating that BREM captures
relevant biological signal.
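The abstract describes posterior inference via Gibbs sampling over latent assignments of sequencing reads. As a rough illustration only (not the authors' implementation), the sketch below runs a collapsed Gibbs sampler for a toy mixture over observed junction identifiers, a drastic simplification of BREM's hierarchical admixture; the function name, data representation, and hyperparameters are all hypothetical.

```python
import random
from collections import Counter

def gibbs_admixture(reads, n_components, alpha=1.0, beta=1.0, iters=200, seed=0):
    """Collapsed Gibbs sampling for a toy mixture model: each read
    (an observed junction id) is assigned to one latent component,
    loosely analogous to an SME. Hypothetical simplification of BREM."""
    rng = random.Random(seed)
    vocab = sorted(set(reads))
    V = len(vocab)
    # Random initial component assignment for each read.
    z = [rng.randrange(n_components) for _ in reads]
    comp_counts = Counter(z)              # reads currently in each component
    word_counts = Counter(zip(z, reads))  # (component, junction) co-occurrence
    for _ in range(iters):
        for i, w in enumerate(reads):
            # Remove read i from the counts, then resample its component.
            k_old = z[i]
            comp_counts[k_old] -= 1
            word_counts[(k_old, w)] -= 1
            weights = [
                (comp_counts[k] + alpha)
                * (word_counts[(k, w)] + beta) / (comp_counts[k] + V * beta)
                for k in range(n_components)
            ]
            k_new = rng.choices(range(n_components), weights=weights)[0]
            z[i] = k_new
            comp_counts[k_new] += 1
            word_counts[(k_new, w)] += 1
    # Posterior-mean component proportions with Dirichlet smoothing.
    total = sum(comp_counts.values())
    props = [(comp_counts[k] + alpha) / (total + n_components * alpha)
             for k in range(n_components)]
    return z, props
```

The converged assignment counts stand in for the posterior quantities that BREM's differential-usage GLMs would consume downstream.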
Related papers
- Multi-Source and Test-Time Domain Adaptation on Multivariate Signals using Spatio-Temporal Monge Alignment [59.75420353684495]
Machine learning applications on signals such as computer vision or biomedical data often face challenges due to the variability that exists across hardware devices or session recordings.
In this work, we propose Spatio-Temporal Monge Alignment (STMA) to mitigate these variabilities.
We show that STMA leads to significant and consistent performance gains between datasets acquired with very different settings.
arXiv Detail & Related papers (2024-07-19T13:33:38Z)
- Leveraging the Mahalanobis Distance to enhance Unsupervised Brain MRI Anomaly Detection [35.46541584018842]
Unsupervised Anomaly Detection (UAD) methods rely on healthy data distributions to identify anomalies as outliers.
In brain MRI, a common approach is reconstruction-based UAD, where generative models reconstruct healthy brain MRIs, and anomalies are detected as deviations between input and reconstruction.
We construct multiple reconstructions with probabilistic diffusion models. We then analyze the resulting distribution of these reconstructions using the Mahalanobis distance to identify anomalies as outliers.
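The scoring step this entry describes can be sketched directly: given several reconstructions of the same input, fit their mean and covariance and measure how far the input lies from that distribution. This is a minimal illustration of the general Mahalanobis-distance idea, not the paper's code; the function name and regularization term are assumptions.

```python
import numpy as np

def mahalanobis_anomaly_score(x, reconstructions, eps=1e-6):
    """Score an input vector x against N reconstructions (an N x D array).
    A larger distance from the reconstruction distribution suggests an
    anomaly. Hypothetical sketch, not the authors' implementation."""
    R = np.asarray(reconstructions, dtype=float)
    mu = R.mean(axis=0)
    # Regularize the covariance so it stays invertible for small N.
    cov = np.cov(R, rowvar=False) + eps * np.eye(R.shape[1])
    diff = np.asarray(x, dtype=float) - mu
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))
```

In the reconstruction-based UAD setting, each pixel or feature vector would be scored this way against its own set of diffusion-model reconstructions.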
arXiv Detail & Related papers (2024-07-17T11:02:31Z)
- Semantically Rich Local Dataset Generation for Explainable AI in Genomics [0.716879432974126]
Black box deep learning models trained on genomic sequences excel at predicting the outcomes of different gene regulatory mechanisms.
We propose using Genetic Programming to generate datasets by evolving perturbations in sequences that contribute to their semantic diversity.
arXiv Detail & Related papers (2024-07-03T10:31:30Z) - scRDiT: Generating single-cell RNA-seq data by diffusion transformers and accelerating sampling [9.013834280011293]
Single-cell RNA sequencing (scRNA-seq) is a groundbreaking technology extensively utilized in biological research.
Our study introduces a generative approach termed the scRNA-seq Diffusion Transformer (scRDiT).
This method generates virtual scRNA-seq data by leveraging a real dataset.
arXiv Detail & Related papers (2024-04-09T09:25:16Z) - SMRD: SURE-based Robust MRI Reconstruction with Diffusion Models [76.43625653814911]
Diffusion models have gained popularity for accelerated MRI reconstruction due to their high sample quality.
They can effectively serve as rich data priors while incorporating the forward model flexibly at inference time.
We introduce SURE-based MRI Reconstruction with Diffusion models (SMRD) to enhance robustness during testing.
arXiv Detail & Related papers (2023-10-03T05:05:35Z) - DreaMR: Diffusion-driven Counterfactual Explanation for Functional MRI [0.0]
We introduce the first diffusion-driven counterfactual method, DreaMR, to enable fMRI interpretation with high specificity, plausibility and fidelity.
DreaMR performs diffusion-based resampling of an input fMRI sample to alter the decision of a downstream classifier, and then computes the minimal difference between the original and counterfactual samples for explanation.
Comprehensive experiments on neuroimaging datasets demonstrate the superior specificity, fidelity and efficiency of DreaMR in sample generation over state-of-the-art counterfactual methods for fMRI interpretation.
arXiv Detail & Related papers (2023-07-18T18:46:07Z) - Incorporating Prior Knowledge in Deep Learning Models via Pathway
Activity Autoencoders [5.950889585409067]
We propose a novel prior-knowledge-based deep auto-encoding framework, PAAE, for RNA-seq data in cancer.
We show that, despite having access to a smaller set of features, our PAAE and PAVAE models achieve better out-of-set reconstruction results compared to common methodologies.
arXiv Detail & Related papers (2023-06-09T11:12:55Z) - From Cloze to Comprehension: Retrofitting Pre-trained Masked Language
Model to Pre-trained Machine Reader [130.45769668885487]
Pre-trained Machine Reader (PMR) is a novel method for retrofitting masked language models (MLMs) to pre-trained machine reading comprehension (MRC) models without acquiring labeled data.
To build the proposed PMR, we constructed a large volume of general-purpose and high-quality MRC-style training data.
PMR has the potential to serve as a unified model for tackling various extraction and classification tasks in the MRC formulation.
arXiv Detail & Related papers (2022-12-09T10:21:56Z) - Reference-based Magnetic Resonance Image Reconstruction Using Texture
Transforme [86.6394254676369]
We propose a novel Texture Transformer Module (TTM) for accelerated MRI reconstruction.
We formulate the under-sampled data and reference data as queries and keys in a transformer.
The proposed TTM can be stacked on prior MRI reconstruction approaches to further improve their performance.
arXiv Detail & Related papers (2021-11-18T03:06:25Z) - Pre-training Co-evolutionary Protein Representation via A Pairwise
Masked Language Model [93.9943278892735]
A key problem in protein sequence representation learning is to capture the co-evolutionary information reflected by the inter-residue co-variation in the sequences.
We propose a novel method to capture this information directly by pre-training via a dedicated language model, i.e., Pairwise Masked Language Model (PMLM)
Our result shows that the proposed method can effectively capture the inter-residue correlations and improves the performance of contact prediction by up to 9% compared to the baseline.
arXiv Detail & Related papers (2021-10-29T04:01:32Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.