Bayesian Reconstruction and Differential Testing of Excised mRNA
- URL: http://arxiv.org/abs/2211.07105v1
- Date: Mon, 14 Nov 2022 04:46:33 GMT
- Title: Bayesian Reconstruction and Differential Testing of Excised mRNA
- Authors: Marjan Hosseini, Devin McConnell, Derek Aguiar
- Abstract summary: We develop the first probabilistic model that reconciles the transcript and local splicing perspectives.
We present a novel hierarchical Bayesian admixture model for the Reconstruction of Excised mRNA (BREM)
BREM interpolates between local splicing events and full-length transcripts and thus focuses only on SMEs that have high posterior probability.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Characterizing the differential excision of mRNA is critical for
understanding the functional complexity of a cell or tissue, from normal
developmental processes to disease pathogenesis. Most transcript reconstruction
methods infer full-length transcripts from high-throughput sequencing data.
However, this is a challenging task due to incomplete annotations and the
differential expression of transcripts across cell-types, tissues, and
experimental conditions. Several recent methods circumvent these difficulties
by considering local splicing events, but these methods lose transcript-level
splicing information and may conflate transcripts. We develop the first
probabilistic model that reconciles the transcript and local splicing
perspectives. First, we formalize the sequence of mRNA excisions (SME)
reconstruction problem, which aims to assemble variable-length sequences of
mRNA excisions from RNA-sequencing data. We then present a novel hierarchical
Bayesian admixture model for the Reconstruction of Excised mRNA (BREM). BREM
interpolates between local splicing events and full-length transcripts and thus
focuses only on SMEs that have high posterior probability. We develop posterior
inference algorithms based on Gibbs sampling and local search of independent
sets and characterize differential SME usage using generalized linear models
based on converged BREM model parameters. We show that BREM achieves higher F1
score for reconstruction tasks and improved accuracy and sensitivity in
differential splicing when compared with four state-of-the-art transcript and
local splicing methods on simulated data. Lastly, we evaluate BREM on both bulk
and scRNA sequencing data based on transcript reconstruction, novelty of
transcripts produced, model sensitivity to hyperparameters, and a functional
analysis of differentially expressed SMEs, demonstrating that BREM captures
relevant biological signal.
Related papers
- scMamba: A Pre-Trained Model for Single-Nucleus RNA Sequencing Analysis in Neurodegenerative Disorders [43.24785083027205]
scMamba is a pre-trained model designed to improve the quality and utility of snRNA-seq analysis.
Inspired by the recent Mamba model, scMamba introduces a novel architecture that incorporates a linear adapter layer, gene embeddings, and bidirectional Mamba blocks.
We demonstrate that scMamba outperforms benchmark methods in various downstream tasks, including cell type annotation, doublet detection, imputation, and the identification of differentially expressed genes.
arXiv Detail & Related papers (2025-02-12T11:48:22Z) - scBIT: Integrating Single-cell Transcriptomic Data into fMRI-based Prediction for Alzheimer's Disease Diagnosis [24.268703526039367]
scBIT is a novel method for enhancing Alzheimer's disease (AD) prediction by combining fMRI with single-nucleus RNA (snRNA)
It employs a sampling strategy to segment snRNA data into cell-type-specific gene networks and utilizes a self-explainable graph neural network to extract critical subgraphs.
Extensive experiments validate scBIT's effectiveness in revealing intricate brain region-gene associations.
arXiv Detail & Related papers (2025-02-04T18:37:46Z) - ContextMRI: Enhancing Compressed Sensing MRI through Metadata Conditioning [51.26601171361753]
We propose ContextMRI, a text-conditioned diffusion model for MRI that integrates granular metadata into the reconstruction process.
We show that increasing the fidelity of metadata, ranging from slice location and contrast to patient age, sex, and pathology, systematically boosts reconstruction performance.
arXiv Detail & Related papers (2025-01-08T05:15:43Z) - Diffusion Model with Representation Alignment for Protein Inverse Folding [53.139837825588614]
Protein inverse folding is a fundamental problem in bioinformatics, aiming to recover the amino acid sequences from a given protein backbone structure.
We propose a novel method that leverages diffusion models with representation alignment (DMRA)
In experiments, we conduct extensive evaluations on the CATH4.2 dataset to demonstrate that DMRA outperforms leading methods.
arXiv Detail & Related papers (2024-12-12T15:47:59Z) - Comprehensive benchmarking of large language models for RNA secondary structure prediction [0.0]
RNA-LLM uses large datasets of RNA sequences to learn, in a self-supervised way, how to represent each RNA base with a semantically rich numerical vector.
Among them, predicting the secondary structure is a fundamental task for uncovering RNA functional mechanisms.
We present a comprehensive experimental analysis of several pre-trained RNA-LLM, comparing them for the RNA secondary structure prediction task in a unified deep learning framework.
arXiv Detail & Related papers (2024-10-21T17:12:06Z) - Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z) - Multi-Source and Test-Time Domain Adaptation on Multivariate Signals using Spatio-Temporal Monge Alignment [59.75420353684495]
Machine learning applications on signals such as computer vision or biomedical data often face challenges due to the variability that exists across hardware devices or session recordings.
In this work, we propose Spatio-Temporal Monge Alignment (STMA) to mitigate these variabilities.
We show that STMA leads to significant and consistent performance gains between datasets acquired with very different settings.
arXiv Detail & Related papers (2024-07-19T13:33:38Z) - Leveraging the Mahalanobis Distance to enhance Unsupervised Brain MRI Anomaly Detection [35.46541584018842]
Unsupervised Anomaly Detection (UAD) methods rely on healthy data distributions to identify anomalies as outliers.
In brain MRI, a common approach is reconstruction-based UAD, where generative models reconstruct healthy brain MRIs, and anomalies are detected as deviations between input and reconstruction.
We construct multiple reconstructions with probabilistic diffusion models. We then analyze the resulting distribution of these reconstructions using the Mahalanobis distance to identify anomalies as outliers.
arXiv Detail & Related papers (2024-07-17T11:02:31Z) - Semantically Rich Local Dataset Generation for Explainable AI in Genomics [0.716879432974126]
Black box deep learning models trained on genomic sequences excel at predicting the outcomes of different gene regulatory mechanisms.
We propose using Genetic Programming to generate datasets by evolving perturbations in sequences that contribute to their semantic diversity.
arXiv Detail & Related papers (2024-07-03T10:31:30Z) - scRDiT: Generating single-cell RNA-seq data by diffusion transformers and accelerating sampling [9.013834280011293]
Single-cell RNA sequencing (scRNA-seq) is a groundbreaking technology extensively utilized in biological research.
Our study introduces a generative approach termed scRNA-seq Diffusion Transformer (scRDiT)
This method generates virtual scRNA-seq data by leveraging a real dataset.
arXiv Detail & Related papers (2024-04-09T09:25:16Z) - Guided Reconstruction with Conditioned Diffusion Models for Unsupervised Anomaly Detection in Brain MRIs [35.46541584018842]
Unsupervised Anomaly Detection (UAD) aims to identify any anomaly as an outlier from a healthy training distribution.
generative models are used to learn the reconstruction of healthy brain anatomy for a given input image.
We propose conditioning the denoising process of diffusion models with additional information derived from a latent representation of the input image.
arXiv Detail & Related papers (2023-12-07T11:03:42Z) - SMRD: SURE-based Robust MRI Reconstruction with Diffusion Models [76.43625653814911]
Diffusion models have gained popularity for accelerated MRI reconstruction due to their high sample quality.
They can effectively serve as rich data priors while incorporating the forward model flexibly at inference time.
We introduce SURE-based MRI Reconstruction with Diffusion models (SMRD) to enhance robustness during testing.
arXiv Detail & Related papers (2023-10-03T05:05:35Z) - DreaMR: Diffusion-driven Counterfactual Explanation for Functional MRI [0.0]
We introduce the first diffusion-driven counterfactual method, DreaMR, to enable fMRI interpretation with high specificity, plausibility and fidelity.
DreaMR performs diffusion-based resampling of an input fMRI sample to alter the decision of a downstream classifier, and then computes the minimal difference between the original and counterfactual samples for explanation.
Comprehensive experiments on neuroimaging datasets demonstrate the superior specificity, fidelity and efficiency of DreaMR in sample generation over state-of-the-art counterfactual methods for fMRI interpretation.
arXiv Detail & Related papers (2023-07-18T18:46:07Z) - From Cloze to Comprehension: Retrofitting Pre-trained Masked Language
Model to Pre-trained Machine Reader [130.45769668885487]
Pre-trained Machine Reader (PMR) is a novel method for retrofitting masked language models (MLMs) to pre-trained machine reading comprehension (MRC) models without acquiring labeled data.
To build the proposed PMR, we constructed a large volume of general-purpose and high-quality MRC-style training data.
PMR has the potential to serve as a unified model for tackling various extraction and classification tasks in the MRC formulation.
arXiv Detail & Related papers (2022-12-09T10:21:56Z) - Reference-based Magnetic Resonance Image Reconstruction Using Texture
Transforme [86.6394254676369]
We propose a novel Texture Transformer Module (TTM) for accelerated MRI reconstruction.
We formulate the under-sampled data and reference data as queries and keys in a transformer.
The proposed TTM can be stacked on prior MRI reconstruction approaches to further improve their performance.
arXiv Detail & Related papers (2021-11-18T03:06:25Z) - Pre-training Co-evolutionary Protein Representation via A Pairwise
Masked Language Model [93.9943278892735]
Key problem in protein sequence representation learning is to capture the co-evolutionary information reflected by the inter-residue co-variation in the sequences.
We propose a novel method to capture this information directly by pre-training via a dedicated language model, i.e., Pairwise Masked Language Model (PMLM)
Our result shows that the proposed method can effectively capture the interresidue correlations and improves the performance of contact prediction by up to 9% compared to the baseline.
arXiv Detail & Related papers (2021-10-29T04:01:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.