Description Generation using Variational Auto-Encoders for precursor
microRNA
- URL: http://arxiv.org/abs/2311.17970v1
- Date: Wed, 29 Nov 2023 15:41:45 GMT
- Title: Description Generation using Variational Auto-Encoders for precursor
microRNA
- Authors: Marko Petković, Vlado Menkovski
- Abstract summary: We propose a novel framework, which makes use of generative modeling through Variational Auto-Encoders to uncover latent factors of pre-miRNA.
Applying the framework to classification, we obtain a high reconstruction and classification performance, while also developing an accurate description.
- Score: 5.6710852973206105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Micro RNAs (miRNA) are a type of non-coding RNA, which are involved in gene
regulation and can be associated with diseases such as cancer, cardiovascular
and neurological diseases. As such, identifying the entire genome of miRNA can
be of great relevance. Since experimental methods for novel precursor miRNA
(pre-miRNA) detection are complex and expensive, computational detection using
ML could be useful. Existing ML methods are often complex black boxes, which do
not create an interpretable structural description of pre-miRNA. In this paper,
we propose a novel framework, which makes use of generative modeling through
Variational Auto-Encoders to uncover the generative factors of pre-miRNA. After
training the VAE, the pre-miRNA description is developed using a decision tree
on the lower dimensional latent space. Applying the framework to miRNA
classification, we obtain a high reconstruction and classification performance,
while also developing an accurate miRNA description.
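The abstract's second stage, a decision tree fitted on the low-dimensional latent space, can be illustrated with a minimal sketch. It assumes a trained encoder has already mapped each sequence to a latent code; the 2-D codes, labels, and the helper names `fit_stump`, `predict`, and `describe` below are hypothetical and are not taken from the paper. The tree is reduced to a single split (a stump) chosen by Gini impurity, which is a simplification for illustration rather than the authors' configuration.

```python
# Toy sketch: a depth-1 decision tree (a "stump") fitted on hypothetical
# 2-D latent codes. All values here are invented for illustration.

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_stump(codes, labels):
    """Return (dimension, threshold, left_class, right_class) that
    minimises the weighted Gini impurity of the two sides of the split."""
    best = None
    n = len(codes)
    for dim in range(len(codes[0])):
        values = sorted({z[dim] for z in codes})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for z, y in zip(codes, labels) if z[dim] <= thr]
            right = [y for z, y in zip(codes, labels) if z[dim] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                # Majority class on each side of the split.
                left_class = int(sum(left) * 2 >= len(left))
                right_class = int(sum(right) * 2 >= len(right))
                best = (score, dim, thr, left_class, right_class)
    return best[1:]


def predict(stump, z):
    """Classify one latent code with the fitted stump."""
    dim, thr, left_class, right_class = stump
    return left_class if z[dim] <= thr else right_class


def describe(stump):
    """Render the split as a human-readable rule over latent dimensions."""
    dim, thr, left_class, right_class = stump
    return (f"if z[{dim}] <= {thr:.2f}: class {left_class}; "
            f"else: class {right_class}")


# Hypothetical latent codes, as an encoder might produce them.
# Labels: 1 = pre-miRNA, 0 = not pre-miRNA.
latent_codes = [(-1.2, 0.3), (-0.8, -0.5), (-0.5, 1.1),
                (0.7, 0.2), (1.0, -0.9), (1.4, 0.6)]
labels = [0, 0, 0, 1, 1, 1]

stump = fit_stump(latent_codes, labels)
print(describe(stump))
```

The printed rule refers to a latent dimension, not to raw sequence features, so its usefulness as a description depends on how interpretable the learned latent factors are.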
Related papers
- BEACON: Benchmark for Comprehensive RNA Tasks and Language Models [60.02663015002029]
We introduce the first comprehensive RNA benchmark BEACON (BEnchmArk for COmprehensive RNA Task and Language Models).
First, BEACON comprises 13 distinct tasks derived from extensive previous work covering structural analysis, functional studies, and engineering applications.
Second, we examine a range of models, including traditional approaches like CNNs, as well as advanced RNA foundation models based on language models, offering valuable insights into the task-specific performances of these models.
Third, we investigate the vital RNA language model components
arXiv Detail & Related papers (2024-06-14T19:39:19Z)
- RNAFlow: RNA Structure & Sequence Design via Inverse Folding-Based Flow Matching [7.600990806121113]
RNAFlow is a flow matching model for protein-conditioned RNA sequence-structure design.
Its denoising network integrates an RNA inverse folding model and a pre-trained RosettaFold2NA network for generation of RNA sequences and structures.
arXiv Detail & Related papers (2024-05-29T05:10:25Z)
- RiNALMo: General-Purpose RNA Language Models Can Generalize Well on
Structure Prediction Tasks [1.2466379414976048]
We introduce RiboNucleic Acid Language Model (RiNALMo) to help unveil the hidden code of RNA.
RiNALMo is the largest RNA language model to date with 650 million parameters pre-trained on 36 million non-coding RNA sequences.
arXiv Detail & Related papers (2024-02-29T14:50:58Z)
- scHyena: Foundation Model for Full-Length Single-Cell RNA-Seq Analysis
in Brain [46.39828178736219]
We introduce scHyena, a foundation model designed to address these challenges and enhance the accuracy of scRNA-seq analysis in the brain.
scHyena is equipped with a linear adaptor layer, the positional encoding via gene-embedding, and a bidirectional Hyena operator.
This enables us to process full-length scRNA-seq data without losing any information from the raw data.
arXiv Detail & Related papers (2023-10-04T10:30:08Z)
- Knowledge from Large-Scale Protein Contact Prediction Models Can Be
Transferred to the Data-Scarce RNA Contact Prediction Task [40.051834115537474]
We find that a protein-coevolution Transformer-based deep neural network can be transferred to the RNA contact prediction task.
Experiments confirm that RNA contact prediction through transfer learning is greatly improved.
Our findings indicate that the learned structural patterns of proteins can be transferred to RNAs, opening up potential new avenues for research.
arXiv Detail & Related papers (2023-02-13T06:00:56Z)
- RDesign: Hierarchical Data-efficient Representation Learning for
Tertiary Structure-based RNA Design [65.41144149958208]
This study aims to systematically construct a data-driven RNA design pipeline.
We crafted a benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure.
We incorporated extracted secondary structures with base pairs as prior knowledge to facilitate the RNA design process.
arXiv Detail & Related papers (2023-01-25T17:19:49Z)
- E2Efold-3D: End-to-End Deep Learning Method for accurate de novo RNA 3D
Structure Prediction [46.38735421190187]
We develop the first end-to-end deep learning approach, E2Efold-3D, to accurately perform de novo RNA structure prediction.
Several novel components are proposed to overcome the data scarcity, such as a fully-differentiable end-to-end pipeline, secondary structure-assisted self-distillation, and parameter-efficient backbone formulation.
arXiv Detail & Related papers (2022-07-04T17:15:35Z)
- Machine learning for plant microRNA prediction: A systematic review [0.0]
MicroRNAs (miRNAs) are endogenous small non-coding RNAs that play an important role in gene regulation.
Computational and machine learning-based approaches have been adopted to predict microRNAs.
This systematic review focuses on the machine learning methods developed for identification in plants.
arXiv Detail & Related papers (2021-06-29T08:22:57Z)
- Visualizing hierarchies in scRNA-seq data using a density tree-biased
autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
- Classification of Long Noncoding RNA Elements Using Deep Convolutional
Neural Networks and Siamese Networks [17.8181080354116]
This thesis proposes a new method employing deep convolutional neural networks (CNNs) to classify ncRNA sequences.
As a result, classifying RNA sequences is converted to an image classification problem that can be efficiently solved by CNN-based classification models.
arXiv Detail & Related papers (2021-02-10T17:26:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.