Scalable Deep Learning for RNA Secondary Structure Prediction
- URL: http://arxiv.org/abs/2307.10073v1
- Date: Fri, 14 Jul 2023 12:54:56 GMT
- Title: Scalable Deep Learning for RNA Secondary Structure Prediction
- Authors: Jörg K.H. Franke, Frederic Runge, Frank Hutter
- Abstract summary: We present the RNAformer, a lean deep learning model using axial attention and recycling in the latent space.
Our approach achieves state-of-the-art performance on the popular TS0 benchmark dataset.
We show experimentally that the RNAformer can learn a biophysical model of the RNA folding process.
- Score: 38.46798525594529
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The field of RNA secondary structure prediction has made significant progress
with the adoption of deep learning techniques. In this work, we present the
RNAformer, a lean deep learning model using axial attention and recycling in
the latent space. We gain performance improvements by designing the
architecture for modeling the adjacency matrix directly in the latent space and
by scaling the size of the model. Our approach achieves state-of-the-art
performance on the popular TS0 benchmark dataset and even outperforms methods
that use external information. Further, we show experimentally that the
RNAformer can learn a biophysical model of the RNA folding process.
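As an illustration of what "modeling the adjacency matrix" refers to, the following minimal sketch builds the binary base-pair adjacency matrix of a secondary structure. It assumes the structure is given in dot-bracket notation without pseudoknots; that input format and the function name `dot_bracket_to_adjacency` are assumptions made for this example and are not taken from the paper. The sketch shows only the prediction target, not the RNAformer model.

```python
def dot_bracket_to_adjacency(structure: str) -> list[list[int]]:
    """Return the symmetric 0/1 base-pair matrix for a dot-bracket string.

    Hypothetical helper for illustration: '(' and ')' mark paired
    positions, '.' marks an unpaired position. Pseudoknot brackets
    such as '[' and ']' are not handled.
    """
    n = len(structure)
    adjacency = [[0] * n for _ in range(n)]
    open_positions = []  # stack of indices of unmatched '('

    for i, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {i}")
            j = open_positions.pop()
            # A base pair is undirected, so set both entries.
            adjacency[i][j] = 1
            adjacency[j][i] = 1
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {i}")

    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return adjacency
```

For the six-nucleotide structure `((..))`, the matrix has ones at positions (0, 5) and (1, 4) and their mirrored entries, and every row contains at most one 1 because each nucleotide pairs with at most one partner.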
Related papers
- Comprehensive benchmarking of large language models for RNA secondary structure prediction [0.0]
RNA-LLM uses large datasets of RNA sequences to learn, in a self-supervised way, how to represent each RNA base with a semantically rich numerical vector.
Among them, predicting the secondary structure is a fundamental task for uncovering RNA functional mechanisms.
We present a comprehensive experimental analysis of several pre-trained RNA-LLM, comparing them for the RNA secondary structure prediction task in a unified deep learning framework.
arXiv Detail & Related papers (2024-10-21T17:12:06Z) - Beyond Sequence: Impact of Geometric Context for RNA Property Prediction [6.559586725997741]
RNA structures can be represented as 1D sequences, 2D topological graphs, or 3D all-atom models.
Existing works predominantly focus on 1D sequence-based models, which overlook the geometric context provided by 2D and 3D geometries.
This study presents the first systematic evaluation of incorporating explicit 2D and 3D geometric information into RNA property prediction.
arXiv Detail & Related papers (2024-10-15T17:09:34Z) - BEACON: Benchmark for Comprehensive RNA Tasks and Language Models [60.02663015002029]
We introduce the first comprehensive RNA benchmark BEACON (BEnchmArk for COmprehensive RNA Task and Language Models).
First, BEACON comprises 13 distinct tasks derived from extensive previous work covering structural analysis, functional studies, and engineering applications.
Second, we examine a range of models, including traditional approaches like CNNs, as well as advanced RNA foundation models based on language models, offering valuable insights into the task-specific performances of these models.
Third, we investigate the vital RNA language model components
arXiv Detail & Related papers (2024-06-14T19:39:19Z) - Splicing Up Your Predictions with RNA Contrastive Learning [4.35360799431127]
We extend contrastive learning techniques to genomic data by utilizing similarities between functional sequences generated through alternative splicing and gene duplication.
We validate their utility on downstream tasks such as RNA half-life and mean ribosome load prediction.
Our exploration of the learned latent space reveals that our contrastive objective yields semantically meaningful representations.
arXiv Detail & Related papers (2023-10-12T21:51:25Z) - RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design [65.41144149958208]
This study aims to systematically construct a data-driven RNA design pipeline.
We crafted a benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure.
We incorporated extracted secondary structures with base pairs as prior knowledge to facilitate the RNA design process.
arXiv Detail & Related papers (2023-01-25T17:19:49Z) - E2Efold-3D: End-to-End Deep Learning Method for accurate de novo RNA 3D Structure Prediction [46.38735421190187]
We develop the first end-to-end deep learning approach, E2Efold-3D, to accurately perform de novo RNA structure prediction.
Several novel components are proposed to overcome the data scarcity, such as a fully-differentiable end-to-end pipeline, secondary structure-assisted self-distillation, and parameter-efficient backbone formulation.
arXiv Detail & Related papers (2022-07-04T17:15:35Z) - Improving RNA Secondary Structure Design using Deep Reinforcement Learning [69.63971634605797]
We propose a new benchmark of applying reinforcement learning to RNA sequence design, in which the objective function is defined to be the free energy in the sequence's secondary structure.
We show results of the ablation analysis that we do for these algorithms, as well as graphs indicating the algorithm's performance across batches.
arXiv Detail & Related papers (2021-11-05T02:54:06Z) - Review of Machine-Learning Methods for RNA Secondary Structure Prediction [21.3539253580504]
We provide a comprehensive overview of RNA secondary structure prediction methods based on machine-learning technologies.
The current pending issues in the field of RNA secondary structure prediction and future trends are also discussed.
arXiv Detail & Related papers (2020-09-01T03:17:15Z) - RNA Secondary Structure Prediction By Learning Unrolled Algorithms [70.09461537906319]
In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction.
The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints.
With comprehensive experiments on benchmark datasets, we demonstrate the superior performance of E2Efold.
arXiv Detail & Related papers (2020-02-13T23:21:25Z)