RiboGen: RNA Sequence and Structure Co-Generation with Equivariant MultiFlow
- URL: http://arxiv.org/abs/2503.02058v4
- Date: Fri, 18 Apr 2025 16:16:48 GMT
- Title: RiboGen: RNA Sequence and Structure Co-Generation with Equivariant MultiFlow
- Authors: Dana Rubin, Allan dos Santos Costa, Manvitha Ponnapati, Joseph Jacobson
- Abstract summary: RiboGen is the first deep learning model to simultaneously generate RNA sequence and all-atom 3D structure. Our experiments show that RiboGen can efficiently generate chemically plausible and self-consistent RNA samples.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ribonucleic acid (RNA) plays fundamental roles in biological systems, from carrying genetic information to performing enzymatic function. Understanding and designing RNA can enable novel therapeutic applications and biotechnological innovation. To enhance RNA design, in this paper we introduce RiboGen, the first deep learning model to simultaneously generate RNA sequence and all-atom 3D structure. RiboGen combines standard Flow Matching with Discrete Flow Matching in a multimodal data representation. RiboGen is based on Euclidean Equivariant neural networks for efficiently processing and learning three-dimensional geometry. Our experiments show that RiboGen can efficiently generate chemically plausible and self-consistent RNA samples, suggesting that co-generation of sequence and structure is a competitive approach for modeling RNA.
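As a rough illustration of how continuous Flow Matching and Discrete Flow Matching can be paired on multimodal data, the sketch below corrupts one toy (coordinates, sequence) example at a time t. It is not RiboGen's implementation: the Gaussian prior, the straight-line interpolant, the masking-based discrete path, the `MASK` index, and the function name `sample_training_pair` are all assumptions chosen for illustration.

```python
import numpy as np

MASK = 4  # hypothetical mask-token index; nucleotides A, C, G, U are 0..3


def sample_training_pair(coords, tokens, t, rng):
    """Corrupt one (structure, sequence) example at time t in [0, 1]."""
    # Continuous modality: straight-line path from Gaussian noise to the data.
    noise = rng.standard_normal(coords.shape)
    coords_t = (1.0 - t) * noise + t * coords
    target_velocity = coords - noise  # time derivative of coords_t on this path

    # Discrete modality: each token is revealed with probability t, else masked.
    revealed = rng.random(tokens.shape) < t
    tokens_t = np.where(revealed, tokens, MASK)
    return coords_t, target_velocity, tokens_t


rng = np.random.default_rng(0)
coords = rng.standard_normal((6, 3))   # toy coordinates for 6 points
tokens = np.array([0, 1, 2, 3, 2, 1])  # toy sequence "ACGUGC"
coords_t, velocity, tokens_t = sample_training_pair(coords, tokens, 0.5, rng)
```

In a setup like this, a network would typically be trained to regress `target_velocity` from `coords_t` and to predict the original tokens at the masked positions of `tokens_t`; at t = 1 both modalities equal the clean data, and at t = 0 the coordinates are pure noise and every token is masked.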
Related papers
- BAnG: Bidirectional Anchored Generation for Conditional RNA Design [15.92155083519678]
RNA-BAnG is a deep learning-based model designed to generate RNA sequences for protein interactions without these requirements.
We first validate our method on generic synthetic tasks involving similar localized motifs to those appearing in RNAs.
We then evaluate our model on biological sequences, showing its effectiveness for conditional RNA sequence design given a binding protein.
arXiv Detail & Related papers (2025-02-28T17:51:00Z) - Life-Code: Central Dogma Modeling with Multi-Omics Sequence Unification [53.488387420073536]
Life-Code is a comprehensive framework that spans different biological functions.
Life-Code achieves state-of-the-art performance on various tasks across three omics.
arXiv Detail & Related papers (2025-02-11T06:53:59Z) - RNA-GPT: Multimodal Generative System for RNA Sequence Understanding [6.611255836269348]
RNAs are essential molecules that carry genetic information vital for life.
Despite this importance, RNA research is often hindered by the vast literature available on the topic.
We introduce RNA-GPT, a multi-modal RNA chat model designed to simplify RNA discovery.
arXiv Detail & Related papers (2024-10-29T06:19:56Z) - RNACG: A Universal RNA Sequence Conditional Generation model based on Flow-Matching [0.0]
We develop a universal RNA sequence generation model based on flow matching, namely RNACG.
RNACG can accommodate various conditional inputs and is portable, enabling users to customize the encoding network for conditional inputs.
RNACG exhibits extensive applicability in sequence generation and property prediction tasks.
arXiv Detail & Related papers (2024-07-29T09:46:46Z) - BEACON: Benchmark for Comprehensive RNA Tasks and Language Models [60.02663015002029]
We introduce the first comprehensive RNA benchmark BEACON (BEnchmArk for COmprehensive RNA Task and Language Models). First, BEACON comprises 13 distinct tasks derived from extensive previous work covering structural analysis, functional studies, and engineering applications. Second, we examine a range of models, including traditional approaches like CNNs, as well as advanced RNA foundation models based on language models, offering valuable insights into the task-specific performances of these models. Third, we investigate the vital RNA language model components
arXiv Detail & Related papers (2024-06-14T19:39:19Z) - RNAFlow: RNA Structure & Sequence Design via Inverse Folding-Based Flow Matching [7.600990806121113]
RNAFlow is a flow matching model for protein-conditioned RNA sequence-structure design.
Its denoising network integrates an RNA inverse folding model and a pre-trained RosettaFold2NA network for generation of RNA sequences and structures.
arXiv Detail & Related papers (2024-05-29T05:10:25Z) - scHyena: Foundation Model for Full-Length Single-Cell RNA-Seq Analysis in Brain [46.39828178736219]
We introduce scHyena, a foundation model designed to address these challenges and enhance the accuracy of scRNA-seq analysis in the brain.
scHyena is equipped with a linear adaptor layer, the positional encoding via gene-embedding, and a bidirectional Hyena operator.
This enables us to process full-length scRNA-seq data without losing any information from the raw data.
arXiv Detail & Related papers (2023-10-04T10:30:08Z) - gRNAde: Geometric Deep Learning for 3D RNA inverse design [14.729049204432027]
gRNAde is a geometric RNA design pipeline operating on 3D RNA backbones. It generates sequences that explicitly account for structure and dynamics.
arXiv Detail & Related papers (2023-05-24T05:46:56Z) - RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design [65.41144149958208]
This study aims to systematically construct a data-driven RNA design pipeline.
We crafted a benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure.
We incorporated extracted secondary structures with base pairs as prior knowledge to facilitate the RNA design process.
arXiv Detail & Related papers (2023-01-25T17:19:49Z) - Accurate RNA 3D structure prediction using a language model-based deep learning approach [50.193512039121984]
RhoFold+ is an RNA language model-based deep learning method that accurately predicts 3D structures of single-chain RNAs from sequences. RhoFold+ offers a fully automated end-to-end pipeline for RNA 3D structure prediction.
arXiv Detail & Related papers (2022-07-04T17:15:35Z) - Improving RNA Secondary Structure Design using Deep Reinforcement Learning [69.63971634605797]
We propose a new benchmark for applying reinforcement learning to RNA sequence design, in which the objective function is defined to be the free energy of the sequence's secondary structure.
We show results of the ablation analysis performed for these algorithms, as well as graphs indicating each algorithm's performance across batches.
arXiv Detail & Related papers (2021-11-05T02:54:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.