Deep Learning Framework for RNA Inverse Folding with Geometric Structure Potentials
- URL: http://arxiv.org/abs/2601.00895v1
- Date: Wed, 31 Dec 2025 15:43:12 GMT
- Title: Deep Learning Framework for RNA Inverse Folding with Geometric Structure Potentials
- Authors: Annabelle Yao
- Abstract summary: I introduce a deep learning framework that integrates Geometric Vector Perceptron layers with a Transformer architecture to enable end-to-end RNA design. I construct a dataset consisting of experimentally solved RNA 3D structures, filtered and deduplicated from the BGSU RNA list, and evaluate performance using both sequence recovery rate and TM-score. My model achieves state-of-the-art performance, with recovery and TM-scores of 0.481 and 0.332, surpassing existing methods across diverse RNA families and length scales.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: RNA's diverse biological functions stem from its structural versatility, yet accurately predicting and designing RNA sequences given a 3D conformation (inverse folding) remains a challenge. Here, I introduce a deep learning framework that integrates Geometric Vector Perceptron (GVP) layers with a Transformer architecture to enable end-to-end RNA design. I construct a dataset consisting of experimentally solved RNA 3D structures, filtered and deduplicated from the BGSU RNA list, and evaluate performance using both sequence recovery rate and TM-score to assess sequence and structural fidelity, respectively. On standard benchmarks and RNA-Puzzles, my model achieves state-of-the-art performance, with recovery and TM-scores of 0.481 and 0.332, surpassing existing methods across diverse RNA families and length scales. Masked family-level validation using Rfam annotations confirms strong generalization beyond seen families. Furthermore, inverse-folded sequences, when refolded using AlphaFold3, closely resemble native structures, highlighting the critical role of geometric features captured by GVP layers in enhancing Transformer-based RNA design.
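The abstract reports sequence recovery rate as its measure of sequence fidelity. As a rough illustration, the sketch below assumes the common definition: the fraction of positions at which a designed sequence matches the native one. The paper's exact evaluation protocol is not given here, and the function name and example sequences are hypothetical.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed nucleotide equals the native one.

    Assumption: both sequences are already aligned position by position
    (same length), which is the case when a sequence is designed for a
    fixed backbone.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    # Compare case-insensitively, one nucleotide at a time.
    matches = sum(a == b for a, b in zip(native.upper(), designed.upper()))
    return matches / len(native)


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the paper.
    native = "GGCUAGCC"
    designed = "GGCAAGCU"
    print(f"recovery = {sequence_recovery(native, designed):.3f}")
```

A dataset-level score would typically average this value over all designed sequences; whether the paper averages per sequence or per nucleotide is not stated in the abstract.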
Related papers
- RIDER: 3D RNA Inverse Design with Reinforcement Learning-Guided Diffusion [19.386628516684695]
RIDER is an RNA Inverse DEsign framework with Reinforcement learning that directly optimizes for 3D structural similarity.
First, we develop and pre-train a GNN-based generative diffusion model conditioned on the target 3D structure.
Then, we fine-tune the model with an improved policy gradient algorithm using four task-specific reward functions based on 3D self-consistency metrics.
arXiv Detail & Related papers (2026-02-18T15:52:26Z)
- Regulatory DNA sequence Design with Reinforcement Learning [56.20290878358356]
We propose a generative approach that leverages reinforcement learning to fine-tune a pre-trained autoregressive model.
We evaluate our method on promoter design tasks in two yeast media conditions and enhancer design tasks for three human cell types.
arXiv Detail & Related papers (2025-03-11T02:33:33Z)
- RNACG: A Universal RNA Sequence Conditional Generation model based on Flow-Matching [0.0]
We propose RNACG (RNA Generator), a universal framework for RNA sequence design based on flow matching.
By unifying sequence generation under a single framework, RNACG enables the integration of multiple RNA design paradigms.
arXiv Detail & Related papers (2024-07-29T09:46:46Z)
- Bridging Sequence-Structure Alignment in RNA Foundation Models [7.068604225076706]
The alignment between RNA sequences and structures in foundation models (FMs) has yet to be investigated.
Existing FMs have struggled to establish sequence-structure alignment, hindering the free flow of genomic information.
We introduce OmniGenome, an RNA FM trained to align RNA sequences with respect to secondary structures based on structure-contextualised modelling.
arXiv Detail & Related papers (2024-07-15T21:10:40Z)
- BEACON: Benchmark for Comprehensive RNA Tasks and Language Models [60.02663015002029]
We introduce the first comprehensive RNA benchmark BEACON (BEnchmArk for COmprehensive RNA Task and Language Models).
First, BEACON comprises 13 distinct tasks derived from extensive previous work covering structural analysis, functional studies, and engineering applications.
Second, we examine a range of models, including traditional approaches like CNNs, as well as advanced RNA foundation models based on language models, offering valuable insights into the task-specific performances of these models.
Third, we investigate the vital RNA language model components
arXiv Detail & Related papers (2024-06-14T19:39:19Z)
- Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation [55.93511121486321]
We introduce FoldFlow-2, a novel sequence-conditioned flow matching model for protein structure generation.
We train FoldFlow-2 at scale on a new dataset that is an order of magnitude larger than PDB datasets of prior works.
We empirically observe that FoldFlow-2 outperforms previous state-of-the-art protein structure-based generative models.
arXiv Detail & Related papers (2024-05-30T17:53:50Z)
- gRNAde: Geometric Deep Learning for 3D RNA inverse design [14.729049204432027]
gRNAde is a geometric RNA design pipeline operating on 3D RNA backbones.
It generates sequences that explicitly account for structure and dynamics.
arXiv Detail & Related papers (2023-05-24T05:46:56Z)
- RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design [65.41144149958208]
This study aims to systematically construct a data-driven RNA design pipeline.
We crafted a benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure.
We incorporated extracted secondary structures with base pairs as prior knowledge to facilitate the RNA design process.
arXiv Detail & Related papers (2023-01-25T17:19:49Z)
- Accurate RNA 3D structure prediction using a language model-based deep learning approach [50.193512039121984]
RhoFold+ is an RNA language model-based deep learning method that accurately predicts 3D structures of single-chain RNAs from sequences.
RhoFold+ offers a fully automated end-to-end pipeline for RNA 3D structure prediction.
arXiv Detail & Related papers (2022-07-04T17:15:35Z)
- Improving RNA Secondary Structure Design using Deep Reinforcement Learning [69.63971634605797]
We propose a new benchmark of applying reinforcement learning to RNA sequence design, in which the objective function is defined to be the free energy in the sequence's secondary structure.
We show results of the ablation analysis that we do for these algorithms, as well as graphs indicating the algorithm's performance across batches.
arXiv Detail & Related papers (2021-11-05T02:54:06Z)