A Comprehensive Benchmark for RNA 3D Structure-Function Modeling
- URL: http://arxiv.org/abs/2503.21681v1
- Date: Thu, 27 Mar 2025 16:49:31 GMT
- Title: A Comprehensive Benchmark for RNA 3D Structure-Function Modeling
- Authors: Luis Wyss, Vincent Mallet, Wissam Karroucha, Karsten Borgwardt, Carlos Oliver,
- Abstract summary: We introduce a set of seven benchmarking datasets for RNA structure-function prediction.<n>Our library builds on the established Python library rnaglib, and offers easy data distribution and encoding.<n>We provide initial baseline results for all tasks using a graph neural network.
- Score: 1.3980986259786223
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The RNA structure-function relationship has recently garnered significant attention within the deep learning community, promising to grow in importance as nucleic acid structure models advance. However, the absence of standardized and accessible benchmarks for deep learning on RNA 3D structures has impeded the development of models for RNA functional characteristics. In this work, we introduce a set of seven benchmarking datasets for RNA structure-function prediction, designed to address this gap. Our library builds on the established Python library rnaglib, and offers easy data distribution and encoding, splitters and evaluation methods, providing a convenient all-in-one framework for comparing models. Datasets are implemented in a fully modular and reproducible manner, facilitating for community contributions and customization. Finally, we provide initial baseline results for all tasks using a graph neural network. Source code: https://github.com/cgoliver/rnaglib Documentation: https://rnaglib.org
Related papers
- Differentiable Folding for Nearest Neighbor Model Optimization [0.6291443816903801]
The Nearest Neighbor model is the $textitde facto$ thermodynamic model of RNA secondary structure formation.
Here, we leverage recent advances in $textitdifferentiable folding$ to devise an efficient, scalable, and flexible means of parameter optimization.
Our method yields a significantly improved parameter set that outperforms existing baselines on all metrics.
arXiv Detail & Related papers (2025-03-12T05:36:12Z) - BEACON: Benchmark for Comprehensive RNA Tasks and Language Models [60.02663015002029]
We introduce the first comprehensive RNA benchmark BEACON (textbfBEnchmtextbfArk for textbfCOmprehensive RtextbfNA Task and Language Models).<n>First, BEACON comprises 13 distinct tasks derived from extensive previous work covering structural analysis, functional studies, and engineering applications.<n>Second, we examine a range of models, including traditional approaches like CNNs, as well as advanced RNA foundation models based on language models, offering valuable insights into the task-specific performances of these models.<n>Third, we investigate the vital RNA language model components
arXiv Detail & Related papers (2024-06-14T19:39:19Z) - 3D-based RNA function prediction tools in rnaglib [2.048226951354646]
Building datasets of RNA 3D structures and making appropriate modeling choices remains time-consuming and lacks standardization.
We describe the use of rnaglib, to train supervised and unsupervised machine learning-based function prediction models on datasets of RNA 3D structures.
arXiv Detail & Related papers (2024-02-14T17:22:03Z) - RDesign: Hierarchical Data-efficient Representation Learning for
Tertiary Structure-based RNA Design [65.41144149958208]
This study aims to systematically construct a data-driven RNA design pipeline.
We crafted a benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure.
We incorporated extracted secondary structures with base pairs as prior knowledge to facilitate the RNA design process.
arXiv Detail & Related papers (2023-01-25T17:19:49Z) - Physics-aware Graph Neural Network for Accurate RNA 3D Structure
Prediction [20.276492931562036]
Given the limited number of experimentally determined RNA structures, the prediction of RNA structures will facilitate elucidating RNA functions and RNA-targeted drug discovery.
We propose a Graph Neural Network (GNN)-based scoring function trained only with the atomic types.
The proposed Physics-aware Multiplex Graph Neural Network (PaxNet) separately models the local and non-local interactions.
arXiv Detail & Related papers (2022-10-28T20:13:00Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general purpose, yet modular neural architecture called Neural Attentive Circuits (NACs)
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - Accurate RNA 3D structure prediction using a language model-based deep learning approach [50.193512039121984]
RhoFold+ is an RNA language model-based deep learning method that accurately predicts 3D structures of single-chain RNAs from sequences.<n>RhoFold+ offers a fully automated end-to-end pipeline for RNA 3D structure prediction.
arXiv Detail & Related papers (2022-07-04T17:15:35Z) - Improving RNA Secondary Structure Design using Deep Reinforcement
Learning [69.63971634605797]
We propose a new benchmark of applying reinforcement learning to RNA sequence design, in which the objective function is defined to be the free energy in the sequence's secondary structure.
We show results of the ablation analysis that we do for these algorithms, as well as graphs indicating the algorithm's performance across batches.
arXiv Detail & Related papers (2021-11-05T02:54:06Z) - Rank-R FNN: A Tensor-Based Learning Model for High-Order Data
Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.