VeRNAl: Mining RNA Structures for Fuzzy Base Pairing Network Motifs
- URL: http://arxiv.org/abs/2009.00664v3
- Date: Mon, 18 Oct 2021 14:45:41 GMT
- Title: VeRNAl: Mining RNA Structures for Fuzzy Base Pairing Network Motifs
- Authors: Carlos Oliver, Vincent Mallet, Pericles Philippopoulos, William L. Hamilton, Jerome Waldispuhl
- Abstract summary: RNA 3D motifs are recurrent substructures modelled as networks of base pair interactions.
We propose a set of node similarity functions, clustering methods, and motif construction algorithms to recover flexible RNA motifs.
VeRNAl can be easily customized by users to desired levels of motif flexibility, abundance and size.
- Score: 13.990800077082843
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: RNA 3D motifs are recurrent substructures, modelled as networks of base pair
interactions, which are crucial for understanding structure-function
relationships. The task of automatically identifying such motifs is
computationally hard, and remains a key challenge in the field of RNA
structural biology and network analysis. State-of-the-art methods solve special
cases of the motif problem by constraining the structural variability in
occurrences of a motif, and narrowing the substructure search space. Here, we
relax these constraints by posing the motif finding problem as a graph
representation learning and clustering task. This framing takes advantage of
the continuous nature of graph representations to model the flexibility and
variability of RNA motifs in an efficient manner. We propose a set of node
similarity functions, clustering methods, and motif construction algorithms to
recover flexible RNA motifs. Our tool, VeRNAl, can be easily customized by users
to desired levels of motif flexibility, abundance and size. We show that VeRNAl
is able to retrieve and expand known classes of motifs, as well as to propose
novel motifs.
Related papers
- BEACON: Benchmark for Comprehensive RNA Tasks and Language Models [60.02663015002029]
We introduce the first comprehensive RNA benchmark BEACON (BEnchmArk for COmprehensive RNA Task and Language Models).
First, BEACON comprises 13 distinct tasks derived from extensive previous work covering structural analysis, functional studies, and engineering applications.
Second, we examine a range of models, including traditional approaches like CNNs, as well as advanced RNA foundation models based on language models, offering valuable insights into the task-specific performances of these models.
Third, we investigate the vital RNA language model components
arXiv Detail & Related papers (2024-06-14T19:39:19Z)
- RNAFlow: RNA Structure & Sequence Design via Inverse Folding-Based Flow Matching [7.600990806121113]
RNAFlow is a flow matching model for protein-conditioned RNA sequence-structure design.
Its denoising network integrates an RNA inverse folding model and a pre-trained RosettaFold2NA network for generation of RNA sequences and structures.
arXiv Detail & Related papers (2024-05-29T05:10:25Z)
- 3D-based RNA function prediction tools in rnaglib [2.048226951354646]
Building datasets of RNA 3D structures and making appropriate modeling choices remains time-consuming and lacks standardization.
We describe the use of rnaglib to train supervised and unsupervised machine learning-based function prediction models on datasets of RNA 3D structures.
arXiv Detail & Related papers (2024-02-14T17:22:03Z)
- Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z)
- DynGFN: Towards Bayesian Inference of Gene Regulatory Networks with GFlowNets [81.75973217676986]
Gene regulatory networks (GRN) describe interactions between genes and their products that control gene expression and cellular function.
Existing methods either focus on challenge (1), identifying cyclic structure from dynamics, or on challenge (2), learning complex Bayesian posteriors over DAGs, but not both.
In this paper we leverage the fact that it is possible to estimate the "velocity" of gene expression with RNA velocity techniques to develop an approach that addresses both challenges.
arXiv Detail & Related papers (2023-02-08T16:36:40Z)
- RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design [65.41144149958208]
This study aims to systematically construct a data-driven RNA design pipeline.
We crafted a benchmark dataset and designed a comprehensive structural modeling approach to represent the complex RNA tertiary structure.
We incorporated extracted secondary structures with base pairs as prior knowledge to facilitate the RNA design process.
arXiv Detail & Related papers (2023-01-25T17:19:49Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Classification of Long Noncoding RNA Elements Using Deep Convolutional Neural Networks and Siamese Networks [17.8181080354116]
This thesis proposes a new method employing deep convolutional neural networks (CNNs) to classify ncRNA sequences.
As a result, classifying RNA sequences is converted to an image classification problem that can be efficiently solved by CNN-based classification models.
arXiv Detail & Related papers (2021-02-10T17:26:38Z)
- Neural representation and generation for RNA secondary structures [14.583976833366384]
Our work is concerned with the generation and targeted design of RNA, a type of genetic macromolecule.
The design of large scale and complex biological structures spurs dedicated graph-based deep generative modeling techniques.
We propose a flexible framework to jointly embed and generate different RNA structural modalities.
arXiv Detail & Related papers (2021-02-01T15:49:25Z)