Related papers: Mimetic Neural Networks: A unified framework for Protein Design and Folding

Related papers

ProteinZero: Self-Improving Protein Generation via Online Reinforcement Learning [49.2607661375311]
We present ProteinZero, a novel framework that enables computationally scalable, automated, and continuous self-improvement of the inverse folding model. ProteinZero substantially outperforms existing methods across every key metric in protein design. Notably, the entire RL run on CATH-4.3 can be done with a single 8-GPU node in under 3 days, including reward.
arXiv Detail & Related papers (2025-06-09T06:08:59Z)
Protein Design with Dynamic Protein Vocabulary [22.358650729894443]
We introduce ProDVa, a novel protein design approach that integrates a text encoder for functional descriptions, a protein language model for designing proteins, and a fragment encoder to dynamically retrieve protein fragments. Compared to state-of-the-art models, ProDVa achieves comparable function alignment using less than 0.04% of the training data, while designing significantly more well-folded proteins.
arXiv Detail & Related papers (2025-05-25T03:50:50Z)
ProteinWeaver: A Divide-and-Assembly Approach for Protein Backbone Design [61.19456204667385]
We introduce ProteinWeaver, a two-stage framework for protein backbone design. ProteinWeaver generates high-quality, novel protein backbones through versatile domain assembly. By introducing a 'divide-and-assembly' paradigm, ProteinWeaver advances protein engineering and opens new avenues for functional protein design.
arXiv Detail & Related papers (2024-11-08T08:10:49Z)
Model-based reinforcement learning for protein backbone design [1.7383284836821535]
We propose the use of AlphaZero to generate protein backbones, meeting shape and structural scoring requirements. We extend an existing Monte Carlo tree search (MCTS) framework by incorporating a novel threshold-based reward and secondary objectives. AlphaZero consistently surpasses baseline MCTS by more than 100% in top-down protein design tasks.
arXiv Detail & Related papers (2024-05-03T10:24:33Z)
NaNa and MiGu: Semantic Data Augmentation Techniques to Enhance Protein Classification in Graph Neural Networks [60.48306899271866]
We propose novel semantic data augmentation methods to incorporate backbone chemical and side-chain biophysical information into protein classification tasks. Specifically, we leverage molecular biophysical, secondary structure, chemical bond, and ionic features of proteins to facilitate classification tasks.
arXiv Detail & Related papers (2024-03-21T13:27:57Z)
Enhancing Protein Predictive Models via Proteins Data Augmentation: A Benchmark and New Directions [58.819567030843025]
This paper extends data augmentation techniques previously used for images and texts to proteins and then benchmarks these techniques on a variety of protein-related tasks. We propose two novel semantic-level protein augmentation methods, namely Integrated Gradients Substitution and Back Translation Substitution. Finally, we integrate extended and proposed augmentations into an augmentation pool and propose a simple but effective framework, namely Automated Protein Augmentation (APA).
arXiv Detail & Related papers (2024-03-01T07:58:29Z)
Structure-informed Language Models Are Protein Designers [69.70134899296912]
We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs). We conduct a structural surgery on pLMs, where a lightweight structural adapter is implanted into pLMs and endows them with structural awareness. Experiments show that our approach outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-02-03T10:49:52Z)
Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds [0.0]
Structure-based protein design aims to find structures that are designable, novel, and diverse. Generative models provide a compelling alternative, by implicitly learning the low-dimensional structure of complex data. We develop Genie, a generative model of protein structures that performs discrete-time diffusion using a cloud of oriented reference frames in 3D space.
arXiv Detail & Related papers (2023-01-29T16:44:19Z)
Learning the shape of protein micro-environments with a holographic convolutional neural network [0.0]
We introduce Holographic Convolutional Neural Network (H-CNN) for proteins. H-CNN is a physically motivated machine learning approach to model amino acid preferences in protein structures. It accurately predicts the impact of mutations on protein function, including stability and binding of protein complexes.
arXiv Detail & Related papers (2022-11-05T16:29:15Z)
Contrastive Representation Learning for 3D Protein Structures [13.581113136149469]
We introduce a new representation learning framework for 3D protein structures. Our framework uses unsupervised contrastive learning to learn meaningful representations of protein structures. We show how these representations can be used to solve a large variety of tasks, such as protein function prediction, protein fold classification, structural similarity prediction, and protein-ligand binding affinity prediction.
arXiv Detail & Related papers (2022-05-31T10:33:06Z)
Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein. Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules. Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
Structure-aware Protein Self-supervised Learning [50.04673179816619]
We propose a novel structure-aware protein self-supervised learning method to capture structural information of proteins. In particular, a well-designed graph neural network (GNN) model is pretrained to preserve the protein structural information. We identify the relation between the sequential information in the protein language model and the structural information in the specially designed GNN model via a novel pseudo bi-level optimization scheme.
arXiv Detail & Related papers (2022-04-06T02:18:41Z)
Transfer Learning for Protein Structure Classification at Low Resolution [124.5573289131546]
We show that it is possible to make accurate ($\geq$80%) predictions of protein class and architecture from structures determined at low ($\leq$3Å) resolution. We provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function.
arXiv Detail & Related papers (2020-08-11T15:01:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.