Solvent: A Framework for Protein Folding
- URL: http://arxiv.org/abs/2307.04603v5
- Date: Mon, 31 Jul 2023 05:29:16 GMT
- Title: Solvent: A Framework for Protein Folding
- Authors: Jaemyung Lee, Kyeongtak Han, Jaehoon Kim, Hasun Yu, Youhan Lee
- Abstract summary: After AlphaFold2, the protein folding task has entered a new phase, and many methods have been proposed based on components of AlphaFold2.
A unified research framework for protein folding, with implementations and benchmarks, is important for comparing various approaches consistently and fairly.
We present Solvent, a protein folding framework that supports significant components of state-of-the-art models through an off-the-shelf interface.
- Score: 0.39373541926236766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Consistency and reliability are crucial for conducting AI research. In
many established research fields, such as object detection, methods have been
compared and validated with solid benchmark frameworks. After AlphaFold2, the
protein folding task has entered a new phase, and many methods have been
proposed based on components of AlphaFold2. A unified research framework for
protein folding, with implementations and benchmarks, is important for
comparing various approaches consistently and fairly. To achieve this, we
present Solvent, a protein folding framework that supports significant
components of state-of-the-art models through an off-the-shelf interface. Solvent
contains different models implemented in a unified codebase and supports
training and evaluation for defined models on the same dataset. We benchmark
well-known algorithms and their components and provide experiments that give
helpful insights into the protein structure modeling field. We hope that
Solvent will increase the reliability and consistency of proposed models and
improve efficiency in both speed and cost, thereby accelerating protein
folding modeling research. The code is available at
https://github.com/kakaobrain/solvent, and the project will continue to be
developed.
Related papers
- ProteinBench: A Holistic Evaluation of Protein Foundation Models [53.59325047872512]
We introduce ProteinBench, a holistic evaluation framework for protein foundation models.
Our approach consists of three key components: (i) A taxonomic classification of tasks that broadly encompass the main challenges in the protein domain, based on the relationships between different protein modalities; (ii) A multi-metric evaluation approach that assesses performance across four key dimensions: quality, novelty, diversity, and robustness; and (iii) In-depth analyses from various user objectives, providing a holistic view of model performance.
arXiv Detail & Related papers (2024-09-10T06:52:33Z)
- Progressive Multi-Modality Learning for Inverse Protein Folding [47.095862120116976]
We propose a novel protein design paradigm called MMDesign, which leverages multi-modality transfer learning.
MMDesign is the first framework that combines a pretrained structural module with a pretrained contextual module, using an auto-encoder (AE) based language model to incorporate prior protein semantic knowledge.
Experimental results show that MMDesign, trained only on a small dataset, consistently outperforms baselines on various public benchmarks.
arXiv Detail & Related papers (2023-12-11T10:59:23Z)
- Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict thermostability changes in proteins upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z)
- PDB-Struct: A Comprehensive Benchmark for Structure-based Protein Design [19.324059406159325]
We introduce two novel metrics: a refoldability-based metric and a stability-based metric.
ByProt, ProteinMPNN, and ESM-IF perform exceptionally well on our benchmark, while ESM-Design and AF-Design fall short.
Our proposed benchmark paves the way for a fair and comprehensive evaluation of protein design methods.
arXiv Detail & Related papers (2023-11-30T02:37:55Z)
- FABind: Fast and Accurate Protein-Ligand Binding [127.7790493202716]
FABind is an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding.
Our proposed model demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods.
arXiv Detail & Related papers (2023-10-10T16:39:47Z)
- PDBench: Evaluating Computational Methods for Protein Sequence Design [2.0187324832551385]
We present a benchmark set of proteins and propose tests to assess the performance of deep learning based methods.
Our robust benchmark provides biological insight into the behaviour of design methods, which is essential for evaluating their performance and utility.
arXiv Detail & Related papers (2021-09-16T12:20:03Z)
- Improving Label Quality by Jointly Modeling Items and Annotators [68.8204255655161]
We propose a fully Bayesian framework for learning ground truth labels from noisy annotators.
Our framework ensures scalability by factoring a generative, Bayesian soft clustering model over label distributions into the classic Dawid and Skene joint annotator-data model.
arXiv Detail & Related papers (2021-06-20T02:15:20Z)
- A generalized framework for active learning reliability: survey and benchmark [0.0]
We propose a modular framework to build on-the-fly efficient active learning strategies.
We devise 39 strategies for the solution of 20 reliability benchmark problems.
arXiv Detail & Related papers (2021-06-03T09:33:59Z)
- Energy-based models for atomic-resolution protein conformations [88.68597850243138]
We propose an energy-based model (EBM) of protein conformations that operates at atomic scale.
The model is trained solely on crystallized protein data.
An investigation of the model's outputs and hidden representations finds that it captures physicochemical properties relevant to protein energy.
arXiv Detail & Related papers (2020-04-27T20:45:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.