UniZyme: A Unified Protein Cleavage Site Predictor Enhanced with Enzyme Active-Site Knowledge
- URL: http://arxiv.org/abs/2502.06914v2
- Date: Wed, 12 Feb 2025 16:47:32 GMT
- Title: UniZyme: A Unified Protein Cleavage Site Predictor Enhanced with Enzyme Active-Site Knowledge
- Authors: Chenao Li, Shuo Yan, Enyan Dai
- Abstract summary: We introduce a unified protein cleavage site predictor named UniZyme, which can generalize across diverse enzymes.
Experiments demonstrate that UniZyme achieves high accuracy in predicting cleavage sites across a range of proteolytic enzymes.
- Abstract: Enzyme-catalyzed protein cleavage is essential for many biological functions. Accurate prediction of cleavage sites can facilitate various applications such as drug development, enzyme design, and a deeper understanding of biological mechanisms. However, most existing models are restricted to an individual enzyme, which neglects knowledge shared across enzymes and fails to generalize to novel enzymes. Thus, we introduce a unified protein cleavage site predictor named UniZyme, which can generalize across diverse enzymes. To enhance the enzyme encoding for protein cleavage site prediction, UniZyme employs a novel biochemically-informed model architecture along with active-site knowledge of proteolytic enzymes. Extensive experiments demonstrate that UniZyme achieves high accuracy in predicting cleavage sites across a range of proteolytic enzymes, including unseen enzymes. The code is available at https://anonymous.4open.science/r/UniZyme-4A67.
Related papers
- Interpretable Enzyme Function Prediction via Residue-Level Detection [58.30647671797602]
We present an attention-based framework, namely ProtDETR, for enzyme function prediction.
It uses a set of learnable functional queries to adaptively extract different local representations from the sequence of residue-level features.
ProtDETR significantly outperforms existing deep learning-based enzyme function prediction methods.
arXiv Detail & Related papers (2025-01-10T01:02:43Z) - Reaction-conditioned De Novo Enzyme Design with GENzyme [64.14088142258498]
GENzyme is a de novo enzyme design model that takes a catalytic reaction as input and generates the catalytic pocket, full enzyme structure, and enzyme-substrate binding complex.
GENzyme is an end-to-end, three-staged model that integrates (1) a catalytic pocket generation and sequence co-design module, (2) a pocket inpainting and enzyme inverse folding module, and (3) a binding and screening module to optimize and predict enzyme-substrate complexes.
arXiv Detail & Related papers (2024-11-10T00:37:26Z) - EnzymeFlow: Generating Reaction-specific Enzyme Catalytic Pockets through Flow Matching and Co-Evolutionary Dynamics [51.47520281819253]
Enzyme design is a critical area in biotechnology, with applications ranging from drug development to synthetic biology.
Traditional methods for enzyme function prediction or protein binding pocket design often fall short in capturing the dynamic and complex nature of enzyme-substrate interactions.
We introduce EnzymeFlow, a generative model that employs flow matching with hierarchical pre-training and enzyme-reaction co-evolution to generate catalytic pockets.
arXiv Detail & Related papers (2024-10-01T02:04:01Z) - ReactZyme: A Benchmark for Enzyme-Reaction Prediction [41.33939896203491]
We introduce a new approach to annotating enzymes based on their catalyzed reactions.
We employ machine learning algorithms to analyze enzyme reaction datasets.
We frame the enzyme-reaction prediction as a retrieval problem, aiming to rank enzymes by their catalytic ability for specific reactions.
arXiv Detail & Related papers (2024-08-24T19:19:33Z) - Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates [16.5169461287914]
We propose EnzyGen, an approach to learn a unified model to design enzymes across all functional families.
Our key idea is to generate an enzyme's amino acid sequence and its 3D coordinates based on functionally important sites and substrates corresponding to a desired catalytic function.
arXiv Detail & Related papers (2024-05-13T21:48:48Z) - ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases.
Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions.
We propose a novel framework ProLLM that employs an LLM tailored for PPI for the first time.
arXiv Detail & Related papers (2024-03-30T05:32:42Z) - A Latent Diffusion Model for Protein Structure Generation [50.74232632854264]
We propose a latent diffusion model that can reduce the complexity of protein modeling.
We show that our method can effectively generate novel protein backbone structures with high designability and efficiency.
arXiv Detail & Related papers (2023-05-06T19:10:19Z) - Machine learning modeling of family wide enzyme-substrate specificity screens [2.276367922551686]
Biocatalysis is a promising approach to synthesize pharmaceuticals, complex natural products, and commodity chemicals at scale.
The adoption of biocatalysis is limited by our ability to select enzymes that will catalyze their natural chemical transformation on non-natural substrates.
arXiv Detail & Related papers (2021-09-08T19:44:42Z)