Diffusion Model with Representation Alignment for Protein Inverse Folding
- URL: http://arxiv.org/abs/2412.09380v1
- Date: Thu, 12 Dec 2024 15:47:59 GMT
- Title: Diffusion Model with Representation Alignment for Protein Inverse Folding
- Authors: Chenglin Wang, Yucheng Zhou, Zijie Zhai, Jianbing Shen, Kai Zhang
- Abstract summary: Protein inverse folding is a fundamental problem in bioinformatics, aiming to recover the amino acid sequences from a given protein backbone structure.
We propose a novel method that leverages diffusion models with representation alignment (DMRA).
In experiments, we conduct extensive evaluations on the CATH4.2 dataset to demonstrate that DMRA outperforms leading methods.
- Score: 53.139837825588614
- Abstract: Protein inverse folding is a fundamental problem in bioinformatics, aiming to recover the amino acid sequences from a given protein backbone structure. Despite the success of existing methods, they struggle to fully capture the intricate inter-residue relationships critical for accurate sequence prediction. We propose a novel method that leverages diffusion models with representation alignment (DMRA), which enhances diffusion-based inverse folding by (1) proposing a shared center that aggregates contextual information from the entire protein structure and selectively distributes it to each residue; and (2) aligning noisy hidden representations with clean semantic representations during the denoising process. This is achieved by predefined semantic representations for amino acid types and a representation alignment method that utilizes type embeddings as semantic feedback to normalize each residue. In experiments, we conduct extensive evaluations on the CATH4.2 dataset to demonstrate that DMRA outperforms leading methods, achieving state-of-the-art performance and exhibiting strong generalization capabilities on the TS50 and TS500 datasets.
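To make the abstract's two ideas concrete, the sketch below illustrates (1) a toy "shared center" that pools contextual information over all residues and gates how much of it each residue receives, and (2) an alignment loss that pulls noisy per-residue hidden states toward clean, predefined amino-acid type embeddings during denoising. This is a minimal PyTorch illustration under assumed names and settings (RepresentationAlignment, shared_center, hidden_dim=128, a cosine-similarity loss, mean pooling with a sigmoid gate), not the authors' implementation.
```python
# Hedged sketch of the DMRA ideas; names, dimensions, and loss form are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_AMINO_ACIDS = 20  # standard residue types


class RepresentationAlignment(nn.Module):
    """Align noisy per-residue hidden states with clean, predefined
    semantic embeddings of amino acid types during denoising."""

    def __init__(self, hidden_dim: int = 128):
        super().__init__()
        # One (here learnable) semantic embedding per amino acid type.
        self.type_embeddings = nn.Embedding(NUM_AMINO_ACIDS, hidden_dim)

    def forward(self, hidden: torch.Tensor, target_types: torch.Tensor) -> torch.Tensor:
        """
        hidden:       (num_residues, hidden_dim) noisy hidden representations
        target_types: (num_residues,) ground-truth amino acid indices
        Returns a scalar alignment loss (1 - cosine similarity).
        """
        clean = self.type_embeddings(target_types)        # clean semantic targets
        cos = F.cosine_similarity(hidden, clean, dim=-1)  # per-residue similarity
        return (1.0 - cos).mean()


def shared_center(residue_feats: torch.Tensor, gate: nn.Linear) -> torch.Tensor:
    """Toy 'shared center': pool context over all residues, then gate
    how much of that global context each residue receives."""
    center = residue_feats.mean(dim=0, keepdim=True)  # global context (1, d)
    weights = torch.sigmoid(gate(residue_feats))      # per-residue gates (n, d)
    return residue_feats + weights * center           # selective distribution


if __name__ == "__main__":
    # Minimal usage example with random tensors.
    n, d = 64, 128
    feats = torch.randn(n, d)
    feats = shared_center(feats, nn.Linear(d, d))
    align = RepresentationAlignment(hidden_dim=d)
    loss = align(feats, torch.randint(0, NUM_AMINO_ACIDS, (n,)))
    print(loss.item())
```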
Related papers
- Mask prior-guided denoising diffusion improves inverse protein folding [3.1373465343833704]
Inverse protein folding generates valid amino acid sequences that can fold into a desired protein structure.
We propose a framework that captures both structural and residue interactions for inverse protein folding.
MapDiff is a discrete diffusion probabilistic model that iteratively generates amino acid sequences with reduced noise.
arXiv Detail & Related papers (2024-12-10T09:10:28Z)
- SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models.
It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features.
Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z)
- ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z)
- DiAMoNDBack: Diffusion-denoising Autoregressive Model for Non-Deterministic Backmapping of Cα Protein Traces [0.0]
DiAMoNDBack is an autoregressive denoising diffusion probabilistic model for non-deterministic backmapping.
We train DiAMoNDBack on 65k+ structures from the Protein Data Bank (PDB) and validate it on a held-out PDB test set.
We make DiAMoNDBack publicly available as a free and open source Python package.
arXiv Detail & Related papers (2023-07-23T23:05:08Z)
- Graph Denoising Diffusion for Inverse Protein Folding [15.06549999760776]
Inverse protein folding is challenging due to its inherent one-to-many mapping characteristic.
We propose a novel graph denoising diffusion model for inverse protein folding.
Our model achieves state-of-the-art performance over a set of popular baseline methods in sequence recovery.
arXiv Detail & Related papers (2023-06-29T09:55:30Z)
- Protein Design with Guided Discrete Diffusion [67.06148688398677]
A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling.
We propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models.
NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods.
arXiv Detail & Related papers (2023-05-31T16:31:24Z)
- DDS2M: Self-Supervised Denoising Diffusion Spatio-Spectral Model for Hyperspectral Image Restoration [103.79030498369319]
A self-supervised diffusion model for hyperspectral image restoration is proposed.
DDS2M enjoys stronger generalization ability than existing diffusion-based methods.
Experiments on HSI denoising, noisy HSI completion, and super-resolution on a variety of HSIs demonstrate DDS2M's superiority over existing task-specific state-of-the-art methods.
arXiv Detail & Related papers (2023-03-12T14:57:04Z)
- Bayesian Reconstruction and Differential Testing of Excised mRNA [0.0]
We develop the first probabilistic model that reconciles the transcript and local splicing perspectives.
We present a novel hierarchical Bayesian admixture model for the Reconstruction of Excised mRNA (BREM).
BREM interpolates between local splicing events and full-length transcripts and thus focuses only on SMEs that have high posterior probability.
arXiv Detail & Related papers (2022-11-14T04:46:33Z)
- Orthogonalization of data via Gromov-Wasserstein type feedback for clustering and visualization [5.44192123671277]
We propose an adaptive approach for clustering and visualization of data by an orthogonalization process.
We prove that the method converges globally to a unique fixpoint for certain parameter values.
We confirm that the method produces biologically meaningful clustering results consistent with human expert classification.
arXiv Detail & Related papers (2022-07-25T15:52:11Z)
- Unsupervised Contrastive Domain Adaptation for Semantic Segmentation [75.37470873764855]
We introduce contrastive learning for feature alignment in cross-domain adaptation.
The proposed approach consistently outperforms state-of-the-art methods for domain adaptation.
It achieves 60.2% mIoU on the Cityscapes dataset.
arXiv Detail & Related papers (2022-04-18T16:50:46Z)