A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide
Generation
- URL: http://arxiv.org/abs/2312.15665v2
- Date: Thu, 4 Jan 2024 02:32:33 GMT
- Title: A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide
Generation
- Authors: Yongkang Wang, Xuan Liu, Feng Huang, Zhankun Xiong, Wen Zhang
- Abstract summary: We propose a Multi-Modal Contrastive Diffusion model, fusing both sequence and structure modalities in a diffusion framework to co-generate novel peptide sequences and structures.
MMCD performs better than other state-of-the-art deep generative methods in generating therapeutic peptides across various metrics, including antimicrobial/anticancer score, diversity, and peptide-docking.
- Score: 7.779658935195194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Therapeutic peptides represent a unique class of pharmaceutical agents
crucial for the treatment of human diseases. Recently, deep generative models
have exhibited remarkable potential for generating therapeutic peptides, but
they only utilize sequence or structure information alone, which hinders the
performance in generation. In this study, we propose a Multi-Modal Contrastive
Diffusion model (MMCD), fusing both sequence and structure modalities in a
diffusion framework to co-generate novel peptide sequences and structures.
Specifically, MMCD constructs the sequence-modal and structure-modal diffusion
models, respectively, and devises a multi-modal contrastive learning strategy
with inter-contrastive and intra-contrastive objectives in each diffusion timestep, aiming
to capture the consistency between two modalities and boost model performance.
The inter-contrastive aligns sequences and structures of peptides by maximizing
the agreement of their embeddings, while the intra-contrastive differentiates
therapeutic and non-therapeutic peptides by maximizing the disagreement of
their sequence/structure embeddings simultaneously. Extensive experiments
demonstrate that MMCD performs better than other state-of-the-art deep
generative methods in generating therapeutic peptides across various metrics,
including antimicrobial/anticancer score, diversity, and peptide-docking.
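To make the two contrastive terms concrete, the following is a minimal illustrative sketch in PyTorch, not the authors' released code: an InfoNCE-style inter-contrastive loss aligning each peptide's sequence and structure embeddings, and an intra-contrastive penalty pushing therapeutic and non-therapeutic embeddings apart. Function names, the temperature, and the exact loss shapes are assumptions.

```python
# Illustrative sketch of the inter-/intra-contrastive objectives described in
# the abstract; assumes per-peptide sequence/structure embeddings at a given
# diffusion timestep. All names and loss shapes are hypothetical.
import torch
import torch.nn.functional as F

def inter_contrastive(seq_emb: torch.Tensor, str_emb: torch.Tensor, tau: float = 0.1):
    """Align sequence and structure embeddings of the same peptide (InfoNCE)."""
    seq = F.normalize(seq_emb, dim=-1)
    stc = F.normalize(str_emb, dim=-1)
    logits = seq @ stc.t() / tau                # (B, B) cross-modal similarities
    targets = torch.arange(seq.size(0), device=seq.device)
    # Diagonal pairs (same peptide) are positives; all other peptides are negatives.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def intra_contrastive(emb: torch.Tensor, is_therapeutic: torch.Tensor, tau: float = 0.1):
    """Push therapeutic and non-therapeutic embeddings of one modality apart."""
    z = F.normalize(emb, dim=-1)
    sim = z @ z.t() / tau                       # (B, B) within-modal similarities
    cross_class = is_therapeutic.unsqueeze(0) != is_therapeutic.unsqueeze(1)
    # Penalize similarity across the two classes, i.e. maximize disagreement.
    return F.softplus(sim[cross_class]).mean()
```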
Related papers
- MIND: Modality-Informed Knowledge Distillation Framework for Multimodal Clinical Prediction Tasks [50.98856172702256]
We propose the Modality-INformed knowledge Distillation (MIND) framework, a multimodal model compression approach.
MIND transfers knowledge from ensembles of pre-trained deep neural networks of varying sizes into a smaller multimodal student.
We evaluate MIND on binary and multilabel clinical prediction tasks using time series data and chest X-ray images.
arXiv Detail & Related papers (2025-02-03T08:50:00Z)
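The ensemble-to-student transfer MIND describes can be illustrated with a standard Hinton-style distillation loss; the sketch below is a generic formulation with hypothetical names and hyperparameters, not the MIND implementation.

```python
# Illustrative distillation loss for an ensemble of teachers; blends a
# hard-label term with the KL divergence to the averaged, temperature-softened
# teacher predictions. Not the MIND code; names are hypothetical.
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits_list, labels, T=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with soft targets from a teacher ensemble."""
    # Average the ensemble's temperature-softened predictions.
    soft_targets = torch.stack(
        [F.softmax(t / T, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  soft_targets, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```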
- PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion [2.6668932659159905]
We present PepTune, a multi-objective discrete diffusion model for the simultaneous generation and optimization of therapeutic peptide SMILES.
We generate diverse, chemically modified peptides optimized for multiple therapeutic properties, including target binding affinity, membrane permeability, solubility, hemolysis, and non-fouling characteristics.
In total, our results demonstrate that PepTune is a powerful and modular approach for multi-objective sequence design in discrete state spaces.
arXiv Detail & Related papers (2024-12-23T18:38:49Z)
- Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations.
Deep generative models have shown promise in generating protein conformations as a more efficient alternative.
We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z)
- DPLM-2: A Multimodal Diffusion Protein Language Model [75.98083311705182]
We introduce DPLM-2, a multimodal protein foundation model that extends discrete diffusion protein language model (DPLM) to accommodate both sequences and structures.
DPLM-2 learns the joint distribution of sequence and structure, as well as their marginals and conditionals.
Empirical evaluation shows that DPLM-2 can simultaneously generate highly compatible amino acid sequences and their corresponding 3D structures.
arXiv Detail & Related papers (2024-10-17T17:20:24Z)
- PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
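The mean-teacher mechanism underlying PMT-style methods is compact enough to sketch; below is a generic exponential-moving-average teacher update (a standard mean-teacher ingredient, not the PMT code), in which the teacher that produces pseudo labels tracks a smoothed copy of the student.

```python
# Generic mean-teacher EMA update; illustrative, not the PMT implementation.
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               momentum: float = 0.99):
    """Teacher weights track an EMA of the student, yielding stabler pseudo labels."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```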
- MoFormer: Multi-objective Antimicrobial Peptide Generation Based on Conditional Transformer Joint Multi-modal Fusion Descriptor [15.98003148948758]
We establish a multi-objective AMP synthesis pipeline (MoFormer) for the simultaneous optimization of multiple attributes of AMPs.
MoFormer improves the desired attributes of AMP sequences in a highly structured latent space, guided by conditional constraints and fine-grained multi-modal fusion descriptors.
We show that MoFormer outperforms existing methods in the generation task of enhanced antimicrobial activity and minimal hemolysis.
arXiv Detail & Related papers (2024-06-03T07:17:18Z)
- Diffusion on language model encodings for protein sequence generation [0.5182791771937247]
We present DiMA, a latent diffusion framework that operates on protein language model representations.
Our framework consistently produces novel, high-quality and diverse protein sequences.
It supports conditional generation tasks including protein family generation, motif scaffolding and infilling, and fold-specific sequence design.
arXiv Detail & Related papers (2024-03-06T14:15:20Z)
- MATE-Pred: Multimodal Attention-based TCR-Epitope interaction Predictor [1.933856957193398]
An accurate prediction of binding between T-cell receptors and epitopes contributes decisively to successful immunotherapy strategies.
Here, we propose a highly reliable novel method, MATE-Pred, that performs attention-based prediction of the binding affinity regimes between T-cell receptors and epitopes.
The performance of MATE-Pred points to its potential application in drug discovery.
arXiv Detail & Related papers (2023-12-05T11:30:00Z)
- Co-modeling the Sequential and Graphical Routes for Peptide Representation Learning [67.66393016797181]
We propose a peptide co-modeling method, RepCon, to enhance the mutual information of representations from decoupled sequential and graphical end-to-end models.
RepCon learns to enhance the consistency of representations between positive sample pairs and to repel representations between negative pairs.
Our results demonstrate the superiority of the co-modeling approach over independent modeling, as well as the superiority of RepCon over other methods under the co-modeling framework.
arXiv Detail & Related papers (2023-10-04T16:58:25Z)
- Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning [52.72369034247396]
We propose the diffusion glancing transformer, which employs a modality diffusion process and residual glancing sampling.
DIFFGLAT achieves better generation accuracy while maintaining fast decoding speed compared with both autoregressive and non-autoregressive models.
arXiv Detail & Related papers (2022-12-20T13:36:25Z)