Morphology-Specific Peptide Discovery via Masked Conditional Generative Modeling
- URL: http://arxiv.org/abs/2509.02060v2
- Date: Thu, 04 Sep 2025 08:13:53 GMT
- Title: Morphology-Specific Peptide Discovery via Masked Conditional Generative Modeling
- Authors: Nuno Costa, Julija Zavadlav
- Abstract summary: PepMorph is an end-to-end peptide discovery pipeline. It generates sequences that are not only prone to aggregate but also self-assemble into a specified fibrillar or spherical morphology.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Peptide self-assembly prediction offers a powerful bottom-up strategy for designing biocompatible, low-toxicity materials for large-scale synthesis in a broad range of biomedical and energy applications. However, screening the vast sequence space for categorization of aggregate morphology remains intractable. We introduce PepMorph, an end-to-end peptide discovery pipeline that generates novel sequences that are not only prone to aggregate but self-assemble into a specified fibrillar or spherical morphology. We compiled a new dataset by leveraging existing aggregation propensity datasets and extracting geometric and physicochemical isolated peptide descriptors that act as proxies for aggregate morphology. This dataset is then used to train a Transformer-based Conditional Variational Autoencoder with a masking mechanism, which generates novel peptides under arbitrary conditioning. After filtering to ensure design specifications and validation of generated sequences through coarse-grained molecular dynamics simulations, PepMorph yielded 83% accuracy in intended morphology generation, showcasing its promise as a framework for application-driven peptide discovery.
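The abstract does not spell out how the masking mechanism works, so the following is only a minimal sketch of one common way to support "arbitrary conditioning": each descriptor is paired with a flag saying whether it was specified, and unspecified descriptors are zeroed out. The function names, the three example descriptor values, and the random-masking scheme are all illustrative assumptions, not the authors' implementation.

```python
import random


def mask_conditions(values, specified):
    """Build a masked conditioning vector (hypothetical helper).

    values:    descriptor values, one per conditioning property.
    specified: booleans, True where the property should be constrained.

    Returns the masked values followed by the 0/1 flags, so a downstream
    model can tell "descriptor equals 0" apart from "descriptor not given".
    """
    if len(values) != len(specified):
        raise ValueError("values and specified must have the same length")
    masked = [float(v) if keep else 0.0 for v, keep in zip(values, specified)]
    flags = [1.0 if keep else 0.0 for keep in specified]
    return masked + flags


def random_mask(n_conditions, keep_prob, rng):
    """Sample a training-time mask keeping each condition with prob. keep_prob.

    Assumption: randomly hiding conditions during training is what lets a
    conditional model accept any subset of conditions at generation time.
    """
    return [rng.random() < keep_prob for _ in range(n_conditions)]


# Example: constrain the first and third descriptors, leave the second free.
condition_vector = mask_conditions([0.8, -1.2, 0.0], [True, False, True])
```

In this sketch the flags double the length of the conditioning vector; that cost buys an unambiguous encoding of missing conditions, which a plain zero-fill would not provide.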
Related papers
- Surface-based Molecular Design with Multi-modal Flow Matching [64.00572241268597]
SurfFlow is a novel surface-based generative algorithm that enables comprehensive co-design of sequence, structure, and surface for peptides. Evaluated on the comprehensive PepMerge benchmark, SurfFlow consistently outperforms full-atom baselines across all metrics.
arXiv Detail & Related papers (2026-01-08T02:19:29Z)
- Probabilistic Predictions of Process-Induced Deformation in Carbon/Epoxy Composites Using a Deep Operator Network [7.616136432212582]
Fiber reinforcement and polymer matrix respond differently to manufacturing conditions due to mismatch in coefficient of thermal expansion and matrix shrinkage during curing of thermosets. This study considers a unidirectional AS4 carbon fiber/amine bi-functional epoxy prepreg and models process-induced deformation (PID) using a two-mechanism framework.
arXiv Detail & Related papers (2025-12-15T03:04:45Z)
- Zero-Shot Cyclic Peptide Design via Composable Geometric Constraints [65.77915791312634]
We propose CP-Composer, a novel generative framework that enables zero-shot cyclic peptide generation. Our approach decomposes complex cyclization patterns into unit constraints, which are incorporated into a diffusion model. Our model, despite being trained on linear peptides, is capable of generating diverse target-binding cyclic peptides, reaching success rates from 38% to 84%.
arXiv Detail & Related papers (2025-07-06T03:30:45Z)
- CreoPep: A Universal Deep Learning Framework for Target-Specific Peptide Design and Optimization [19.795752582745397]
Target-specific peptides, such as conotoxins, exhibit exceptional binding affinity and selectivity toward ion channels and receptors. Here, we present CreoPep, a deep learning-based conditional generative framework that integrates masked language modeling with a progressive masking scheme to design high-affinity peptide mutants. We validate this approach by designing conotoxin inhibitors targeting the $\alpha$7 nicotinic acetylcholine receptor, achieving submicromolar potency in electrophysiological tests.
arXiv Detail & Related papers (2025-05-05T15:56:39Z)
- Life-Code: Central Dogma Modeling with Multi-Omics Sequence Unification [55.98854157265578]
Life-Code is a comprehensive framework that spans different biological functions. We propose a unified pipeline to integrate multi-omics data by reverse-transcribing RNA and reverse-translating amino acids into nucleotide-based sequences. Life-Code achieves state-of-the-art results on various tasks across three omics, highlighting its potential for advancing multi-omics analysis and interpretation.
arXiv Detail & Related papers (2025-02-11T06:53:59Z)
- Computational design of target-specific linear peptide binders with TransformerBeta [0.0]
We build an unprecedentedly large-scale library of peptide pairs within stable secondary structures (beta sheets).
We then developed a machine learning method based on the Transformer architecture for the design of specific linear binders.
arXiv Detail & Related papers (2024-10-07T08:52:54Z)
- Exploring Latent Space for Generating Peptide Analogs Using Protein Language Models [1.5146068448101742]
The proposed method requires only a single sequence of interest, avoiding the need for large datasets.
Our results show significant improvements over baseline models in similarity indicators of peptide structures, descriptors and bioactivities.
arXiv Detail & Related papers (2024-08-15T13:37:27Z)
- Molecule Design by Latent Prompt Transformer [76.2112075557233]
This work explores the challenging problem of molecule design by framing it as a conditional generative modeling task.
We propose a novel generative model comprising three components: (1) a latent vector with a learnable prior distribution; (2) a molecule generation model based on a causal Transformer, which uses the latent vector as a prompt; and (3) a property prediction model that predicts a molecule's target properties and/or constraint values using the latent prompt.
arXiv Detail & Related papers (2024-02-27T03:33:23Z)
- MorphGrower: A Synchronized Layer-by-layer Growing Approach for Plausible Neuronal Morphology Generation [38.87351909710185]
This paper proposes MorphGrower, which mimics the natural growth mechanism of neurons for generation.
MorphGrower generates morphologies layer by layer, with each subsequent layer conditioned on the previously generated structure.
Results on four real-world datasets demonstrate that MorphGrower outperforms MorphVAE by a notable margin.
arXiv Detail & Related papers (2024-01-17T09:03:14Z)
- Efficient Prediction of Peptide Self-assembly through Sequential and Graphical Encoding [57.89530563948755]
This work provides a benchmark analysis of peptide encoding with advanced deep learning models.
It serves as a guide for a wide range of peptide-related predictions such as isoelectric points, hydration free energy, etc.
arXiv Detail & Related papers (2023-07-17T00:43:33Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)