PepEVOLVE: Position-Aware Dynamic Peptide Optimization via Group-Relative Advantage
- URL: http://arxiv.org/abs/2511.16912v1
- Date: Fri, 21 Nov 2025 02:51:15 GMT
- Title: PepEVOLVE: Position-Aware Dynamic Peptide Optimization via Group-Relative Advantage
- Authors: Trieu Nguyen, Hao-Wei Pang, Shasha Feng,
- Abstract summary: PepEVOLVE is a position-aware, dynamic framework that learns both where to edit and how to dynamically optimize peptides for multi-objective improvement. On a therapeutically motivated Rev-binding macrocycle benchmark, PepEVOLVE outperformed PepINVENT by reaching higher mean scores.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Macrocyclic peptides are an emerging modality that combines biologics-like affinity with small-molecule-like developability, but their vast combinatorial space and multi-parameter objectives make lead optimization slow and challenging. Prior generative approaches such as PepINVENT require chemists to pre-specify mutable positions for optimization, choices that are not always known a priori, and rely on static pretraining and optimization algorithms that limit the model's ability to generalize and effectively optimize peptide sequences. We introduce PepEVOLVE, a position-aware, dynamic framework that learns both where to edit and how to dynamically optimize peptides for multi-objective improvement. PepEVOLVE (i) augments pretraining with dynamic masking and CHUCKLES shifting to improve generalization, (ii) uses a context-free multi-armed bandit router that discovers high-reward residues, and (iii) couples a novel evolving optimization algorithm with group-relative advantage to stabilize reinforcement updates. During in silico evaluations, the router policy reliably learns and concentrates probability on chemically meaningful sites that influence the peptide's properties. On a therapeutically motivated Rev-binding macrocycle benchmark, PepEVOLVE outperformed PepINVENT by reaching higher mean scores (approximately 0.8 vs. 0.6), achieving best candidates with a score of 0.95 (vs. 0.87), and converging in fewer steps under the task of optimizing permeability and lipophilicity with structural constraints. Overall, PepEVOLVE offers a practical, reproducible path to peptide lead optimization when optimal edit sites are unknown, enabling more efficient exploration and improving design quality across multiple objectives.
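The abstract names two mechanisms, a context-free multi-armed bandit router and a group-relative advantage, without giving formulas. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. It assumes the common GRPO-style normalization (reward minus group mean, divided by group standard deviation) and uses a softmax gradient bandit as a stand-in for the router; `toy_score`, `route_positions`, and all hyperparameters are hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def group_relative_advantage(rewards, eps=1e-8):
    """Center and scale each reward by the mean and std of its own group.

    Assumption: GRPO-style z-scoring; the paper may define it differently.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def softmax(prefs):
    """Numerically stable softmax over a list of preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def route_positions(score_edit, n_positions, steps=200, group_size=8,
                    lr=0.1, seed=0):
    """Context-free gradient bandit over edit positions (hypothetical router).

    Each step samples a group of positions, scores one edit per position,
    and updates the position preferences with group-relative advantages.
    """
    rng = random.Random(seed)
    prefs = [0.0] * n_positions
    for _ in range(steps):
        probs = softmax(prefs)
        arms = rng.choices(range(n_positions), weights=probs, k=group_size)
        rewards = [score_edit(a, rng) for a in arms]
        advs = group_relative_advantage(rewards)
        for a, adv in zip(arms, advs):
            # Softmax policy gradient: d log pi(a) / d pref_j = 1[j == a] - pi(j)
            for j in range(n_positions):
                grad = (1.0 if j == a else 0.0) - probs[j]
                prefs[j] += lr * adv * grad / group_size
    return softmax(prefs)


def toy_score(position, rng):
    """Hypothetical reward: editing position 2 helps most, plus small noise."""
    base = 0.8 if position == 2 else 0.3
    return base + rng.gauss(0.0, 0.05)


if __name__ == "__main__":
    probs = route_positions(toy_score, n_positions=5)
    print([round(p, 3) for p in probs])
```

In this toy setting the router should shift probability toward the position whose edits score highest, which mirrors the abstract's claim that the router concentrates probability on influential sites. Because advantages are normalized within each sampled group, the update depends on how an edit compares with its peers rather than on the absolute reward scale.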
Related papers
- Soft Sequence Policy Optimization [0.0]
We introduce Soft Sequence Policy Optimization (SSPO) as an off-policy reinforcement learning objective. SSPO incorporates soft gating functions over token-level probability ratios within sequence-level importance weights. We show that SSPO improves training stability and performance in mathematical reasoning tasks.
arXiv Detail & Related papers (2026-02-22T20:21:00Z) - POP: Prior-fitted Optimizer Policies [20.784587787548436]
We introduce POP (Prior-fitted Optimizer Policies), a meta-learned model that predicts coordinate step-wise on contextual information. Our model is learned on millions of synthetic optimization problems sampled from both nonfitted objectives.
arXiv Detail & Related papers (2026-02-17T10:27:07Z) - GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization [133.27496265096445]
We show how to apply Group Relative Policy Optimization under multi-reward setting without examining its suitability. We then introduce Group reward-Decoupled Normalization Policy Optimization (GDPO), a new policy optimization method to resolve these issues. GDPO consistently outperforms GRPO, demonstrating its effectiveness and generalizability for multi-reward reinforcement learning optimization.
arXiv Detail & Related papers (2026-01-08T18:59:24Z) - Anchoring Values in Temporal and Group Dimensions for Flow Matching Model Alignment [61.80228667422234]
VGPO redefines value estimation across both temporal and group dimensions. It transforms the sparse terminal reward into dense, process-aware value estimates. It replaces standard group normalization with a novel process enhanced by absolute values to maintain a stable optimization signal.
arXiv Detail & Related papers (2025-12-13T16:31:26Z) - PepThink-R1: LLM for Interpretable Cyclic Peptide Optimization with CoT SFT and Reinforcement Learning [5.484132643431736]
PepThink-R1 is a generative framework that integrates large language models with chain-of-thought supervised fine-tuning and reinforcement learning. We demonstrate that PepThink-R1 generates cyclic peptides with significantly enhanced lipophilicity, stability, and exposure.
arXiv Detail & Related papers (2025-08-20T15:13:52Z) - DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization [53.27954325490941]
Finetuning a Large Language Model (LLM) is crucial for generating results towards specific objectives. This research introduces a novel reinforcement learning algorithm to finetune a drug optimization LLM-based generative model.
arXiv Detail & Related papers (2025-02-11T04:00:21Z) - ScaffoldGPT: A Scaffold-based GPT Model for Drug Optimization [3.240904428766923]
We introduce ScaffoldGPT, a Generative Pretrained Transformer (GPT) for drug optimization based on molecular scaffolds. A three-stage drug optimization approach integrates pretraining, finetuning, and decoding optimization. We demonstrate via a comprehensive evaluation on COVID and cancer benchmarks that ScaffoldGPT outperforms the competing baselines in drug optimization benchmarks.
arXiv Detail & Related papers (2025-02-09T10:36:33Z) - Accelerated Preference Optimization for Large Language Model Alignment [60.22606527763201]
Reinforcement Learning from Human Feedback (RLHF) has emerged as a pivotal tool for aligning large language models (LLMs) with human preferences.
Direct Preference Optimization (DPO) formulates RLHF as a policy optimization problem without explicitly estimating the reward function.
We propose a general Accelerated Preference Optimization (APO) framework, which unifies many existing preference optimization algorithms.
arXiv Detail & Related papers (2024-10-08T18:51:01Z) - LightCPPgen: An Explainable Machine Learning Pipeline for Rational Design of Cell Penetrating Peptides [0.32985979395737786]
We introduce an innovative approach for the de novo design of CPPs, leveraging the strengths of machine learning (ML) and optimization algorithms.
Our strategy, named LightCPPgen, integrates a LightGBM-based predictive model with a genetic algorithm (GA).
The GA solutions specifically target the candidate sequences' penetrability score, while trying to maximize similarity with the original non-penetrating peptide.
arXiv Detail & Related papers (2024-05-31T10:57:25Z) - Advancements in Optimization: Adaptive Differential Evolution with Diversification Strategy [0.0]
The study employs single-objective optimization in a two-dimensional space and runs ADEDS on each of the benchmark functions with multiple iterations.
ADEDS consistently outperforms standard DE for a variety of optimization challenges, including functions with numerous local optima, plate-shaped, valley-shaped, stretched-shaped, and noisy functions.
arXiv Detail & Related papers (2023-10-02T10:05:41Z) - Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers [109.52244418498974]
We propose a novel Admeta (A Double exponential Moving averagE to Adaptive and non-adaptive momentum) framework.
We provide two implementations, AdmetaR and AdmetaS, the former based on RAdam and the latter based on SGDM.
arXiv Detail & Related papers (2023-07-02T18:16:06Z) - Studying Evolutionary Solution Adaption Using a Flexibility Benchmark Based on a Metal Cutting Process [39.05320053926048]
We consider optimizing for different production requirements from the viewpoint of a bio-inspired framework for system flexibility. We study the flexibility of NSGA-II, which we extend by two variants: 1) varying goals, which optimize solutions for two tasks simultaneously to obtain in-between source solutions expected to be more adaptable, and 2) active-inactive genotype, which accommodates different possibilities that can be activated or deactivated.
arXiv Detail & Related papers (2023-05-31T12:07:50Z) - Accelerated Federated Learning with Decoupled Adaptive Optimization [53.230515878096426]
The federated learning (FL) framework enables clients to collaboratively learn a shared model while keeping the training data private on the clients.
Recently, many efforts have been made to generalize centralized adaptive optimization methods, such as SGDM, Adam, AdaGrad, etc., to federated settings.
This work aims to develop novel adaptive optimization methods for FL from the perspective of dynamics of ordinary differential equations (ODEs).
arXiv Detail & Related papers (2022-07-14T22:46:43Z) - Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and Stability [67.8426046908398]
Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world.
This paper presents MetaPG, an evolutionary method for automated design of actor-critic loss functions.
arXiv Detail & Related papers (2022-04-08T20:46:16Z) - EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z) - A Novel Meta-Heuristic Optimization Algorithm Inspired by the Spread of Viruses [0.0]
A novel nature-inspired meta-heuristic optimization algorithm called virus spread optimization (VSO) is proposed.
VSO loosely mimics the spread of viruses among hosts, and can be effectively applied to solving many challenging and continuous optimization problems.
arXiv Detail & Related papers (2020-06-11T09:35:28Z)