Valid Property-Enhanced Contrastive Learning for Targeted Optimization & Resampling for Novel Drug Design
- URL: http://arxiv.org/abs/2509.00684v1
- Date: Sun, 31 Aug 2025 03:55:29 GMT
- Title: Valid Property-Enhanced Contrastive Learning for Targeted Optimization & Resampling for Novel Drug Design
- Authors: Amartya Banerjee, Somnath Kar, Anirban Pal, Debabrata Maiti
- Abstract summary: VECTOR+ is a framework that couples property-guided representation learning with controllable molecule generation. VECTOR+ generates novel, synthetically tractable candidates. VECTOR+ generalizes to kinase inhibitors, producing compounds with stronger docking scores than established drugs.
- Score: 1.4874449172133888
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficiently steering generative models toward pharmacologically relevant regions of chemical space remains a major obstacle in molecular drug discovery under low-data regimes. We present VECTOR+: Valid-property-Enhanced Contrastive Learning for Targeted Optimization and Resampling, a framework that couples property-guided representation learning with controllable molecule generation. VECTOR+ applies to both regression and classification tasks and enables interpretable, data-efficient exploration of functional chemical space. We evaluate on two datasets: a curated PD-L1 inhibitor set (296 compounds with experimental $IC_{50}$ values) and a receptor kinase inhibitor set (2,056 molecules by binding mode). Despite limited training data, VECTOR+ generates novel, synthetically tractable candidates. Against PD-L1 (PDB 5J89), 100 of 8,374 generated molecules surpass a docking threshold of $-15.0$ kcal/mol, with the best scoring $-17.6$ kcal/mol compared to the top reference inhibitor ($-15.4$ kcal/mol). The best-performing molecules retain the conserved biphenyl pharmacophore while introducing novel motifs. Molecular dynamics (250 ns) confirm binding stability (ligand RMSD < $2.5$ angstroms). VECTOR+ generalizes to kinase inhibitors, producing compounds with stronger docking scores than established drugs such as brigatinib and sorafenib. Benchmarking against JT-VAE and MolGPT across docking, novelty, uniqueness, and Tanimoto similarity highlights the superior performance of our method. These results position our work as a robust, extensible approach for property-conditioned molecular design in low-data settings, bridging contrastive learning and generative modeling for reproducible, AI-accelerated discovery.
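The abstract benchmarks generated molecules on novelty, uniqueness, and Tanimoto similarity without defining them. The following is a minimal sketch of one common set of definitions, not the authors' evaluation code. It assumes molecules are already canonicalized strings and that fingerprints are given as sets of on-bit indices; the function names and toy data are hypothetical, and a real pipeline would typically derive fingerprints with a cheminformatics toolkit.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def uniqueness(generated: Iterable[str]) -> float:
    """Fraction of generated strings that are distinct."""
    gen = list(generated)
    return len(set(gen)) / len(gen) if gen else 0.0


def novelty(generated: Iterable[str], training: Iterable[str]) -> float:
    """Fraction of distinct generated strings absent from the training set."""
    gen = set(generated)
    return len(gen - set(training)) / len(gen) if gen else 0.0


# Toy data: strings stand in for canonical SMILES (assumed pre-canonicalized).
train = ["CCO", "c1ccccc1"]
gen = ["CCO", "CCN", "CCN", "c1ccccc1C"]

print(uniqueness(gen))                     # 3 distinct out of 4 generated
print(novelty(gen, train))                 # 2 of 3 distinct strings are unseen
print(tanimoto({1, 4, 7, 9}, {1, 4, 8}))   # 2 shared bits / 5 total bits
```

Definitions vary between papers (for example, whether novelty is computed over all generated molecules or only the distinct valid ones), so reported numbers are comparable only when the same convention is used.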
Related papers
- Fine-Tuning ChemBERTa for Predicting Inhibitory Activity Against TDP1 Using Deep Learning [0.0]
Predicting the potency of small molecules against Tyrosyl-DNA Phosphodiesterase 1 (TDP1) is a critical challenge in early drug discovery. We present a deep learning framework for the quantitative regression of pIC50 values using fine-tuned variants of ChemBERTa. Our approach outperforms classical baselines such as Random Predictor in both regression accuracy and virtual screening utility.
arXiv Detail & Related papers (2025-12-03T20:42:22Z) - ExMolRL: Phenotype-Target Joint Generation of De Novo Molecules via Multi-Objective Reinforcement Learning [4.998189068886174]
ExMolRL is a novel generative framework that integrates phenotypic and target-specific cues for de novo molecular generation. It fuses docking affinity and drug-likeness scores, augmented with ranking loss, prior-likelihood regularization, and entropy. Extensive experiments demonstrate ExMolRL's superior performance over state-of-the-art phenotype-based and target-based models.
arXiv Detail & Related papers (2025-09-25T11:13:24Z) - BAPULM: Binding Affinity Prediction using Language Models [7.136205674624813]
We introduce BAPULM, an innovative sequence-based framework that leverages the chemical latent representations of proteins via ProtT5-XL-U50 and of ligands through MolFormer.
Our approach was validated extensively on benchmark datasets, achieving sequential scoring power (R) values of 0.925 $\pm$ 0.043, 0.914 $\pm$ 0.004, and 0.8132 $\pm$ 0.001 on benchmark1k2101, Test2016_290, and CSAR-HiQ_36, respectively.
arXiv Detail & Related papers (2024-11-06T04:35:30Z) - Manifold-Constrained Nucleus-Level Denoising Diffusion Model for Structure-Based Drug Design [81.95343363178662]
Atoms must maintain a minimum pairwise distance to avoid separation violations.
NucleusDiff models the interactions between atomic nuclei and their surrounding electron clouds by enforcing the distance constraint.
It reduces the violation rate by up to 100% and enhances binding affinity by up to 22.16%, surpassing state-of-the-art models for structure-based drug design.
arXiv Detail & Related papers (2024-09-16T08:42:46Z) - Regressor-free Molecule Generation to Support Drug Response Prediction [83.25894107956735]
Conditional generation based on the target IC50 score can obtain a more effective sampling space.
Regressor-free guidance combines a diffusion model's score estimation with a regression controller model's gradient based on numerical labels.
arXiv Detail & Related papers (2024-05-23T13:22:17Z) - Drug Repurposing Targeting COVID-19 3CL Protease using Molecular Docking and Machine Learning Regression Approach [0.15346678870160887]
The COVID-19 pandemic has initiated a global health emergency, with an exigent need for an effective cure.
We screened 5903 approved drugs for inhibition of the main protease 3CL of SARS-CoV-2.
We employed several machine learning regression approaches for QSAR modeling to identify potential drugs with high binding affinities.
arXiv Detail & Related papers (2023-05-25T05:34:39Z) - LIMO: Latent Inceptionism for Targeted Molecule Generation [14.391216237573369]
We present Latent Inceptionism on Molecules (LIMO), which significantly accelerates molecule generation with an inceptionism-like technique.
Comprehensive experiments show that LIMO performs competitively on benchmark tasks.
One of our generated drug-like compounds has a predicted $K_D$ of $6 \cdot 10^{-14}$ M against the human estrogen receptor.
arXiv Detail & Related papers (2022-06-17T21:05:58Z) - Improved Drug-target Interaction Prediction with Intermolecular Graph Transformer [98.8319016075089]
We propose a novel approach to model intermolecular information with a three-way Transformer-based architecture.
Intermolecular Graph Transformer (IGT) outperforms state-of-the-art approaches by 9.1% and 20.5% over the second best for binding activity and binding pose prediction respectively.
IGT exhibits promising drug screening ability against SARS-CoV-2, identifying 83.1% of the active drugs validated by wet-lab experiments, with near-native predicted binding poses.
arXiv Detail & Related papers (2021-10-14T13:28:02Z) - Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19 [11.853524110656991]
Design of new drug compounds with target properties is a key area of research in generative modeling.
We present a small drug molecule design pipeline based on graph-generative models and a comparison study of two state-of-the-art graph generative models for designing COVID-19 targeted drug candidates.
arXiv Detail & Related papers (2021-02-09T17:49:26Z) - Optimizing Molecules using Efficient Queries from Property Evaluations [66.66290256377376]
We propose QMO, a generic query-based molecule optimization framework.
QMO improves the desired properties of an input molecule based on efficient queries.
We show that QMO outperforms existing methods in the benchmark tasks of optimizing small organic molecules.
arXiv Detail & Related papers (2020-11-03T18:51:18Z) - Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics [109.70543391923344]
CLaSS (Controlled Latent attribute Space Sampling) is an efficient computational method for attribute-controlled generation of molecules.
We screen the generated molecules for additional key attributes by using deep learning classifiers in conjunction with novel features derived from atomistic simulations.
The proposed approach is demonstrated for designing non-toxic antimicrobial peptides (AMPs) with strong broad-spectrum potency.
arXiv Detail & Related papers (2020-05-22T15:57:58Z) - CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models [74.58583689523999]
We propose an end-to-end framework, named CogMol, for designing new drug-like small molecules targeting novel viral proteins.
CogMol combines adaptive pre-training of a molecular SMILES Variational Autoencoder (VAE) and an efficient multi-attribute controlled sampling scheme.
CogMol handles multi-constraint design of synthesizable, low-toxic, drug-like molecules with high target specificity and selectivity.
arXiv Detail & Related papers (2020-04-02T18:17:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.