Related papers: Improving Targeted Molecule Generation through Language Model Fine-Tuning Via Reinforcement Learning

URL: http://arxiv.org/abs/2405.06836v1
Date: Fri, 10 May 2024 22:19:12 GMT
Title: Improving Targeted Molecule Generation through Language Model Fine-Tuning Via Reinforcement Learning
Authors: Salma J. Ahmed, Mustafa A. Elattar
Abstract summary: We introduce an innovative de-novo drug design strategy, which harnesses the capabilities of language models to devise targeted drugs for specific proteins. Our method integrates a composite reward function, combining considerations of drug-target interaction and molecular validity.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Developing new drugs is laborious and costly, demanding extensive time investment. In this study, we introduce an innovative de-novo drug design strategy, which harnesses the capabilities of language models to devise targeted drugs for specific proteins. Employing a Reinforcement Learning (RL) framework utilizing Proximal Policy Optimization (PPO), we refine the model to acquire a policy for generating drugs tailored to protein targets. Our method integrates a composite reward function, combining considerations of drug-target interaction and molecular validity. Following RL fine-tuning, our approach demonstrates promising outcomes, yielding notable improvements in molecular validity, interaction efficacy, and critical chemical properties, achieving 65.37 for Quantitative Estimation of Drug-likeness (QED), 321.55 for Molecular Weight (MW), and 4.47 for Octanol-Water Partition Coefficient (logP). Furthermore, out of the generated drugs, only 0.041% do not exhibit novelty.
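Illustration (not from the paper): the abstract mentions a composite reward that combines drug-target interaction with molecular validity but does not give its form. The sketch below assumes a simple weighted sum and a zero reward for invalid molecules. The function names, the `validity_weight` parameter, and both toy stand-ins are hypothetical; in practice a cheminformatics toolkit and a trained interaction predictor would take their place.

```python
from typing import Callable


def composite_reward(
    smiles: str,
    protein: str,
    is_valid: Callable[[str], bool],
    predict_interaction: Callable[[str, str], float],
    validity_weight: float = 0.5,
) -> float:
    """Assumed form: weighted sum of a validity term and an interaction term."""
    # Invalid molecules get no reward at all (an assumption, not the paper's rule).
    if not is_valid(smiles):
        return 0.0
    # Clamp the interaction score to [0, 1] so the two terms are comparable.
    interaction = min(max(predict_interaction(smiles, protein), 0.0), 1.0)
    return validity_weight * 1.0 + (1.0 - validity_weight) * interaction


def toy_is_valid(smiles: str) -> bool:
    """Toy stand-in for a real validity check: non-empty, balanced parentheses."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return bool(smiles) and depth == 0


def toy_interaction(smiles: str, protein: str) -> float:
    """Toy stand-in for a drug-target interaction predictor (constant score)."""
    return 0.8


if __name__ == "__main__":
    print(composite_reward("CC(=O)O", "MKT", toy_is_valid, toy_interaction))
    print(composite_reward("CC(", "MKT", toy_is_valid, toy_interaction))
```

In an RL fine-tuning loop, a scalar like this would be computed for each generated molecule and passed to the policy-gradient update as its reward.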

Related papers

De Novo Molecular Design Enabled by Direct Preference Optimization and Curriculum Learning [0.0]
De novo molecular design has extensive applications in drug discovery and materials science. The vast chemical space renders direct molecular searches computationally prohibitive, while traditional experimental screening is both time- and labor-intensive. Direct Preference Optimization (DPO) from NLP uses molecular score-based sample pairs to maximize the likelihood difference between high- and low-quality molecules.
arXiv Detail & Related papers (2025-04-02T06:00:21Z)
DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization [53.27954325490941]
Finetuning a Large Language Model (LLM) is crucial for generating results towards specific objectives. This research introduces a novel reinforcement learning algorithm to finetune a drug optimization LLM-based generative model.
arXiv Detail & Related papers (2025-02-11T04:00:21Z)
DrugGen: Advancing Drug Discovery with Large Language Models and Reinforcement Learning Feedback [0.0]
DrugGen is an enhanced model based on the DrugGPT structure. It is fine-tuned on approved drug-target interactions and optimized with proximal policy optimization. By producing high-quality small molecules, DrugGen provides a high-performance medium for advancing pharmaceutical research and drug discovery.
arXiv Detail & Related papers (2024-11-20T01:21:07Z)
Fragment-Masked Molecular Optimization [37.20936761888007]
We propose a fragment-masked molecular optimization method based on phenotypic drug discovery (PDD). PDD-based molecular optimization can reduce potential safety risks while optimizing phenotypic activity, thereby increasing the likelihood of clinical success. The overall experiments demonstrate that the in-silico optimization success rate reaches 94.4%, with an average efficacy increase of 5.3%.
arXiv Detail & Related papers (2024-08-17T06:00:58Z)
Decomposed Direct Preference Optimization for Structure-Based Drug Design [47.561983733291804]
We propose DecompDPO, a structure-based optimization method to align diffusion models with pharmaceutical needs. DecompDPO can be effectively used for two main purposes: fine-tuning pretrained diffusion models for molecule generation across various protein families, and molecular optimization given a specific protein subpocket after generation.
arXiv Detail & Related papers (2024-07-19T02:12:25Z)
Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
AliDiff is a novel framework to align pretrained target diffusion models with preferred functional properties. It can generate molecules with state-of-the-art binding energies with up to -7.07 Avg. Vina Score.
arXiv Detail & Related papers (2024-07-01T06:10:29Z)
Latent Chemical Space Searching for Plug-in Multi-objective Molecule Generation [9.442146563809953]
We develop a versatile 'plug-in' molecular generation model that incorporates objectives related to target affinity, drug-likeness, and synthesizability. We identify PSO-ENP as the optimal variant for multi-objective molecular generation and optimization.
arXiv Detail & Related papers (2024-04-10T02:37:24Z)
Mol-AIR: Molecular Reinforcement Learning with Adaptive Intrinsic Rewards for Goal-directed Molecular Generation [0.0]
Mol-AIR is a reinforcement learning-based framework using adaptive intrinsic rewards for goal-directed molecular generation. In benchmark tests, Mol-AIR demonstrates superior performance over existing approaches in generating molecules with desired properties.
arXiv Detail & Related papers (2024-03-29T10:44:51Z)
Tailoring Molecules for Protein Pockets: a Transformer-based Generative Solution for Structured-based Drug Design [133.1268990638971]
De novo drug design based on the structure of a target protein can provide novel drug candidates. We present a generative solution named TamGent that can directly generate candidate drugs from scratch for a given target.
arXiv Detail & Related papers (2022-08-30T09:32:39Z)
Widely Used and Fast De Novo Drug Design by a Protein Sequence-Based Reinforcement Learning Model [4.815696666006742]
A structure-based de novo method can overcome the data scarcity of active ligands by incorporating drug-target interaction into deep generative architectures. Here, we demonstrate a widely used and fast protein sequence-based reinforcement learning model for drug discovery. As a proof of concept, the RL model was utilized to design molecules for four targets.
arXiv Detail & Related papers (2022-08-14T10:41:52Z)
Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data. Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z)
Improved Drug-target Interaction Prediction with Intermolecular Graph Transformer [98.8319016075089]
We propose a novel approach to model intermolecular information with a three-way Transformer-based architecture. Intermolecular Graph Transformer (IGT) outperforms state-of-the-art approaches by 9.1% and 20.5% over the second best for binding activity and binding pose prediction respectively. IGT exhibits promising drug screening ability against SARS-CoV-2 by identifying 83.1% active drugs that have been validated by wet-lab experiments with near-native predicted binding poses.
arXiv Detail & Related papers (2021-10-14T13:28:02Z)
CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models [74.58583689523999]
We propose an end-to-end framework, named CogMol, for designing new drug-like small molecules targeting novel viral proteins. CogMol combines adaptive pre-training of a molecular SMILES Variational Autoencoder (VAE) and an efficient multi-attribute controlled sampling scheme. CogMol handles multi-constraint design of synthesizable, low-toxic, drug-like molecules with high target specificity and selectivity.
arXiv Detail & Related papers (2020-04-02T18:17:20Z)