LLamol: A Dynamic Multi-Conditional Generative Transformer for De Novo
Molecular Design
- URL: http://arxiv.org/abs/2311.14407v1
- Date: Fri, 24 Nov 2023 10:59:12 GMT
- Title: LLamol: A Dynamic Multi-Conditional Generative Transformer for De Novo
Molecular Design
- Authors: Niklas Dobberstein, Astrid Maass, Jan Hamaekers
- Abstract summary: "LLamol" is a single novel generative transformer model based on the LLama 2 architecture.
We demonstrate that the model adeptly handles single- and multi-conditional organic molecule generation with up to four conditions.
In detail, we showcase the model's capability to utilize token sequences for conditioning, either individually or in combination with numerical properties.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Generative models have demonstrated substantial promise in Natural Language
Processing (NLP) and have found application in designing molecules, as seen in
General Pretrained Transformer (GPT) models. In our efforts to develop such a
tool for exploring the organic chemical space in search of potentially
electro-active compounds, we present "LLamol", a single novel generative
transformer model based on the LLama 2 architecture, which was trained on a
superset of 13 million organic compounds drawn from diverse public sources. To
allow for maximum flexibility in usage and robustness against potentially
incomplete data, we introduce "Stochastic Context Learning" as a new training
procedure.
We demonstrate that the resulting model adeptly handles single- and
multi-conditional organic molecule generation with up to four conditions, though
more are possible. The model generates valid molecular structures in SMILES
notation while flexibly incorporating up to three numerical properties and/or one
token sequence into the generative process, as requested. The generated compounds
are very satisfactory in all scenarios tested. In detail, we showcase the model's
capability to utilize token sequences for conditioning, either individually or
in combination with numerical properties, making LLamol a potent tool for de
novo molecule design, easily expandable with new properties.
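The abstract does not spell out how "Stochastic Context Learning" works internally, but its stated purpose is robustness to incomplete conditioning data. The sketch below illustrates one plausible reading: during training, each example keeps only a random subset of the available conditions (numerical properties and/or a token-sequence fragment) in its context, so the model learns to generate under any combination of conditions, including none. The property names, special tokens, and 50% keep-probability are illustrative assumptions, not details taken from the paper.

```python
import random

# Hedged sketch of a "Stochastic Context Learning"-style data pipeline:
# randomly drop conditions when building each training sequence so the model
# tolerates incomplete conditioning at inference time. Names and probabilities
# below are assumptions for illustration, not the authors' implementation.

NUMERICAL_CONDITIONS = ["logp", "sascore", "mol_weight"]  # assumed property names
TOKEN_CONDITION = "fragment_smiles"                       # assumed token-sequence condition


def build_training_context(example: dict) -> str:
    """Serialize a random subset of conditions into a prefix that is
    prepended to the target SMILES for next-token training."""
    parts = []

    # Keep each numerical property with probability 0.5 (illustrative choice).
    for name in NUMERICAL_CONDITIONS:
        if name in example and random.random() < 0.5:
            parts.append(f"<{name}={example[name]:.2f}>")

    # Optionally include the token-sequence condition (e.g. a core fragment).
    if TOKEN_CONDITION in example and random.random() < 0.5:
        parts.append(f"<frag>{example[TOKEN_CONDITION]}</frag>")

    # The target molecule always follows the (possibly empty) context.
    return "".join(parts) + "<smiles>" + example["smiles"]


if __name__ == "__main__":
    sample = {
        "smiles": "c1ccccc1O",  # phenol, toy example
        "logp": 1.39,
        "sascore": 1.0,
        "mol_weight": 94.11,
        "fragment_smiles": "c1ccccc1",
    }
    for _ in range(3):
        print(build_training_context(sample))
```

At sampling time the same prefix format would carry whichever conditions the user actually supplies, which is consistent with the abstract's claim that the model accepts any combination of up to three numerical properties and one token sequence.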
Related papers
- Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model that randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
arXiv Detail & Related papers (2024-11-03T01:56:15Z)
- Generative Modeling of Molecular Dynamics Trajectories [12.255021091552441]
We introduce generative modeling of molecular trajectories as a paradigm for learning flexible multi-task surrogate models of MD from data.
We show such generative models can be adapted to diverse tasks such as forward simulation, transition path sampling, and trajectory upsampling.
arXiv Detail & Related papers (2024-09-26T13:02:28Z)
- Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model [49.64512917330373]
We introduce TSMMG, a multi-constraint molecular generation large language model that plays the role of a student.
To train TSMMG, we construct a large set of text-molecule pairs by extracting molecular knowledge from a collection of 'teacher' models and tools.
We experimentally show that TSMMG performs remarkably well in generating molecules that meet complex property requirements described in natural language.
arXiv Detail & Related papers (2024-03-20T02:15:55Z)
- A novel molecule generative model of VAE combined with Transformer for unseen structure generation [0.0]
Transformer and VAE are widely used as powerful models, but they are rarely used in combination due to structural and performance mismatch.
This study proposes a model that combines these two models through structural and parameter optimization in handling diverse molecules.
The proposed model shows performance comparable to existing models in generating molecules, and far superior performance in generating molecules with unseen structures.
arXiv Detail & Related papers (2024-02-19T08:46:04Z)
- An Equivariant Generative Framework for Molecular Graph-Structure Co-Design [54.92529253182004]
We present MolCode, a machine learning-based generative framework for molecular graph-structure co-design.
In MolCode, 3D geometric information empowers the molecular 2D graph generation, which in turn helps guide the prediction of molecular 3D structure.
Our investigation reveals that the 2D topology and 3D geometry contain intrinsically complementary information in molecule design.
arXiv Detail & Related papers (2023-04-12T13:34:22Z)
- PrefixMol: Target- and Chemistry-aware Molecule Design via Prefix Embedding [34.27649279751879]
We develop a novel generative model that considers both the targeted pocket's circumstances and a variety of chemical properties.
Experiments show that our model exhibits good controllability in both single and multi-conditional molecular generation.
arXiv Detail & Related papers (2023-02-14T15:27:47Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations [21.237758981760784]
We present a SMILES Variational Auto-Encoder (VAE) augmented with topological data analysis (TDA) representations of molecules.
Our experiments show that this TDA augmentation enables a SMILES VAE to capture the complex relation between 3D geometry and electronic properties.
arXiv Detail & Related papers (2021-06-08T15:49:21Z)
- Learning Neural Generative Dynamics for Molecular Conformation Generation [89.03173504444415]
We study how to generate molecule conformations (i.e., 3D structures) from a molecular graph.
We propose a novel probabilistic framework to generate valid and diverse conformations given a molecular graph.
arXiv Detail & Related papers (2021-02-20T03:17:58Z)
- Learning Latent Space Energy-Based Prior Model for Molecule Generation [59.875533935578375]
We learn a latent space energy-based prior model with a SMILES representation for molecule modeling.
Our method is able to generate molecules with validity and uniqueness competitive with state-of-the-art models.
arXiv Detail & Related papers (2020-10-19T09:34:20Z)