Syntax-Guided Procedural Synthesis of Molecules
- URL: http://arxiv.org/abs/2409.05873v1
- Date: Sat, 24 Aug 2024 04:32:36 GMT
- Title: Syntax-Guided Procedural Synthesis of Molecules
- Authors: Michael Sun, Alston Lo, Wenhao Gao, Minghao Guo, Veronika Thost, Jie Chen, Connor Coley, Wojciech Matusik
- Abstract summary: Designing synthetically accessible molecules and recommending analogs to unsynthesizable molecules are important problems for accelerating molecular discovery.
We reconceptualize both problems using ideas from program synthesis.
We decouple the syntactic skeleton from the semantics of a synthetic tree to create a bilevel framework for reasoning about the space of synthesis pathways.
- Score: 26.87587877386068
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Designing synthetically accessible molecules and recommending analogs to unsynthesizable molecules are important problems for accelerating molecular discovery. We reconceptualize both problems using ideas from program synthesis. Drawing inspiration from syntax-guided synthesis approaches, we decouple the syntactic skeleton from the semantics of a synthetic tree to create a bilevel framework for reasoning about the combinatorial space of synthesis pathways. Given a molecule we aim to generate analogs for, we iteratively refine its skeletal characteristics via Markov Chain Monte Carlo simulations over the space of syntactic skeletons. Given a black-box oracle to optimize, we formulate a joint design space over syntactic templates and molecular descriptors and introduce evolutionary algorithms that optimize both syntactic and semantic dimensions synergistically. Our key insight is that once the syntactic skeleton is set, we can amortize over the search complexity of deriving the program's semantics by training policies to fully utilize the fixed horizon Markov Decision Process imposed by the syntactic template. We demonstrate performance advantages of our bilevel framework for synthesizable analog generation and synthesizable molecule design. Notably, our approach offers the user explicit control over the resources required to perform synthesis and biases the design space towards simpler solutions, making it particularly promising for autonomous synthesis platforms.
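The abstract's idea of refining a skeleton by Markov Chain Monte Carlo can be illustrated with a minimal Metropolis sketch. This is not the authors' implementation: the integer-indexed skeleton space, the ring-neighbour proposal, the temperature, and `toy_score` are all hypothetical stand-ins chosen only to make the example runnable.

```python
import math
import random


def metropolis_over_skeletons(score, num_skeletons, steps=500,
                              temperature=0.5, seed=0):
    """Toy Metropolis search over a discrete set of skeleton indices.

    Assumptions (not from the paper): skeletons are indexed 0..num_skeletons-1,
    and a proposal moves to one of the two neighbouring indices on a ring,
    which makes the proposal distribution symmetric.
    """
    rng = random.Random(seed)
    current = rng.randrange(num_skeletons)
    best = current
    for _ in range(steps):
        # Symmetric proposal: step left or right on the ring.
        proposal = (current + rng.choice((-1, 1))) % num_skeletons
        delta = score(proposal) - score(current)
        # Metropolis rule: always accept improvements, otherwise accept
        # with probability exp(delta / temperature).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = proposal
        if score(current) > score(best):
            best = current
    return best


def toy_score(k):
    # Hypothetical stand-in for "similarity of the best analog under
    # skeleton k to the target molecule"; peaks at index 7.
    return -abs(k - 7)


best = metropolis_over_skeletons(toy_score, num_skeletons=20)
print(best, toy_score(best))
```

In a bilevel setup of the kind the abstract describes, evaluating `score` for a skeleton would itself involve filling in the skeleton's semantics (building blocks and reactions); here that inner step is collapsed into a single toy function.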
Related papers
- Generative Artificial Intelligence for Navigating Synthesizable Chemical Space [25.65907958071386]
We introduce SynFormer, a generative modeling framework designed to efficiently explore and navigate synthesizable chemical space.
By incorporating a scalable transformer architecture and a diffusion module for building block selection, SynFormer surpasses existing models in synthesizable molecular design.
arXiv Detail & Related papers (2024-10-04T15:09:05Z)
- SynthFormer: Equivariant Pharmacophore-based Generation of Molecules for Ligand-Based Drug Design [1.3927943269211591]
This paper addresses the gap between in silico generative approaches and practical in vitro methodologies.
We introduce SynthFormer, a novel ML model that utilizes a 3D equivariant encoder for pharmacophores to generate fully synthesizable molecules.
Our contributions include a new methodology for efficient chemical space exploration using 3D information, a novel architecture called SynthFormer for translating 3D pharmacophore representations into molecules, and a meaningful embedding space that organizes reagents for drug discovery optimization.
arXiv Detail & Related papers (2024-10-03T17:38:46Z)
- SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation [55.2480439325792]
We study the synthesis of six datasets, covering topic classification, sentiment analysis, tone detection, and humor.
We find that SynthesizRR greatly improves lexical and semantic diversity, similarity to human-written text, and distillation performance.
arXiv Detail & Related papers (2024-05-16T12:22:41Z)
- Genetic Algorithms for Searching a Matrix of Metagrammars for Synthesis [19.044613696320628]
Syntax-guided synthesis is a paradigm in which the search space of candidate solutions is constrained by a syntactic template in the form of a grammar.
In this work, we frame the space of syntactic templates as a matrix of rules, and demonstrate how this matrix can be searched effectively with little training data.
arXiv Detail & Related papers (2023-06-01T10:22:22Z)
- FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning [58.47265392465442]
Retrosynthetic planning aims to devise a complete multi-step synthetic route from starting materials to a target molecule.
Current strategies use a decoupled approach of single-step retrosynthesis models and search algorithms.
We propose a novel framework that utilizes context information for improved retrosynthetic planning.
arXiv Detail & Related papers (2022-09-30T08:44:58Z)
- ULSA: Unified Language of Synthesis Actions for Representation of Synthesis Protocols [2.436060325115753]
We propose the first Unified Language of Synthesis Actions (ULSA) for describing synthesis procedures.
We created a dataset of 3,040 synthesis procedures annotated by domain experts according to the proposed ULSA scheme.
arXiv Detail & Related papers (2022-01-23T17:44:48Z)
- Amortized Tree Generation for Bottom-up Synthesis Planning and Synthesizable Molecular Design [2.17167311150369]
We report an amortized approach to generate synthetic pathways as a Markov decision process conditioned on a target molecular embedding.
This approach allows us to conduct synthesis planning in a bottom-up manner and design synthesizable molecules by decoding from optimized conditional codes.
arXiv Detail & Related papers (2021-10-12T22:43:25Z)
- Representing Partial Programs with Blended Abstract Semantics [62.20775388513027]
We introduce a technique for representing partially written programs in a program synthesis engine.
We learn an approximate execution model implemented as a modular neural network.
We show that these hybrid neuro-symbolic representations enable execution-guided synthesizers to use more powerful language constructs.
arXiv Detail & Related papers (2020-12-23T20:40:18Z)
- RetroXpert: Decompose Retrosynthesis Prediction like a Chemist [60.463900712314754]
We devise a novel template-free algorithm for automatic retrosynthetic expansion.
Our method disassembles retrosynthesis into two steps.
While outperforming the state-of-the-art baselines, our model also provides chemically reasonable interpretation.
arXiv Detail & Related papers (2020-11-04T04:35:34Z)
- Learning Graph Models for Retrosynthesis Prediction [90.15523831087269]
Retrosynthesis prediction is a fundamental problem in organic synthesis.
This paper introduces a graph-based approach that capitalizes on the idea that the graph topology of precursor molecules is largely unaltered during a chemical reaction.
Our model achieves a top-1 accuracy of 53.7%, outperforming previous template-free and semi-template-based methods.
arXiv Detail & Related papers (2020-06-12T09:40:42Z)
- Retrosynthesis Prediction with Conditional Graph Logic Network [118.70437805407728]
Computer-aided retrosynthesis is finding renewed interest from both chemistry and computer science communities.
We propose a new approach to this task using the Conditional Graph Logic Network, a conditional graphical model built upon graph neural networks.
arXiv Detail & Related papers (2020-01-06T05:36:57Z)