A Genetic Algorithm for Navigating Synthesizable Molecular Spaces
- URL: http://arxiv.org/abs/2509.20719v1
- Date: Thu, 25 Sep 2025 03:33:30 GMT
- Title: A Genetic Algorithm for Navigating Synthesizable Molecular Spaces
- Authors: Alston Lo, Connor W. Coley, Wojciech Matusik
- Abstract summary: We present SynGA, a simple genetic algorithm that operates directly over synthesis routes. By modifying the fitness function, we demonstrate the effectiveness of SynGA on a variety of design tasks.
- Score: 34.11059107816963
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inspired by the effectiveness of genetic algorithms and the importance of synthesizability in molecular design, we present SynGA, a simple genetic algorithm that operates directly over synthesis routes. Our method features custom crossover and mutation operators that explicitly constrain it to synthesizable molecular space. By modifying the fitness function, we demonstrate the effectiveness of SynGA on a variety of design tasks, including synthesizable analog search and sample-efficient property optimization, for both 2D and 3D objectives. Furthermore, by coupling SynGA with a machine learning-based filter that focuses the building block set, we boost SynGA to state-of-the-art performance. For property optimization, this manifests as a model-based variant, SynGBO, which employs SynGA and block filtering in the inner loop of Bayesian optimization. Since SynGA is lightweight and enforces synthesizability by construction, our hope is that SynGA can serve not only as a strong standalone baseline but also as a versatile module that can be incorporated into larger synthesis-aware workflows in the future.
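The abstract gives no code, but the core loop it describes, a GA whose crossover and mutation act on synthesis routes rather than on raw molecules, is compact enough to sketch. The following is a minimal, hypothetical Python sketch under stated assumptions: the Route encoding, the BUILDING_BLOCKS and REACTIONS placeholders, and the fitness function are all illustrative stand-ins, not SynGA's actual operators or building-block catalog.

```python
# A minimal, hypothetical sketch of a genetic algorithm over synthesis routes,
# in the spirit of the SynGA description above. The Route encoding, the block
# and reaction placeholders, and the fitness function are illustrative
# assumptions, not the paper's actual implementation.
import random
from dataclasses import dataclass

BUILDING_BLOCKS = ["BB1", "BB2", "BB3", "BB4", "BB5"]  # stand-ins for purchasable building blocks
REACTIONS = ["amide_coupling", "suzuki", "reductive_amination"]  # stand-ins for reaction templates


@dataclass
class Route:
    blocks: list      # building blocks consumed by the route
    reactions: list   # reaction templates applied, in order


def random_route(n_steps: int = 2) -> Route:
    return Route(
        blocks=random.sample(BUILDING_BLOCKS, n_steps + 1),
        reactions=random.choices(REACTIONS, k=n_steps),
    )


def crossover(a: Route, b: Route) -> Route:
    # Exchange route fragments between parents. Because children are assembled
    # only from valid blocks and templates, this toy operator mirrors the
    # "synthesizable by construction" property claimed in the abstract.
    cut = random.randint(1, min(len(a.reactions), len(b.reactions)))
    return Route(blocks=a.blocks[:cut] + b.blocks[cut:],
                 reactions=a.reactions[:cut] + b.reactions[cut:])


def mutate(r: Route, rate: float = 0.3) -> Route:
    # Resample individual building blocks while keeping the reaction skeleton.
    blocks = [random.choice(BUILDING_BLOCKS) if random.random() < rate else bb
              for bb in r.blocks]
    return Route(blocks=blocks, reactions=list(r.reactions))


def fitness(r: Route) -> float:
    # Placeholder objective in [0, 1). Swapping this function is what would
    # retarget the search, e.g., to analog similarity or a 3D docking score.
    return sum(hash((bb, "prop")) % 100 for bb in r.blocks) / (100 * len(r.blocks))


def evolve(pop_size: int = 50, generations: int = 30, elite: int = 5) -> Route:
    pop = [random_route() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:elite]
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - elite)]
        pop = parents + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best.blocks, best.reactions, round(fitness(best), 3))
```

For the SynGBO variant, a loop like evolve() would run as the inner loop of Bayesian optimization, with a learned filter shrinking BUILDING_BLOCKS between rounds; that outer machinery is omitted from this sketch.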
Related papers
- Subtractive Modulative Network with Learnable Periodic Activations [59.89799070130572]
We propose a novel, parameter-efficient Implicit Neural Representation architecture inspired by classical subtractive synthesis. Our SMN achieves a PSNR of $40+$ dB on two image datasets, comparing favorably against state-of-the-art methods in terms of both reconstruction accuracy and parameter efficiency.
arXiv Detail & Related papers (2026-02-18T10:20:50Z)
- Synthesizable by Design: A Retrosynthesis-Guided Framework for Molecular Analog Generation [0.5852077003870417]
We introduce SynTwins, a novel retrosynthesis-guided molecular analog design framework. In comparative evaluations, SynTwins demonstrates superior performance in generating synthetically accessible analogs. Our benchmarking across diverse molecular datasets shows that SynTwins effectively bridges the gap between computational design and experimental synthesis.
arXiv Detail & Related papers (2025-07-03T16:14:57Z)
- Scaling Laws of Synthetic Data for Language Models [132.67350443447611]
We introduce SynthLLM, a scalable framework that transforms pre-training corpora into diverse, high-quality synthetic datasets. Our approach achieves this by automatically extracting and recombining high-level concepts across multiple documents using a graph algorithm.
arXiv Detail & Related papers (2025-03-25T11:07:12Z)
- STAR: Synthesis of Tailored Architectures [61.080157488857516]
We propose a new approach for the synthesis of tailored architectures (STAR). Our approach combines a novel search space based on the theory of linear input-varying systems, supporting a hierarchical numerical encoding into architecture genomes. STAR genomes are automatically refined and recombined with gradient-free evolutionary algorithms to optimize for multiple model quality and efficiency metrics. Using STAR, we optimize large populations of new architectures, leveraging diverse computational units and interconnection patterns, improving over highly-optimized Transformers and striped hybrid models on the frontier of quality, parameter size, and inference cache for autoregressive language modeling.
arXiv Detail & Related papers (2024-11-26T18:42:42Z)
- Syno: Structured Synthesis for Neural Operators [1.5826646053411249]
We develop an end-to-end framework, Syno, to realize practical neural operator synthesis. We demonstrate that Syno discovers better operators with average speedups of $1.37\times$ to $2.06\times$ on various hardware and compiler choices.
arXiv Detail & Related papers (2024-10-31T09:00:24Z)
- It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design [0.4037357056611557]
Constrained synthesizability is an unaddressed challenge in generative molecular design.
We propose a novel reward function called TANimoto Group Overlap (TANGO).
TANGO transforms a sparse reward function into a dense and learnable reward function, which is crucial for reinforcement learning.
arXiv Detail & Related papers (2024-10-15T11:59:51Z)
- Procedural Synthesis of Synthesizable Molecules [22.905205379063148]
Designing synthetically accessible molecules and recommending analogs to unsynthesizable molecules are important problems for accelerating molecular discovery. We reconceptualize both problems using ideas from program synthesis. We create a bilevel framework for reasoning about the space of synthesis pathways.
arXiv Detail & Related papers (2024-08-24T04:32:36Z)
- SynthAI: A Multi Agent Generative AI Framework for Automated Modular HLS Design Generation [0.0]
We introduce SynthAI, a new method for the automated creation of High-Level Synthesis (HLS) designs.
SynthAI integrates ReAct agents, Chain-of-Thought (CoT) prompting, web search technologies, and the Retrieval-Augmented Generation framework.
arXiv Detail & Related papers (2024-05-25T05:45:55Z)
- SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation [55.2480439325792]
We study the synthesis of six datasets, covering topic classification, sentiment analysis, tone detection, and humor.
We find that SynthesizRR greatly improves lexical and semantic diversity, similarity to human-written text, and distillation performance.
arXiv Detail & Related papers (2024-05-16T12:22:41Z)
- Synthesizer: Rethinking Self-Attention in Transformer Models [93.08171885200922]
Dot product self-attention is central and indispensable to state-of-the-art Transformer models.
This paper investigates the true importance and contribution of the dot product-based self-attention mechanism to the performance of Transformer models.
arXiv Detail & Related papers (2020-05-02T08:16:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.