DiffSyn: A Generative Diffusion Approach to Materials Synthesis Planning
- URL: http://arxiv.org/abs/2509.17094v2
- Date: Thu, 25 Sep 2025 01:06:52 GMT
- Title: DiffSyn: A Generative Diffusion Approach to Materials Synthesis Planning
- Authors: Elton Pan, Soonhyoung Kwon, Sulin Liu, Mingrou Xie, Alexander J. Hoffman, Yifei Duan, Thorben Prein, Killian Sheriff, Yuriy Roman-Leshkov, Manuel Moliner, Rafael Gomez-Bombarelli, Elsa Olivetti
- Abstract summary: We propose DiffSyn, a generative diffusion model trained on over 23,000 synthesis recipes spanning 50 years of literature. DiffSyn generates probable synthesis routes conditioned on a desired zeolite structure and an organic template. As a proof of concept, we synthesize a UFI material using DiffSyn-generated synthesis routes.
- Score: 31.640096031422896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The synthesis of crystalline materials, such as zeolites, remains a significant challenge due to a high-dimensional synthesis space, intricate structure-synthesis relationships and time-consuming experiments. Considering the one-to-many relationship between structure and synthesis, we propose DiffSyn, a generative diffusion model trained on over 23,000 synthesis recipes spanning 50 years of literature. DiffSyn generates probable synthesis routes conditioned on a desired zeolite structure and an organic template. DiffSyn achieves state-of-the-art performance by capturing the multi-modal nature of structure-synthesis relationships. We apply DiffSyn to differentiate among competing phases and generate optimal synthesis routes. As a proof of concept, we synthesize a UFI material using DiffSyn-generated synthesis routes. These routes, rationalized by density functional theory binding energies, resulted in the successful synthesis of a UFI material with a high Si/Al$_{\text{ICP}}$ of 19.0, which is expected to improve thermal stability and is higher than that of any previously recorded.
Related papers
- Synthelite: Chemist-aligned and feasibility-aware synthesis planning with LLMs [3.7129661557601854]
We introduce Synthelite, a synthesis planning framework that uses large language models to propose retrosynthetic transformations. Synthelite can generate end-to-end synthesis routes by harnessing the intrinsic chemical knowledge and reasoning capabilities of LLMs. Our experiments demonstrate that Synthelite can flexibly adapt its planning trajectory to diverse user-specified constraints, achieving up to 95% success rates.
arXiv Detail & Related papers (2025-12-18T11:24:30Z) - LeMat-Synth: a multi-modal toolbox to curate broad synthesis procedure databases from scientific literature [60.879220305044726]
We propose a multi-modal toolbox that employs large language models (LLMs) and vision language models (VLMs) to automatically extract and organize synthesis procedures and performance data. We curated 81k open-access papers, yielding LeMat-Synth (v 1.0): a dataset containing synthesis procedures spanning 35 synthesis methods and 16 material classes. We release a modular, open-source library designed to support community-driven extension to new corpora and synthesis domains.
arXiv Detail & Related papers (2025-10-28T17:58:18Z) - Rethinking Molecule Synthesizability with Chain-of-Reaction [47.744071119775676]
We introduce ReaSyn, a generative framework for synthesizable projection. We propose a novel perspective that views synthetic pathways akin to reasoning paths in large language models (LLMs). With the CoR notation, ReaSyn can get dense supervision in every reaction step to explicitly learn chemical reaction rules.
arXiv Detail & Related papers (2025-09-19T15:29:57Z) - Synthesizable by Design: A Retrosynthesis-Guided Framework for Molecular Analog Generation [0.5852077003870417]
We introduce SynTwins, a novel retrosynthesis-guided molecular analog design framework. In comparative evaluations, SynTwins demonstrates superior performance in generating synthetically accessible analogs. Our benchmarking across diverse molecular datasets demonstrates that SynTwins effectively bridges the gap between computational design and experimental synthesis.
arXiv Detail & Related papers (2025-07-03T16:14:57Z) - Autonomous nanoparticle synthesis by design [32.63291717930695]
We introduce an autonomous approach explicitly targeting synthesis of atomic-scale structures. Our method autonomously designs synthesis protocols by matching real-time experimental total scattering (TS) and pair distribution function (PDF) data. We demonstrate this capability at a synchrotron, successfully synthesising two structurally distinct gold NPs.
arXiv Detail & Related papers (2025-05-19T13:19:30Z) - Scaling Laws of Synthetic Data for Language Models [132.67350443447611]
We introduce SynthLLM, a scalable framework that transforms pre-training corpora into diverse, high-quality synthetic datasets. Our approach achieves this by automatically extracting and recombining high-level concepts across multiple documents using a graph algorithm.
arXiv Detail & Related papers (2025-03-25T11:07:12Z) - Procedural Synthesis of Synthesizable Molecules [22.905205379063148]
Design of synthetically accessible molecules and recommending analogs to unsynthesizable molecules are important problems for accelerating molecular discovery. We reconceptualize both problems using ideas from program synthesis. We create a bilevel framework for reasoning about the space of synthesis pathways.
arXiv Detail & Related papers (2024-08-24T04:32:36Z) - SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation [55.2480439325792]
We study the synthesis of six datasets, covering topic classification, sentiment analysis, tone detection, and humor.
We find that SynthesizRR greatly improves lexical and semantic diversity, similarity to human-written text, and distillation performance.
arXiv Detail & Related papers (2024-05-16T12:22:41Z) - Extracting Structured Seed-Mediated Gold Nanorod Growth Procedures from Literature with GPT-3 [52.59930033705221]
We present a dataset of 11,644 entities extracted from 1,137 papers, resulting in 268 papers with at least one complete seed-mediated gold nanorod growth procedure and outcome for a total of 332 complete procedures.
arXiv Detail & Related papers (2023-04-26T22:21:33Z) - FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning [58.47265392465442]
Retrosynthetic planning aims to devise a complete multi-step synthetic route from starting materials to a target molecule.
Current strategies use a decoupled approach of single-step retrosynthesis models and search algorithms.
We propose a novel framework that utilizes context information for improved retrosynthetic planning.
arXiv Detail & Related papers (2022-09-30T08:44:58Z) - ULSA: Unified Language of Synthesis Actions for Representation of Synthesis Protocols [2.436060325115753]
We propose the first Unified Language of Synthesis Actions (ULSA) for describing synthesis procedures.
We created a dataset of 3,040 synthesis procedures annotated by domain experts according to the proposed ULSA scheme.
arXiv Detail & Related papers (2022-01-23T17:44:48Z) - Program Synthesis as Dependency Quantified Formula Modulo Theory [21.817030743512568]
This paper investigates the feasibility of synthesis techniques without grammar.
We show that $\mathbb{T}$-constrained synthesis can be reduced to DQF($\mathbb{T}$), i.e., to the problem of finding a witness of a Dependency Quantified Formula Modulo Theory.
arXiv Detail & Related papers (2021-05-19T16:05:20Z) - Predictive Synthesis of Quantum Materials by Probabilistic Reinforcement Learning [1.4680035572775534]
We use reinforcement learning to predict optimal synthesis schedules for a prototypical quantum material, semiconducting monolayer MoS$_2$.
The model can be extended to predict profiles for synthesis of complex structures including multi-phase heterostructures.
arXiv Detail & Related papers (2020-09-14T20:50:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.