Bridging the Gap between Chemical Reaction Pretraining and Conditional
Molecule Generation with a Unified Model
- URL: http://arxiv.org/abs/2303.06965v5
- Date: Thu, 7 Mar 2024 14:51:12 GMT
- Title: Bridging the Gap between Chemical Reaction Pretraining and Conditional
Molecule Generation with a Unified Model
- Authors: Bo Qiang, Yiran Zhou, Yuheng Ding, Ningfeng Liu, Song Song, Liangren
Zhang, Bo Huang, Zhenming Liu
- Abstract summary: We propose a unified framework that addresses both the reaction representation learning and molecule generation tasks.
Inspired by the organic chemistry mechanism, we develop a novel pretraining framework that enables us to incorporate inductive biases into the model.
Our framework achieves state-of-the-art results on challenging downstream tasks.
- Score: 3.3031562864527664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chemical reactions are the fundamental building blocks of drug design and
organic chemistry research. In recent years, there has been a growing need for
a large-scale deep-learning framework that can efficiently capture the basic
rules of chemical reactions. In this paper, we propose a unified
framework that addresses both the reaction representation learning and molecule
generation tasks, which allows for a more holistic approach. Inspired by the
organic chemistry mechanism, we develop a novel pretraining framework that
enables us to incorporate inductive biases into the model. Our framework
achieves state-of-the-art results on challenging downstream tasks. By
possessing chemical knowledge, our generative framework overcomes the
limitations of current molecule generation models that rely on a small number
of reaction templates. In the extensive experiments, our model generates
synthesizable drug-like structures of high quality. Overall, our work presents
a significant step toward a large-scale deep-learning framework for a variety
of reaction-based applications.
Related papers
- GraphXForm: Graph transformer for computer-aided molecular design with application to extraction [73.1842164721868]
We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds and then fine-tuned.
We evaluate it on two solvent design tasks for liquid-liquid extraction, showing that it outperforms four state-of-the-art molecular design techniques.
arXiv Detail & Related papers (2024-11-03T19:45:15Z)
- BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction [65.93303145891628]
BatGPT-Chem is a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction.
Our model captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions.
This development empowers chemists to adeptly address novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science.
arXiv Detail & Related papers (2024-08-19T05:17:40Z)
- UAlign: Pushing the Limit of Template-free Retrosynthesis Prediction with Unsupervised SMILES Alignment [51.49238426241974]
This paper introduces UAlign, a template-free graph-to-sequence pipeline for retrosynthesis prediction.
By combining graph neural networks and Transformers, our method can more effectively leverage the inherent graph structure of molecules.
arXiv Detail & Related papers (2024-03-25T03:23:03Z)
- Contextual Molecule Representation Learning from Chemical Reaction Knowledge [24.501564702095937]
We introduce REMO, a self-supervised learning framework that takes advantage of well-defined atom-combination rules in common chemistry.
REMO pre-trains graph/Transformer encoders on 1.7 million known chemical reactions in the literature.
arXiv Detail & Related papers (2024-02-21T12:58:40Z)
- Holistic chemical evaluation reveals pitfalls in reaction prediction models [0.3065062372337749]
We propose a new assessment scheme that builds on current approaches, steering towards a more holistic evaluation.
ChoRISO is a curated dataset along with multiple tailored splits to recreate chemically relevant scenarios.
Our work paves the way towards robust prediction models that can ultimately accelerate chemical discovery.
arXiv Detail & Related papers (2023-12-14T14:54:28Z)
- PrefixMol: Target- and Chemistry-aware Molecule Design via Prefix Embedding [34.27649279751879]
We develop a novel generative model that considers both the targeted pocket's circumstances and a variety of chemical properties.
Experiments show that our model exhibits good controllability in both single and multi-conditional molecular generation.
arXiv Detail & Related papers (2023-02-14T15:27:47Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- Deep Generative Models for Drug Design and Response [0.0]
Recent success of deep generative modeling holds promises of generation and optimization of new molecules.
We present commonly used chemical and biological databases, and tools for generative modeling.
arXiv Detail & Related papers (2021-09-14T06:33:56Z)
- Learning Graph Models for Retrosynthesis Prediction [90.15523831087269]
Retrosynthesis prediction is a fundamental problem in organic synthesis.
This paper introduces a graph-based approach that capitalizes on the idea that the graph topology of precursor molecules is largely unaltered during a chemical reaction.
Our model achieves a top-1 accuracy of 53.7%, outperforming previous template-free and semi-template-based methods.
arXiv Detail & Related papers (2020-06-12T09:40:42Z)
- Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design.
In this setup, the agent learns to navigate through the immense synthetically accessible chemical space.
We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z)