A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions
- URL: http://arxiv.org/abs/2404.09606v1
- Date: Mon, 15 Apr 2024 09:26:33 GMT
- Title: A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions
- Authors: Pengfei Liu, Jun Tao, Zhixiang Ren
- Abstract summary: We introduce a data-curated self-feedback knowledge elicitation approach.
We employ adaptive prompt learning to infuse the prior knowledge into the large language model.
This research offers a novel paradigm for knowledge elicitation in scientific research.
- Score: 24.80165173525286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of chemical reaction predictions (CRPs) plays a pivotal role in advancing drug discovery and material science. However, its effectiveness is constrained by the vast and uncertain chemical reaction space and challenges in capturing reaction selectivity, particularly due to existing methods' limitations in exploiting the data's inherent knowledge. To address these challenges, we introduce a data-curated self-feedback knowledge elicitation approach. This method starts from iterative optimization of molecular representations and facilitates the extraction of knowledge on chemical reaction types (RTs). Then, we employ adaptive prompt learning to infuse the prior knowledge into the large language model (LLM). As a result, we achieve significant enhancements: a 14.2% increase in retrosynthesis prediction accuracy, a 74.2% rise in reagent prediction accuracy, and an expansion in the model's capability for handling multi-task chemical reactions. This research offers a novel paradigm for knowledge elicitation in scientific research and showcases the untapped potential of LLMs in CRPs.
Related papers
- ReactAIvate: A Deep Learning Approach to Predicting Reaction Mechanisms and Unmasking Reactivity Hotspots [4.362338454684645]
We develop an interpretable attention-based GNN that achieved near-unity and 96% accuracy for reaction step classification.
Our model adeptly identifies key atom(s) even from out-of-distribution classes.
This generalizability allows new reaction types to be included in a modular fashion, which will be of value to experts for understanding the reactivity of new molecules.
arXiv Detail & Related papers (2024-07-14T05:53:18Z) - Retrosynthesis prediction enhanced by in-silico reaction data augmentation [66.5643280109899]
We present RetroWISE, a framework that employs a base model inferred from real paired data to perform in-silico reaction generation and augmentation.
On three benchmark datasets, RetroWISE achieves the best overall performance against state-of-the-art models.
arXiv Detail & Related papers (2024-01-31T07:40:37Z) - Holistic chemical evaluation reveals pitfalls in reaction prediction models [0.3065062372337749]
We propose a new assessment scheme that builds on current approaches, steering towards a more holistic evaluation.
CHORISO is a curated dataset, accompanied by multiple tailored splits, for recreating chemically relevant scenarios.
Our work paves the way towards robust prediction models that can ultimately accelerate chemical discovery.
arXiv Detail & Related papers (2023-12-14T14:54:28Z) - AI for Interpretable Chemistry: Predicting Radical Mechanistic Pathways via Contrastive Learning [45.379791270351184]
RMechRP is a new deep learning-based reaction predictor system.
We develop and train models using RMechDB, a public database of radical reactions.
Our results demonstrate the effectiveness of RMechRP in providing accurate and interpretable predictions.
arXiv Detail & Related papers (2023-11-02T09:47:27Z) - Towards out-of-distribution generalizable predictions of chemical kinetics properties [61.15970601264632]
Out-Of-Distribution (OOD) kinetic property prediction is required to be generalizable.
In this paper, we categorize OOD kinetic property prediction into three levels (structure, condition, and mechanism).
We create comprehensive datasets to benchmark the state-of-the-art ML approaches for reaction prediction in the OOD setting and the state-of-the-art graph OOD methods in kinetics property prediction problems.
arXiv Detail & Related papers (2023-10-04T20:36:41Z) - ReactIE: Enhancing Chemical Reaction Extraction with Weak Supervision [27.850325653751078]
Structured chemical reaction information plays a vital role for chemists engaged in laboratory work and advanced endeavors such as computer-aided drug design.
Despite the importance of extracting structured reactions from scientific literature, data annotation for this purpose is cost-prohibitive due to the significant labor required from domain experts.
We propose ReactIE, which combines two weakly supervised approaches for pre-training. Our method utilizes frequent patterns within the text as linguistic cues to identify specific characteristics of chemical reactions.
arXiv Detail & Related papers (2023-07-04T02:52:30Z) - A Unified View of Deep Learning for Reaction and Retrosynthesis Prediction: Current Status and Future Challenges [59.41636061300571]
Reaction and retrosynthesis prediction are fundamental tasks in computational chemistry.
Various deep learning approaches have been proposed to tackle these problems.
This paper is the first comprehensive and systematic survey that seeks to provide a unified understanding of reaction and retrosynthesis prediction.
arXiv Detail & Related papers (2023-06-28T03:15:55Z) - MetaRF: Differentiable Random Forest for Reaction Yield Prediction with a Few Trails [58.47364143304643]
In this paper, we focus on the reaction yield prediction problem.
We first put forth MetaRF, an attention-based differentiable random forest model specially designed for the few-shot yield prediction.
To improve the few-shot learning performance, we further introduce a dimension-reduction based sampling method.
arXiv Detail & Related papers (2022-08-22T06:40:13Z) - Improving Molecular Representation Learning with Metric Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm termed MROT to enhance their generalization capability for molecular regression problems.
MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
arXiv Detail & Related papers (2022-02-13T04:56:18Z) - Dataset Bias in the Natural Sciences: A Case Study in Chemical Reaction Prediction and Synthesis Design [0.8594140167290099]
We identify three trends within the fields of chemical reaction prediction and synthesis design that require a change in direction.
First, the manner in which reaction datasets are split into reactants and reagents encourages testing models in an unrealistically generous manner.
Second, we highlight the prevalence of mislabelled data, and suggest that the focus should be on outlier removal rather than data fitting only.
arXiv Detail & Related papers (2021-05-06T13:11:56Z) - Unassisted Noise Reduction of Chemical Reaction Data Sets [59.127921057012564]
We propose a machine learning-based, unassisted approach to remove chemically wrong entries from data sets.
Our results show an improved prediction quality for models trained on the cleaned and balanced data sets.
arXiv Detail & Related papers (2021-02-02T09:34:34Z)