Data Transfer Approaches to Improve Seq-to-Seq Retrosynthesis
- URL: http://arxiv.org/abs/2010.00792v1
- Date: Fri, 2 Oct 2020 05:27:51 GMT
- Title: Data Transfer Approaches to Improve Seq-to-Seq Retrosynthesis
- Authors: Katsuhiko Ishiguro, Kazuya Ujihara, Ryohto Sawada, Hirotaka Akita, Masaaki Kotera
- Abstract summary: Retrosynthesis is the problem of inferring the reactant compounds needed to synthesize a given product compound through chemical reactions.
Recent studies on retrosynthesis focus on proposing more sophisticated prediction models.
The dataset fed to the models also plays an essential role in obtaining models that generalize best.
- Score: 1.6449390849183363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrosynthesis is the problem of inferring the reactant compounds needed to
synthesize a given product compound through chemical reactions. Recent studies on
retrosynthesis focus on proposing more sophisticated prediction models, but the dataset
fed to the models also plays an essential role in obtaining models that generalize best.
In general, the dataset best suited to a specific task tends to
be small. In such cases, the standard solution is to transfer knowledge
from a large or clean dataset in the same domain. In this paper, we conduct a
systematic and intensive examination of data transfer approaches for end-to-end
generative models, applied to retrosynthesis. Experimental results show
that typical data transfer methods can improve test prediction scores of an
off-the-shelf Transformer baseline model. In particular, the pre-training plus
fine-tuning approach boosts the accuracy scores of the baseline, achieving a
new state of the art. In addition, we conduct a manual inspection of the
erroneous predictions. The inspection shows that the pre-training plus
fine-tuning models generate chemically appropriate or sensible proposals in
almost all cases.