Difficulty in chirality recognition for Transformer architectures
learning chemical structures from string
- URL: http://arxiv.org/abs/2303.11593v4
- Date: Sun, 14 Jan 2024 00:18:44 GMT
- Title: Difficulty in chirality recognition for Transformer architectures
learning chemical structures from string
- Authors: Yasuhiro Yoshikai, Tadahaya Mizuno, Shumpei Nemoto, Hiroyuki Kusuhara
- Abstract summary: We investigate the relationship between the learning progress of SMILES and chemical structure using a representative NLP model, the Transformer.
We show that while the Transformer learns partial structures of molecules quickly, it requires extended training to understand overall structures.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have seen rapid development of descriptor generation based on
representation learning of extremely diverse molecules, especially those that
apply natural language processing (NLP) models to SMILES, a literal
representation of molecular structure. However, little research has been done
on how these models understand chemical structure. To address this black box,
we investigated the relationship between the learning progress of SMILES and
chemical structure using a representative NLP model, the Transformer. We show
that while the Transformer learns partial structures of molecules quickly, it
requires extended training to understand overall structures. Consistently, the
accuracy of molecular property predictions using descriptors generated from
models at different learning steps was similar from the beginning to the end of
training. Furthermore, we found that the Transformer requires particularly long
training to learn chirality and sometimes stagnates with low performance due to
misunderstanding of enantiomers. These findings are expected to deepen the
understanding of NLP models in chemistry.
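To make the chirality point concrete, the following minimal sketch shows how little two enantiomers differ when written as SMILES. It assumes a simplified regex tokenizer written only for this illustration; the names `tokenize`, `differing_tokens`, and `strip_chirality` are hypothetical and are not taken from the paper, whose actual tokenization scheme is not described here. The two alanine strings follow the standard SMILES convention in which `@` and `@@` mark opposite tetrahedral configurations.

```python
import re

# Simplified SMILES tokenizer (an assumption for illustration only,
# not the tokenizer used by the paper): bracket atoms are kept whole,
# two-letter halogens are matched before single letters.
TOKEN_PATTERN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+\\/().:~*$]"
)


def tokenize(smiles):
    """Split a SMILES string into tokens, failing loudly on unknown characters."""
    tokens = TOKEN_PATTERN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize: {smiles!r}")
    return tokens


def differing_tokens(smiles_a, smiles_b):
    """Return (index, token_a, token_b) for each position where the tokens differ."""
    tokens_a, tokens_b = tokenize(smiles_a), tokenize(smiles_b)
    if len(tokens_a) != len(tokens_b):
        raise ValueError("token sequences have different lengths")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(tokens_a, tokens_b))
        if a != b
    ]


def strip_chirality(smiles):
    """Drop the tetrahedral chirality marks (@ and @@) from a SMILES string."""
    return smiles.replace("@", "")


# The two enantiomers of alanine, written with the same atom order.
L_ALANINE = "N[C@@H](C)C(=O)O"
D_ALANINE = "N[C@H](C)C(=O)O"

if __name__ == "__main__":
    print(tokenize(L_ALANINE))
    print(differing_tokens(L_ALANINE, D_ALANINE))
    print(strip_chirality(L_ALANINE) == strip_chirality(D_ALANINE))
```

Under this tokenization the two strings share every token except one, and they become identical once the chirality marks are removed. That is consistent with the abstract's observation that a model can capture partial structures early while still confusing enantiomers.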
Related papers
- Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model [50.756644656847165]
We introduce a multi-constraint molecular generation large language model, TSMMG, akin to a student.
To train TSMMG, we construct a large set of text-molecule pairs by extracting molecular knowledge from these 'teachers'.
We experimentally show that TSMMG performs remarkably well in generating molecules meeting complex, natural language-described property requirements.
arXiv Detail & Related papers (2024-03-20T02:15:55Z)
- Empirical Evidence for the Fragment level Understanding on Drug Molecular Structure of LLMs [16.508471997999496]
We investigate whether and how language models understand the chemical spatial structure from 1D sequences.
The results indicate that language models can understand chemical structures from the perspective of molecular fragments.
arXiv Detail & Related papers (2024-01-15T12:53:58Z)
- From molecules to scaffolds to functional groups: building context-dependent molecular representation via multi-channel learning [10.025809630976065]
This paper introduces a novel pre-training framework that learns robust and generalizable chemical knowledge.
Our approach demonstrates competitive performance across various molecular property benchmarks.
arXiv Detail & Related papers (2023-11-05T23:47:52Z)
- Towards Predicting Equilibrium Distributions for Molecular Systems with Deep Learning [60.02391969049972]
We introduce a novel deep learning framework, called Distributional Graphormer (DiG), in an attempt to predict the equilibrium distribution of molecular systems.
DiG employs deep neural networks to transform a simple distribution towards the equilibrium distribution, conditioned on a descriptor of a molecular system.
arXiv Detail & Related papers (2023-06-08T17:12:08Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning [77.31492888819935]
We propose a novel paradigm of "pre-train, prompt, fine-tune" for molecular representation learning, named molecule continuous prompt tuning (MolCPT).
MolCPT defines a motif prompting function that uses the pre-trained model to project the standalone input into an expressive prompt.
Experiments on several benchmark datasets show that MolCPT efficiently generalizes pre-trained GNNs for molecular property prediction.
arXiv Detail & Related papers (2022-12-20T19:32:30Z)
- Infusing Linguistic Knowledge of SMILES into Chemical Language Models [0.3655021726150368]
We grammatically parsed SMILES to obtain connectivity between substructures and their type, which is called the grammatical knowledge of SMILES.
Our representation model outperformed previous compound representations for the prediction of molecular properties.
arXiv Detail & Related papers (2022-04-20T01:25:18Z)
- Geometric Transformer for End-to-End Molecule Properties Prediction [92.28929858529679]
We introduce a Transformer-based architecture for molecule property prediction, which is able to capture the geometry of the molecule.
We modify the classical positional encoder by an initial encoding of the molecule geometry, as well as a learned gated self-attention mechanism.
arXiv Detail & Related papers (2021-10-26T14:14:40Z)
- GeoT: A Geometry-aware Transformer for Reliable Molecular Property Prediction and Chemically Interpretable Representation Learning [16.484048833163282]
We introduce a novel Transformer-based framework for molecular representation learning, named the Geometry-aware Transformer (GeoT).
GeoT learns molecular graph structures through attention-based mechanisms specifically designed to offer reliable interpretability, as well as molecular property prediction.
Our comprehensive experiments, including an empirical simulation, reveal that GeoT effectively learns the chemical insights into molecular structures, bridging the gap between artificial intelligence and molecular sciences.
arXiv Detail & Related papers (2021-06-29T15:47:18Z)
- Do Large Scale Molecular Language Representations Capture Important Structural Information? [31.76876206167457]
We present molecular embeddings obtained by training an efficient transformer encoder model, referred to as MoLFormer.
Experiments show that the learned molecular representation performs competitively, when compared to graph-based and fingerprint-based supervised learning baselines.
arXiv Detail & Related papers (2021-06-17T14:33:55Z)
- Learning Latent Space Energy-Based Prior Model for Molecule Generation [59.875533935578375]
We learn a latent space energy-based prior model with SMILES representation for molecule modeling.
Our method is able to generate molecules with validity and uniqueness competitive with state-of-the-art models.
arXiv Detail & Related papers (2020-10-19T09:34:20Z)