Semi-Supervised Junction Tree Variational Autoencoder for Molecular
Property Prediction
- URL: http://arxiv.org/abs/2208.05119v1
- Date: Wed, 10 Aug 2022 03:06:58 GMT
- Title: Semi-Supervised Junction Tree Variational Autoencoder for Molecular
Property Prediction
- Authors: Tongzhou Shen
- Abstract summary: This research modifies a state-of-the-art molecule generation method, the Junction Tree Variational Autoencoder (JT-VAE), to facilitate semi-supervised learning for chemical property prediction.
We leverage the JT-VAE architecture to learn an interpretable representation well suited to tasks ranging from molecular property prediction to conditional molecule generation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in machine learning have enabled accurate prediction of
chemical properties. However, supervised machine learning methods in this
domain often suffer from label scarcity, owing to the expense of labeling
chemical properties experimentally. This research modifies a state-of-the-art
molecule generation method, the Junction Tree Variational Autoencoder
(JT-VAE), to facilitate semi-supervised learning for chemical property
prediction. Furthermore, through this partial supervision we constrain
selected latent variables to take on consistent, interpretable meanings, such
as representing toxicity. Using a partially labelled dataset, we leverage the
JT-VAE architecture to learn an interpretable representation well suited to
tasks ranging from molecular property prediction to conditional molecule
generation.
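A minimal sketch of the partial-supervision idea: a standard VAE objective plus a regression term that ties one latent coordinate to a property label whenever a label is available. This is illustrative PyTorch, not the paper's JT-VAE code; SUPERVISED_DIM, beta, and gamma are hypothetical names and knobs.

    # Illustrative semi-supervised VAE loss: ELBO plus a supervision term that
    # anchors one latent coordinate (e.g., toxicity) on labelled molecules.
    import torch
    import torch.nn.functional as F

    SUPERVISED_DIM = 0  # hypothetical index of the property-carrying latent

    def semi_supervised_vae_loss(recon, target, mu, logvar,
                                 label=None, beta=1.0, gamma=1.0):
        recon_loss = F.mse_loss(recon, target)
        # KL divergence between q(z|x) = N(mu, diag(exp(logvar))) and N(0, I)
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        loss = recon_loss + beta * kl
        if label is not None:  # labelled batch: supervise the chosen latent
            loss = loss + gamma * F.mse_loss(mu[:, SUPERVISED_DIM], label)
        return loss

Unlabelled molecules contribute only the ELBO terms, which is what lets the partially labelled dataset be used in full.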
Related papers
- Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model that randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique compels the model to better infer molecular structures and properties, enhancing its predictive capabilities.
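The masking idea can be sketched in a few lines; the span length, mask rate, and [MASK] token below are illustrative choices, and character-level tokens stand in for a real SMILES tokenizer.

    # Illustrative random-span masking over SMILES tokens (character-level
    # tokens used for simplicity; real tokenizers keep e.g. "Cl" intact).
    import random

    MASK = "[MASK]"  # hypothetical mask token

    def mask_smiles(tokens, span_len=3, mask_prob=0.15, rng=random):
        out = list(tokens)
        i = 0
        while i < len(out):
            if rng.random() < mask_prob:      # start a masked span here
                for j in range(i, min(i + span_len, len(out))):
                    out[j] = MASK
                i += span_len
            else:
                i += 1
        return out

    print(mask_smiles(list("CC(=O)Oc1ccccc1C(=O)O")))  # aspirin SMILES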
arXiv Detail & Related papers (2024-11-03T01:56:15Z)
- Stacked ensemble-based mutagenicity prediction model using multiple modalities with graph attention network [0.9736758288065405]
Mutagenicity is a concern due to its association with genetic mutations, which can result in a variety of negative consequences.
In this work, we introduce a novel stacked-ensemble-based mutagenicity prediction model.
arXiv Detail & Related papers (2024-09-03T09:14:21Z)
- Molecule Design by Latent Prompt Transformer [76.2112075557233]
This work explores the challenging problem of molecule design by framing it as a conditional generative modeling task.
We propose a novel generative model comprising three components: (1) a latent vector with a learnable prior distribution; (2) a molecule generation model based on a causal Transformer, which uses the latent vector as a prompt; and (3) a property prediction model that predicts a molecule's target properties and/or constraint values using the latent prompt.
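The three components map naturally onto a small PyTorch module; the sketch below is a structural outline under assumed shapes and names (LatentPromptModel, dimensions, layer counts), not the authors' implementation.

    # Structural outline of a latent-prompt generator: (1) learnable prior,
    # (2) causal Transformer prompted by the latent, (3) property head.
    import torch
    import torch.nn as nn

    class LatentPromptModel(nn.Module):
        def __init__(self, vocab=100, d=128, latent=16):
            super().__init__()
            self.prior_mean = nn.Parameter(torch.zeros(latent))  # (1) learnable
            self.to_prompt = nn.Linear(latent, d)  # latent vector -> prompt token
            self.embed = nn.Embedding(vocab, d)
            layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
            self.backbone = nn.TransformerEncoder(layer, num_layers=2)  # (2)
            self.lm_head = nn.Linear(d, vocab)
            self.property_head = nn.Linear(latent, 1)  # (3) property predictor

        def forward(self, z, tokens):
            h = torch.cat([self.to_prompt(z).unsqueeze(1),  # prompt goes first
                           self.embed(tokens)], dim=1)
            T = h.size(1)
            causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
            h = self.backbone(h, mask=causal)      # causal self-attention
            # position i predicts token i (the prompt predicts the first token)
            return self.lm_head(h[:, :-1]), self.property_head(z)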
arXiv Detail & Related papers (2024-02-27T03:33:23Z)
- Towards out-of-distribution generalizable predictions of chemical kinetics properties [61.15970601264632]
Kinetic property prediction needs to generalize to Out-Of-Distribution (OOD) inputs.
In this paper, we categorize OOD kinetic property prediction into three levels (structure, condition, and mechanism).
We create comprehensive datasets to benchmark state-of-the-art ML approaches for reaction prediction in the OOD setting, and state-of-the-art graph OOD methods on kinetic property prediction problems.
arXiv Detail & Related papers (2023-10-04T20:36:41Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
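As a concrete example of one of those few-shot heads, here is a generic Prototypical Network step over precomputed embeddings; the embeddings themselves would come from whatever representation is being evaluated.

    # Generic Prototypical Network prediction: average each class's support
    # embeddings into a prototype, then assign queries to the nearest one.
    import torch

    def proto_predict(support, support_labels, query, n_classes):
        protos = torch.stack([support[support_labels == c].mean(0)
                              for c in range(n_classes)])
        dists = torch.cdist(query, protos)   # (n_query, n_classes) distances
        return dists.argmin(dim=1)           # nearest prototype wins

    # toy usage with random 8-dimensional "molecular embeddings"
    emb, labels = torch.randn(10, 8), torch.tensor([0] * 5 + [1] * 5)
    print(proto_predict(emb, labels, torch.randn(3, 8), n_classes=2))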
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Graph neural networks for the prediction of molecular structure-property relationships [59.11160990637615]
Graph neural networks (GNNs) are a machine learning method that operates directly on the molecular graph.
GNNs make it possible to learn properties in an end-to-end fashion, avoiding the need for informative descriptors.
We describe the fundamentals of GNNs and demonstrate their application via two examples of molecular property prediction.
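To make the end-to-end idea concrete, a single message-passing layer can be written in a few lines; this is a bare-bones sketch (mean aggregation, one linear update), not a full GNN library.

    # Bare-bones message-passing layer: average neighbour features, then
    # update each node from [own features, aggregated neighbour features].
    import torch
    import torch.nn as nn

    class MessagePassingLayer(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.update = nn.Linear(2 * dim, dim)

        def forward(self, x, edge_index):
            src, dst = edge_index                      # shape (2, n_edges)
            agg = torch.zeros_like(x)
            agg.index_add_(0, dst, x[src])             # sum incoming messages
            deg = torch.zeros(x.size(0)).index_add_(
                0, dst, torch.ones(dst.size(0)))
            agg = agg / deg.clamp(min=1).unsqueeze(1)  # mean over neighbours
            return torch.relu(self.update(torch.cat([x, agg], dim=1)))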
arXiv Detail & Related papers (2022-07-25T11:30:44Z)
- Automatic Identification of Chemical Moieties [11.50343898633327]
We introduce a method to automatically identify chemical moieties from atomic representations using message-passing neural networks.
The versatility of our approach is demonstrated by enabling the selection of representative entries in chemical databases.
arXiv Detail & Related papers (2022-03-30T10:58:23Z)
- Improving Molecular Representation Learning with Metric Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm, termed MROT, to enhance the generalization capability of molecular representations for regression problems.
MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
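The optimal-transport machinery underneath such methods is typically entropy-regularised OT; below is a textbook Sinkhorn sketch, not MROT itself.

    # Textbook Sinkhorn iterations: entropy-regularised optimal transport
    # between two uniform distributions, given a pairwise cost matrix.
    import numpy as np

    def sinkhorn(cost, eps=0.1, n_iter=200):
        n, m = cost.shape
        K = np.exp(-cost / eps)                 # Gibbs kernel
        a, b = np.ones(n) / n, np.ones(m) / m   # uniform marginals
        v = np.ones(m)
        for _ in range(n_iter):
            u = a / (K @ v)
            v = b / (K.T @ u)
        return u[:, None] * K * v[None, :]      # transport plan

    plan = sinkhorn(np.random.rand(5, 7))
    print(plan.sum(axis=1))                     # approx. 1/5 per row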
arXiv Detail & Related papers (2022-02-13T04:56:18Z)
- Improving VAE based molecular representations for compound property prediction [0.0]
We propose a simple method to improve the chemical property prediction performance of machine learning models.
We show a relation between the performance of property prediction models and the distance between the property prediction dataset and the larger unlabeled dataset.
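One simple proxy for that dataset distance is the gap between embedding centroids; the sketch below is an illustrative choice of metric, not necessarily the paper's.

    # Illustrative dataset distance: Euclidean gap between the mean
    # embeddings of the labelled set and the larger unlabelled set.
    import numpy as np

    def dataset_distance(labelled_emb, unlabelled_emb):
        return float(np.linalg.norm(labelled_emb.mean(axis=0)
                                    - unlabelled_emb.mean(axis=0)))

    print(dataset_distance(np.random.randn(100, 32),
                           np.random.randn(5000, 32)))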
arXiv Detail & Related papers (2022-01-13T12:57:11Z)
- Do Large Scale Molecular Language Representations Capture Important Structural Information? [31.76876206167457]
We present molecular embeddings obtained by training an efficient transformer encoder model, referred to as MoLFormer.
Experiments show that the learned molecular representation performs competitively, when compared to graph-based and fingerprint-based supervised learning baselines.
arXiv Detail & Related papers (2021-06-17T14:33:55Z)
- MEG: Generating Molecular Counterfactual Explanations for Deep Graph Networks [11.291571222801027]
We present a novel approach to the explainability of deep graph networks in the context of molecular property prediction tasks.
We generate informative counterfactual explanations for a specific prediction under the form of (valid) compounds with high structural similarity and different predicted properties.
We discuss results showing how the model can provide non-ML experts with key insights into what the learned model focuses on in the neighbourhood of a molecule.
arXiv Detail & Related papers (2021-04-16T12:17:19Z)
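Stripped of molecule-specific machinery, the counterfactual recipe summarised above is a constrained search: among candidate neighbours, keep those whose prediction flips and return the most similar one. The predict and similarity arguments below are stand-ins for a trained model and a structural similarity measure.

    # Schematic counterfactual search: most similar candidate whose
    # predicted label differs from the original molecule's.
    def counterfactual(mol, candidates, predict, similarity):
        base = predict(mol)
        flipped = [c for c in candidates if predict(c) != base]
        return max(flipped, key=lambda c: similarity(mol, c), default=None)

    # toy usage over integers standing in for molecules
    print(counterfactual(10, range(20),
                         predict=lambda m: m > 12,          # stand-in model
                         similarity=lambda a, b: -abs(a - b)))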