Molecular Property Prediction by Semantic-invariant Contrastive Learning
- URL: http://arxiv.org/abs/2303.06902v1
- Date: Mon, 13 Mar 2023 07:32:37 GMT
- Title: Molecular Property Prediction by Semantic-invariant Contrastive Learning
- Authors: Ziqiao Zhang, Ailin Xie, Jihong Guan, Shuigeng Zhou
- Abstract summary: We develop a Fragment-based Semantic-Invariant Contrastive Learning model based on this view generation method for molecular property prediction.
Using the fewest pre-training samples, FraSICL achieves state-of-the-art performance compared with major existing counterpart models.
- Score: 26.19431931932982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning has been widely used as a pretext task for
self-supervised pre-training of molecular representation learning models in
AI-aided drug design and discovery. However, existing methods that generate
molecular views through noise-adding operations for contrastive learning may
suffer from the semantic inconsistency problem, which leads to false positive
pairs and consequently poor prediction performance. To address this problem, in
this paper we first propose a semantic-invariant view generation method that
properly breaks molecular graphs into fragment pairs. Building on this view
generation method, we then develop a Fragment-based Semantic-Invariant
Contrastive Learning (FraSICL) model for molecular property prediction. The
FraSICL model consists of two branches that generate view representations for
contrastive learning; a multi-view fusion mechanism and an auxiliary similarity
loss are introduced to better exploit the information contained in different
fragment-pair views. Extensive experiments on various benchmark datasets show
that, with the fewest pre-training samples, FraSICL achieves state-of-the-art
performance compared with major existing counterpart models.
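The two ideas in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: `split_at_bond` is a hypothetical helper that breaks a toy molecular graph (an edge list) at a single acyclic bond to yield a fragment pair, and `nt_xent_loss` is a standard NT-Xent contrastive loss, one common choice for such pre-training objectives; FraSICL's actual fragmentation rules and loss terms may differ.

```python
import math
from collections import defaultdict

def split_at_bond(edges, u, v):
    """Break a molecular graph (edge list) at bond (u, v).

    Assumes (u, v) is an acyclic bond, so its removal yields exactly
    two connected components (the fragment pair). Returns the two
    fragments as vertex sets.
    """
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    adj[u].discard(v)
    adj[v].discard(u)

    def component(start):
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for nxt in adj[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        return seen

    return component(u), component(v)

def _cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)

def nt_xent_loss(views_a, views_b, tau=0.1):
    """NT-Xent loss over a batch of positive view pairs.

    views_a[i] and views_b[i] embed two views of the same molecule
    (a positive pair); all cross-batch combinations act as negatives.
    A semantically consistent view generator keeps positive pairs
    close in embedding space, which drives this loss down.
    """
    n = len(views_a)
    total = 0.0
    for i in range(n):
        pos = math.exp(_cosine(views_a[i], views_b[i]) / tau)
        denom = sum(math.exp(_cosine(views_a[i], views_b[j]) / tau)
                    for j in range(n))
        total += -math.log(pos / denom)
    return total / n
```

For a butane-like chain 0-1-2-3, breaking the central bond gives the fragment pair {0, 1} and {2, 3}; feeding well-aligned view embeddings to `nt_xent_loss` yields a lower loss than mismatched ones, which is the behavior a semantic-invariant view generator is meant to encourage.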
Related papers
- Revealing Multimodal Contrastive Representation Learning through Latent Partial Causal Models [85.67870425656368]
We introduce a unified causal model specifically designed for multimodal data.
We show that multimodal contrastive representation learning excels at identifying latent coupled variables.
Experiments demonstrate the robustness of our findings, even when the assumptions are violated.
arXiv Detail & Related papers (2024-02-09T07:18:06Z)
- Learning Invariant Molecular Representation in Latent Discrete Space [52.13724532622099]
We propose a new framework for learning molecular representations that exhibit invariance and robustness against distribution shifts.
Our model achieves stronger generalization against state-of-the-art baselines in the presence of various distribution shifts.
arXiv Detail & Related papers (2023-10-22T04:06:44Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Attribute Graphs Underlying Molecular Generative Models: Path to Learning with Limited Data [42.517927809224275]
We provide an algorithm that relies on perturbation experiments on latent codes of a pre-trained generative autoencoder to uncover an attribute graph.
We show that one can fit an effective graphical model that models a structural equation model between latent codes.
Using a pre-trained generative autoencoder trained on a large dataset of small molecules, we demonstrate that the graphical model can be used to predict a specific property.
arXiv Detail & Related papers (2022-07-14T19:20:30Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- Model-agnostic interpretation by visualization of feature perturbations [0.0]
We propose a model-agnostic interpretation approach that uses visualization of feature perturbations induced by the particle swarm optimization algorithm.
We validate our approach both qualitatively and quantitatively on publicly available datasets.
arXiv Detail & Related papers (2021-01-26T00:53:29Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Predicting Chemical Properties using Self-Attention Multi-task Learning based on SMILES Representation [0.0]
In this study, we explore the structural differences of transformer-variant models and propose a new self-attention-based model.
The representation learning performance of the self-attention module was evaluated in a multi-task learning environment using imbalanced chemical datasets.
arXiv Detail & Related papers (2020-10-19T09:46:50Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.