Evaluating Self-Supervised Learning for Molecular Graph Embeddings
- URL: http://arxiv.org/abs/2206.08005v3
- Date: Wed, 18 Oct 2023 08:51:16 GMT
- Title: Evaluating Self-Supervised Learning for Molecular Graph Embeddings
- Authors: Hanchen Wang, Jean Kaddour, Shengchao Liu, Jian Tang, Joan Lasenby, Qi Liu
- Abstract summary: Graph Self-Supervised Learning (GSSL) provides a robust pathway for acquiring embeddings without expert labelling.
"MOLGRAPHEVAL" generates detailed profiles of molecular graph embeddings with interpretable and diversified attributes.
- Score: 38.65102126919387
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph Self-Supervised Learning (GSSL) provides a robust pathway for acquiring
embeddings without expert labelling, a capability that carries profound
implications for molecular graphs due to the staggering number of potential
molecules and the high cost of obtaining labels. However, GSSL methods are
designed not for optimisation within a specific domain but rather for
transferability across a variety of downstream tasks. This broad applicability
complicates their evaluation. Addressing this challenge, we present "Molecular
Graph Representation Evaluation" (MOLGRAPHEVAL), generating detailed profiles
of molecular graph embeddings with interpretable and diversified attributes.
MOLGRAPHEVAL offers a suite of probing tasks grouped into three categories: (i)
generic graph, (ii) molecular substructure, and (iii) embedding space
properties. By leveraging MOLGRAPHEVAL to benchmark existing GSSL methods
against both current downstream datasets and our suite of tasks, we uncover
significant inconsistencies between inferences drawn solely from existing
datasets and those derived from more nuanced probing. These findings suggest
that current evaluation methodologies fail to capture the entirety of the
landscape.
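As a rough illustration of what a probing task involves, the sketch below fits a linear probe on frozen embeddings and measures how well a graph property can be read out of them. It is not the MOLGRAPHEVAL implementation: the embeddings and properties are synthetic stand-ins, and `fit_linear_probe` and `probe_r2` are hypothetical helper names introduced here.

```python
import numpy as np


def fit_linear_probe(train_emb, train_y, l2=1e-3):
    """Fit a ridge-regression probe on frozen embeddings (closed form)."""
    # Append a constant column so the probe can learn a bias term.
    X = np.hstack([train_emb, np.ones((train_emb.shape[0], 1))])
    reg = l2 * np.eye(X.shape[1])
    reg[-1, -1] = 0.0  # leave the bias term unpenalised
    return np.linalg.solve(X.T @ X + reg, X.T @ train_y)


def probe_r2(weights, emb, y):
    """Coefficient of determination of the probe on held-out data."""
    X = np.hstack([emb, np.ones((emb.shape[0], 1))])
    pred = X @ weights
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-ins (assumption): 500 "molecules" with 16-dim embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(500, 16))

# A property that is linearly encoded in the embedding, plus small noise.
encoded_prop = emb @ rng.normal(size=16) + 0.1 * rng.normal(size=500)
# A property that is unrelated to the embedding.
unrelated_prop = rng.normal(size=500)

# Train the probe on the first 400 samples, evaluate on the remaining 100.
w_enc = fit_linear_probe(emb[:400], encoded_prop[:400])
r2_encoded = probe_r2(w_enc, emb[400:], encoded_prop[400:])

w_unr = fit_linear_probe(emb[:400], unrelated_prop[:400])
r2_unrelated = probe_r2(w_unr, emb[400:], unrelated_prop[400:])

print(f"R^2, encoded property:   {r2_encoded:.3f}")
print(f"R^2, unrelated property: {r2_unrelated:.3f}")
```

A high held-out score suggests the property is linearly recoverable from the embedding, while a score near zero suggests it is not. An actual probing run would replace the synthetic arrays with embeddings from a pre-trained GSSL encoder and with properties computed from the molecular graphs.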
Related papers
- The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges [101.83124435649358]
The homophily principle states that nodes with the same labels or similar attributes are more likely to be connected.
Recent work has identified a non-trivial set of datasets where the performance of GNNs compared to that of NNs is not satisfactory.
arXiv Detail & Related papers (2024-07-12T18:04:32Z) - Improving Self-supervised Molecular Representation Learning using
Persistent Homology [6.263470141349622]
Self-supervised learning (SSL) has great potential for molecular representation learning.
In this paper, we study SSL based on persistent homology (PH), a mathematical tool for modeling topological features of data that persist across multiple scales.
arXiv Detail & Related papers (2023-11-29T02:58:30Z) - GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule
Zero-Shot Learning [71.89623260998934]
This study investigates the feasibility of employing natural language instructions to accomplish molecule-related tasks in a zero-shot setting.
Existing molecule-text models perform poorly in this setting due to inadequate treatment of instructions and limited capacity for graphs.
We propose GIMLET, which unifies language models for both graph and text data.
arXiv Detail & Related papers (2023-05-28T18:27:59Z) - Analyzing Data-Centric Properties for Contrastive Learning on Graphs [32.69353929886551]
We investigate how graph SSL methods, such as contrastive learning (CL), work well.
Our work rigorously contextualizes, both empirically and theoretically, the effects of data-centric properties on augmentation strategies and learning paradigms for graph SSL.
arXiv Detail & Related papers (2022-08-04T17:58:37Z) - Heterogeneous Graph Neural Networks using Self-supervised Reciprocally
Contrastive Learning [102.9138736545956]
Heterogeneous graph neural network (HGNN) is a very popular technique for the modeling and analysis of heterogeneous graphs.
We develop, for the first time, a novel and robust heterogeneous graph contrastive learning approach, namely HGCL, which introduces two views guided by node attributes and graph topologies, respectively.
In this new approach, we adopt distinct attribute and topology fusion mechanisms, each suited to its view, which helps mine relevant information from attributes and topologies separately.
arXiv Detail & Related papers (2022-04-30T12:57:02Z) - Motif-based Graph Self-Supervised Learning forMolecular Property
Prediction [12.789013658551454]
Graph Neural Networks (GNNs) have demonstrated remarkable success in various molecular generation and prediction tasks.
Most existing self-supervised pre-training frameworks for GNNs only focus on node-level or graph-level tasks.
We propose a novel self-supervised motif generation framework for GNNs.
arXiv Detail & Related papers (2021-10-03T11:45:51Z) - Weakly-supervised Graph Meta-learning for Few-shot Node Classification [53.36828125138149]
We propose a new graph meta-learning framework, Graph Hallucination Networks (Meta-GHN).
Based on a new robustness-enhanced episodic training, Meta-GHN is meta-learned to hallucinate clean node representations from weakly-labeled data.
Extensive experiments demonstrate the superiority of Meta-GHN over existing graph meta-learning studies.
arXiv Detail & Related papers (2021-06-12T22:22:10Z) - Structure-Enhanced Meta-Learning For Few-Shot Graph Classification [53.54066611743269]
This work explores the potential of metric-based meta-learning for solving few-shot graph classification.
An implementation built on GIN, named SMFGIN, is tested on two datasets, Chembl and TRIANGLES.
arXiv Detail & Related papers (2021-03-05T09:03:03Z) - Quantifying Challenges in the Application of Graph Representation
Learning [0.0]
We provide an application-oriented perspective on a set of popular embedding approaches.
We evaluate their representational power with respect to real-world graph properties.
Our results suggest that "one-to-fit-all" GRL approaches are hard to define in real-world scenarios.
arXiv Detail & Related papers (2020-06-18T03:19:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.