Analyzing Data-Centric Properties for Contrastive Learning on Graphs
- URL: http://arxiv.org/abs/2208.02810v1
- Date: Thu, 4 Aug 2022 17:58:37 GMT
- Title: Analyzing Data-Centric Properties for Contrastive Learning on Graphs
- Authors: Puja Trivedi, Ekdeep Singh Lubana, Mark Heimann, Danai Koutra,
Jayaraman J. Thiagarajan
- Abstract summary: We investigate how graph SSL methods, such as contrastive learning (CL), manage to work well.
Our work rigorously contextualizes, both empirically and theoretically, the effects of data-centric properties on augmentation strategies and learning paradigms for graph SSL.
- Score: 32.69353929886551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent analyses of self-supervised learning (SSL) find the following
data-centric properties to be critical for learning good representations:
invariance to task-irrelevant semantics, separability of classes in some latent
space, and recoverability of labels from augmented samples. However, given
their discrete, non-Euclidean nature, graph datasets and graph SSL methods are
unlikely to satisfy these properties. This raises the question: how do graph
SSL methods, such as contrastive learning (CL), work well? To systematically
probe this question, we perform a generalization analysis for CL when using
generic graph augmentations (GGAs), with a focus on data-centric properties.
Our analysis yields formal insights into the limitations of GGAs and the
necessity of task-relevant augmentations. As we empirically show, GGAs do not
induce task-relevant invariances on common benchmark datasets, leading to only
marginal gains over naive, untrained baselines. Our theory motivates a
synthetic data generation process that enables control over task-relevant
information and boasts pre-defined optimal augmentations. This flexible
benchmark helps us identify yet unrecognized limitations in advanced
augmentation techniques (e.g., automated methods). Overall, our work rigorously
contextualizes, both empirically and theoretically, the effects of data-centric
properties on augmentation strategies and learning paradigms for graph SSL.
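For concreteness, the generic graph augmentations (GGAs) analyzed above are typically random structural or feature corruptions used to produce contrastive views. Below is a minimal PyTorch sketch of two common GGAs, edge dropping and feature masking; the function names and drop rate `p` are illustrative, not taken from the paper.

```python
import torch

def drop_edges(edge_index: torch.Tensor, p: float = 0.2) -> torch.Tensor:
    """Randomly remove a fraction p of edges from a [2, E] edge-index tensor."""
    keep = torch.rand(edge_index.size(1)) >= p
    return edge_index[:, keep]

def mask_features(x: torch.Tensor, p: float = 0.2) -> torch.Tensor:
    """Zero out each feature dimension independently with probability p."""
    mask = (torch.rand(x.size(1)) >= p).float()
    return x * mask  # mask broadcasts over the node dimension

# Two stochastic views of the same graph, as consumed by a contrastive objective:
# view_a = (mask_features(x), drop_edges(edge_index))
# view_b = (mask_features(x), drop_edges(edge_index))
```

Because such corruptions are task-agnostic, they need not preserve task-relevant information, which is exactly the failure mode the paper formalizes.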
Related papers
- Towards Graph Contrastive Learning: A Survey and Beyond [23.109430624817637]
Self-supervised learning (SSL) on graphs has gained increasing attention and has made significant progress.
SSL enables machine learning models to produce informative representations from unlabeled graph data.
Graph Contrastive Learning (GCL), however, has not been thoroughly surveyed in the existing literature.
arXiv Detail & Related papers (2024-05-20T08:19:10Z)
- ExGRG: Explicitly-Generated Relation Graph for Self-Supervised Representation Learning [4.105236597768038]
Self-supervised learning has emerged as a powerful technique in pre-training deep learning models.
This paper introduces a novel non-contrastive SSL approach to Explicitly Generate a compositional Relation Graph.
arXiv Detail & Related papers (2024-02-09T19:16:04Z)
- LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation [9.181689366185038]
Graph neural networks (GNNs) are a powerful learning approach for graph-based recommender systems.
In this paper, we propose a simple yet effective graph contrastive learning paradigm LightGCL.
arXiv Detail & Related papers (2023-02-16T10:16:21Z)
- Localized Contrastive Learning on Graphs [110.54606263711385]
We introduce a simple yet effective contrastive model named Localized Graph Contrastive Learning (Local-GCL).
Despite its simplicity, Local-GCL achieves competitive performance in self-supervised node representation learning tasks on graphs of various scales and properties.
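For reference, node-level graph contrastive methods in this family typically optimize an InfoNCE-style objective that pulls together embeddings of the same node under two views. The sketch below is a generic, hedged PyTorch version; Local-GCL's exact positive/negative construction may differ.

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """z1, z2: [N, d] embeddings of the same N nodes under two views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau                              # [N, N] scaled cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positive pairs lie on the diagonal
    return F.cross_entropy(sim, labels)                  # all other nodes act as negatives
```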
arXiv Detail & Related papers (2022-12-08T23:36:00Z)
- Diving into Unified Data-Model Sparsity for Class-Imbalanced Graph Representation Learning [30.23894624193583]
Training Graph Neural Networks (GNNs) on non-Euclidean graph data often incurs relatively high time costs.
We develop a unified data-model dynamic-sparsity framework named Graph Decantation (GraphDec) to address the challenges of training on massive, class-imbalanced graph data.
arXiv Detail & Related papers (2022-10-01T01:47:00Z)
- Graph Structure Learning with Variational Information Bottleneck [70.62851953251253]
We propose a novel Variational Information Bottleneck guided Graph Structure Learning framework, namely VIB-GSL.
VIB-GSL learns an informative and compressive graph structure to distill the actionable information for specific downstream tasks.
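For intuition, variational information-bottleneck objectives of this kind trade a task-fit term against a β-weighted compression term on the latent representation. Below is a minimal sketch of the generic variational IB loss with a Gaussian posterior; VIB-GSL's graph-structure learner itself is not reproduced here, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def vib_loss(logits, y, mu, logvar, beta: float = 1e-3) -> torch.Tensor:
    task = F.cross_entropy(logits, y)  # fit term: retain task-relevant information
    # KL(N(mu, diag(exp(logvar))) || N(0, I)): compress the latent representation
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    return task + beta * kl
```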
arXiv Detail & Related papers (2021-12-16T14:22:13Z)
- Self-supervised on Graphs: Contrastive, Generative, or Predictive [25.679620842010422]
Self-supervised learning (SSL) is emerging as a new paradigm for extracting informative knowledge through well-designed pretext tasks.
We divide existing graph SSL methods into three categories: contrastive, generative, and predictive.
We also summarize the commonly used datasets, evaluation metrics, downstream tasks, and open-source implementations of various algorithms.
arXiv Detail & Related papers (2021-05-16T03:30:03Z)
- Graph-based Semi-supervised Learning: A Comprehensive Review [51.26862262550445]
Semi-supervised learning (SSL) has tremendous value in practice due to its ability to utilize both labeled and unlabeled data.
An important class of SSL methods naturally represents data as graphs, which corresponds to graph-based semi-supervised learning (GSSL).
GSSL methods have demonstrated their advantages in various domains due to the uniqueness of their structure, the universality of their applications, and their scalability to large-scale data.
arXiv Detail & Related papers (2021-02-26T05:11:09Z)
- Learning the Implicit Semantic Representation on Graph-Structured Data [57.670106959061634]
Existing representation learning methods in graph convolutional networks are mainly designed by describing the neighborhood of each node as a perceptual whole.
We propose Semantic Graph Convolutional Networks (SGCN), which explore the implicit semantics by learning latent semantic paths in graphs.
arXiv Detail & Related papers (2021-01-16T16:18:43Z)
- Robust Optimization as Data Augmentation for Large-scale Graphs [117.2376815614148]
We propose FLAG (Free Large-scale Adversarial Augmentation on Graphs), which iteratively augments node features with gradient-based adversarial perturbations during training.
FLAG is a general-purpose approach for graph data that works universally across node classification, link prediction, and graph classification tasks.
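FLAG's core loop lends itself to a short sketch: before each parameter update, the node features receive a few gradient-ascent steps on the training loss, so the model is trained against adversarial feature noise essentially for free. The version below is a hedged approximation with illustrative step count and step size; `forward` stands in for a model closure (e.g., one that also closes over the graph's edge index).

```python
import torch

def flag_step(forward, x, y, loss_fn, optimizer, steps: int = 3, alpha: float = 1e-3):
    """One training step with FLAG-style adversarial feature augmentation."""
    perturb = torch.empty_like(x).uniform_(-alpha, alpha).requires_grad_()
    optimizer.zero_grad()
    for _ in range(steps):
        loss = loss_fn(forward(x + perturb), y) / steps
        loss.backward()                                 # also accumulates model gradients
        perturb.data.add_(alpha * perturb.grad.sign())  # ascend the loss w.r.t. the perturbation
        perturb.grad.zero_()
    optimizer.step()  # parameter update uses gradients averaged over all ascent steps
    return float(loss)
```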
arXiv Detail & Related papers (2020-10-19T21:51:47Z)
- Self-supervised Learning on Graphs: Deep Insights and New Direction [66.78374374440467]
Self-supervised learning (SSL) aims to create domain-specific pretext tasks on unlabeled data.
There is increasing interest in generalizing deep learning to the graph domain in the form of graph neural networks (GNNs).
arXiv Detail & Related papers (2020-06-17T20:30:04Z)