Rethinking and Scaling Up Graph Contrastive Learning: An Extremely
Efficient Approach with Group Discrimination
- URL: http://arxiv.org/abs/2206.01535v1
- Date: Fri, 3 Jun 2022 12:32:47 GMT
- Title: Rethinking and Scaling Up Graph Contrastive Learning: An Extremely
Efficient Approach with Group Discrimination
- Authors: Yizhen Zheng, Shirui Pan, Vincent Cs Lee, Yu Zheng, Philip S. Yu
- Abstract summary: Graph contrastive learning (GCL) alleviates the heavy reliance on label information for graph representation learning (GRL).
We introduce a new learning paradigm for self-supervised GRL, namely, Group Discrimination (GD).
Instead of similarity computation, GGD directly discriminates two groups of summarised node instances with a simple binary cross-entropy loss.
In addition, GGD requires much fewer training epochs to obtain competitive performance compared with GCL methods on large-scale datasets.
- Score: 87.07410882094966
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph contrastive learning (GCL) alleviates the heavy reliance on label
information for graph representation learning (GRL) via self-supervised
learning schemes. The core idea is to learn by maximising mutual information
for similar instances, which requires similarity computation between two node
instances. However, this operation can be computationally expensive. For
example, the time complexity of two commonly adopted contrastive loss functions
(i.e., InfoNCE and JSD estimator) for a node is $O(ND)$ and $O(D)$,
respectively, where $N$ is the number of nodes, and $D$ is the embedding
dimension. Additionally, GCL normally requires a large number of training
epochs to be well-trained on large-scale datasets. Inspired by an observation
of a technical defect (i.e., inappropriate usage of the Sigmoid function) commonly
used in two representative GCL works, DGI and MVGRL, we revisit GCL and
introduce a new learning paradigm for self-supervised GRL, namely, Group
Discrimination (GD), and propose a novel GD-based method called Graph Group
Discrimination (GGD). Instead of similarity computation, GGD directly
discriminates two groups of summarised node instances with a simple binary
cross-entropy loss. As such, GGD only requires $O(1)$ for loss computation of a
node. In addition, GGD requires much fewer training epochs to obtain
competitive performance compared with GCL methods on large-scale datasets.
These two advantages make GGD extremely efficient. Extensive
experiments show that GGD outperforms state-of-the-art self-supervised methods
on 8 datasets. In particular, GGD can be trained in 0.18 seconds (6.44 seconds
including data preprocessing) on ogbn-arxiv, which is orders of magnitude
(10,000+) faster than GCL baselines while consuming much less memory. Trained
for 9 hours on ogbn-papers100M, which has over a billion edges, GGD outperforms its GCL
counterparts in both accuracy and efficiency.
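The abstract describes Group Discrimination only at a high level. The following PyTorch-style sketch is a hypothetical illustration (not the authors' released implementation) of the idea: node embeddings from the original graph and from a corrupted view are each summarised into a scalar logit, and the two groups are separated with a binary cross-entropy loss, so the per-node loss cost is O(1). The encoder, corruption scheme, and aggregation used here are assumptions for illustration; the paper's exact design may differ.

```python
import torch
import torch.nn as nn

class GroupDiscriminationLoss(nn.Module):
    """Hypothetical sketch of a Group Discrimination (GD) objective.

    Node embeddings from the original graph form the positive group
    (label 1); embeddings from a corrupted graph (e.g., with shuffled
    node features) form the negative group (label 0). Each embedding is
    summarised into a single logit, so the loss costs O(1) per node
    rather than the O(ND) pairwise terms of InfoNCE.
    """

    def __init__(self) -> None:
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, h_pos: torch.Tensor, h_neg: torch.Tensor) -> torch.Tensor:
        # Summarise each D-dimensional node embedding into one scalar logit.
        logits = torch.cat([h_pos.sum(dim=-1), h_neg.sum(dim=-1)])   # [2N]
        labels = torch.cat([torch.ones(h_pos.size(0)),
                            torch.zeros(h_neg.size(0))]).to(logits)  # [2N]
        return self.bce(logits, labels)

# Usage sketch (the encoder and corruption scheme are assumptions):
# h_pos = gnn_encoder(features, adj)                               # [N, D]
# h_neg = gnn_encoder(features[torch.randperm(len(features))], adj)
# loss = GroupDiscriminationLoss()(h_pos, h_neg)
```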
Related papers
- Efficient Graph Similarity Computation with Alignment Regularization [7.143879014059894]
Graph similarity computation (GSC) is a learning-based prediction task using Graph Neural Networks (GNNs).
We show that high-quality learning can be attained with a simple yet powerful regularization technique, which we call Alignment Regularization (AReg).
In the inference stage, the graph-level representations learned by the GNN encoder are directly used to compute the similarity score without using AReg again to speed up inference.
arXiv Detail & Related papers (2024-06-21T07:37:28Z) - Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition [56.26113670151363]
Graph condensation is a data-centric solution to replace the large graph with a small yet informative condensed graph.
Existing GC methods suffer from intricate optimization processes, necessitating excessive computing resources.
We propose a training-free GC framework termed Class-partitioned Graph Condensation (CGC).
CGC achieves state-of-the-art performance with a more efficient condensation process.
arXiv Detail & Related papers (2024-05-22T14:57:09Z) - Sequential Gradient Coding For Straggler Mitigation [28.090458692750023]
In distributed computing, slower nodes (stragglers) usually become a bottleneck.
Gradient Coding (GC) is an efficient technique that uses principles of error-correcting codes to distribute gradient computation in the presence of stragglers.
We propose two schemes that demonstrate improved performance compared to GC.
arXiv Detail & Related papers (2022-11-24T21:12:49Z) - Graph Soft-Contrastive Learning via Neighborhood Ranking [19.241089079154044]
Graph Contrastive Learning (GCL) has emerged as a promising approach in the realm of graph self-supervised learning.
We propose a novel paradigm, Graph Soft-Contrastive Learning (GSCL).
GSCL facilitates GCL via neighborhood ranking, avoiding the need to specify absolutely similar pairs.
arXiv Detail & Related papers (2022-09-28T09:52:15Z) - Geometric Graph Representation Learning via Maximizing Rate Reduction [73.6044873825311]
Learning node representations benefits various downstream tasks in graph analysis such as community detection and node classification.
We propose Geometric Graph Representation Learning (G2R) to learn node representations in an unsupervised manner.
G2R maps nodes in distinct groups into different subspaces, while each subspace is compact and different subspaces are dispersed.
arXiv Detail & Related papers (2022-02-13T07:46:24Z) - Graph Communal Contrastive Learning [34.85906025283825]
A fundamental problem for graph representation learning is how to effectively learn representations without human labeling.
We propose a novel Graph Contrastive Learning (gCooL) framework to jointly learn the community partition and learn node representations.
arXiv Detail & Related papers (2021-10-28T02:57:54Z) - Adversarial Graph Augmentation to Improve Graph Contrastive Learning [21.54343383921459]
We propose a novel principle, termed adversarial-GCL (AD-GCL), which enables GNNs to avoid capturing redundant information during the training.
We experimentally validate AD-GCL by comparing it with state-of-the-art GCL methods and achieve performance gains of up to 14% in unsupervised, 6% in transfer, and 3% in semi-supervised learning settings.
arXiv Detail & Related papers (2021-06-10T15:34:26Z) - Gradient Coding with Dynamic Clustering for Straggler-Tolerant
Distributed Learning [55.052517095437]
Gradient descent (GD) is widely employed to parallelize the learning task by distributing the dataset across multiple workers.
A significant performance bottleneck for the per-iteration completion time in distributed synchronous GD is straggling workers.
Coded distributed techniques have been introduced recently to mitigate stragglers and to speed up GD iterations by assigning redundant computations to workers.
We propose a novel dynamic GC scheme, which assigns redundant data to workers to acquire the flexibility to choose from among a set of possible codes depending on the past straggling behavior.
arXiv Detail & Related papers (2021-03-01T18:51:29Z) - Scalable Graph Neural Networks via Bidirectional Propagation [89.70835710988395]
Graph Neural Networks (GNNs) are an emerging field for learning on non-Euclidean data.
This paper presents GBP, a scalable GNN that utilizes a localized bidirectional propagation process from both the feature vectors and the training/testing nodes.
An empirical study demonstrates that GBP achieves state-of-the-art performance with significantly less training/testing time.
arXiv Detail & Related papers (2020-10-29T08:55:33Z) - Towards Deeper Graph Neural Networks with Differentiable Group
Normalization [61.20639338417576]
Graph neural networks (GNNs) learn the representation of a node by aggregating its neighbors.
Over-smoothing is one of the key issues which limit the performance of GNNs as the number of layers increases.
We introduce two over-smoothing metrics and a novel technique, i.e., differentiable group normalization (DGN).
arXiv Detail & Related papers (2020-06-12T07:18:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.