Enhancing Graph Contrastive Learning with Node Similarity
- URL: http://arxiv.org/abs/2208.06743v1
- Date: Sat, 13 Aug 2022 22:49:20 GMT
- Title: Enhancing Graph Contrastive Learning with Node Similarity
- Authors: Hongliang Chi, Yao Ma
- Abstract summary: Graph contrastive learning (GCL) is a representative framework for self-supervised learning.
GCL learns node representations by contrasting semantically similar nodes (positive samples) and dissimilar nodes (negative samples) with anchor nodes.
We propose an enhanced objective that contains all positive samples and no false-negative samples.
- Score: 4.60032347615771
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph Neural Networks (GNNs) have achieved great success in learning graph
representations and thus facilitating various graph-related tasks. However,
most GNN methods adopt a supervised learning setting, which is not always
feasible in real-world applications due to the difficulty to obtain labeled
data. Hence, graph self-supervised learning has been attracting increasing
attention. Graph contrastive learning (GCL) is a representative framework for
self-supervised learning. In general, GCL learns node representations by
contrasting semantically similar nodes (positive samples) and dissimilar nodes
(negative samples) with anchor nodes. Without access to labels, positive
samples are typically generated by data augmentation, and negative samples are
uniformly sampled from the entire graph, which leads to a sub-optimal
objective. Specifically, data augmentation naturally limits the number of
positive samples that involve in the process (typically only one positive
sample is adopted). On the other hand, the random sampling process would
inevitably select false-negative samples (samples sharing the same semantics
with the anchor). These issues limit the learning capability of GCL. In this
work, we propose an enhanced objective that addresses the aforementioned
issues. We first introduce an unachievable ideal objective that contains all
positive samples and no false-negative samples. This ideal objective is then
transformed into a probabilistic form based on the distributions for sampling
positive and negative samples. We then model these distributions with node
similarity and derive the enhanced objective. Comprehensive experiments on
various datasets demonstrate the effectiveness of the proposed enhanced
objective under different settings.
Related papers
- Bootstrap Latents of Nodes and Neighbors for Graph Self-Supervised Learning [27.278097015083343]
Contrastive learning requires negative samples to prevent model collapse and learn discriminative representations.
We introduce a cross-attention module to predict the supportiveness score of a neighbor with respect to the anchor node.
Our method mitigates class collision from negative and noisy positive samples, concurrently enhancing intra-class compactness.
arXiv Detail & Related papers (2024-08-09T14:17:52Z) - The Heterophilic Snowflake Hypothesis: Training and Empowering GNNs for Heterophilic Graphs [59.03660013787925]
We introduce the Heterophily Snowflake Hypothesis and provide an effective solution to guide and facilitate research on heterophilic graphs.
Our observations show that our framework acts as a versatile operator for diverse tasks.
It can be integrated into various GNN frameworks, boosting performance in-depth and offering an explainable approach to choosing the optimal network depth.
arXiv Detail & Related papers (2024-06-18T12:16:00Z) - Graph Ranking Contrastive Learning: A Extremely Simple yet Efficient Method [17.760628718072144]
InfoNCE uses augmentation techniques to obtain two views, where a node in one view acts as the anchor, the corresponding node in the other view serves as the positive sample, and all other nodes are regarded as negative samples.
The goal is to minimize the distance between the anchor node and positive samples and maximize the distance to negative samples.
Due to the lack of label information during training, InfoNCE inevitably treats samples from the same class as negative samples, leading to the issue of false negative samples.
We propose GraphRank, a simple yet efficient graph contrastive learning method that addresses the problem of false negative samples
arXiv Detail & Related papers (2023-10-23T03:15:57Z) - GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks [2.4175455407547015]
Graph neural networks learn to represent nodes by aggregating information from their neighbors.
Several existing methods address this by sampling a small subset of nodes, scaling GNNs to much larger graphs.
We introduce GRAPES, an adaptive sampling method that learns to identify the set of nodes crucial for training a GNN.
arXiv Detail & Related papers (2023-10-05T09:08:47Z) - STERLING: Synergistic Representation Learning on Bipartite Graphs [78.86064828220613]
A fundamental challenge of bipartite graph representation learning is how to extract node embeddings.
Most recent bipartite graph SSL methods are based on contrastive learning which learns embeddings by discriminating positive and negative node pairs.
We introduce a novel synergistic representation learning model (STERLING) to learn node embeddings without negative node pairs.
arXiv Detail & Related papers (2023-01-25T03:21:42Z) - Graph Convolutional Neural Networks with Diverse Negative Samples via
Decomposed Determinant Point Processes [21.792376993468064]
Graph convolutional networks (GCNs) have achieved great success in graph representation learning.
In this paper, we use quality-diversity decomposition in determinant point processes to obtain diverse negative samples.
We propose a new shortest-path-base method to improve computational efficiency.
arXiv Detail & Related papers (2022-12-05T06:31:31Z) - Similarity-aware Positive Instance Sampling for Graph Contrastive
Pre-training [82.68805025636165]
We propose to select positive graph instances directly from existing graphs in the training set.
Our selection is based on certain domain-specific pair-wise similarity measurements.
Besides, we develop an adaptive node-level pre-training method to dynamically mask nodes to distribute them evenly in the graph.
arXiv Detail & Related papers (2022-06-23T20:12:51Z) - Prototypical Graph Contrastive Learning [141.30842113683775]
We propose a Prototypical Graph Contrastive Learning (PGCL) approach to mitigate the critical sampling bias issue.
Specifically, PGCL models the underlying semantic structure of the graph data via clustering semantically similar graphs into the same group, and simultaneously encourages the clustering consistency for different augmentations of the same graph.
For a query, PGCL further reweights its negative samples based on the distance between their prototypes (cluster centroids) and the query prototype.
arXiv Detail & Related papers (2021-06-17T16:45:31Z) - SCE: Scalable Network Embedding from Sparsest Cut [20.08464038805681]
Large-scale network embedding is to learn a latent representation for each node in an unsupervised manner.
A key of success to such contrastive learning methods is how to draw positive and negative samples.
In this paper, we propose SCE for unsupervised network embedding only using negative samples for training.
arXiv Detail & Related papers (2020-06-30T03:18:15Z) - Sequential Graph Convolutional Network for Active Learning [53.99104862192055]
We propose a novel pool-based Active Learning framework constructed on a sequential Graph Convolution Network (GCN)
With a small number of randomly sampled images as seed labelled examples, we learn the parameters of the graph to distinguish labelled vs unlabelled nodes.
We exploit these characteristics of GCN to select the unlabelled examples which are sufficiently different from labelled ones.
arXiv Detail & Related papers (2020-06-18T00:55:10Z) - Understanding Negative Sampling in Graph Representation Learning [87.35038268508414]
We show that negative sampling is as important as positive sampling in determining the optimization objective and the resulted variance.
We propose Metropolis-Hastings (MCNS) to approximate the positive distribution with self-contrast approximation and accelerate negative sampling by Metropolis-Hastings.
We evaluate our method on 5 datasets that cover extensive downstream graph learning tasks, including link prediction, node classification and personalized recommendation.
arXiv Detail & Related papers (2020-05-20T06:25:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.