Does GCL Need a Large Number of Negative Samples? Enhancing Graph Contrastive Learning with Effective and Efficient Negative Sampling
- URL: http://arxiv.org/abs/2503.17908v1
- Date: Sun, 23 Mar 2025 03:09:31 GMT
- Title: Does GCL Need a Large Number of Negative Samples? Enhancing Graph Contrastive Learning with Effective and Efficient Negative Sampling
- Authors: Yongqi Huang, Jitao Zhao, Dongxiao He, Di Jin, Yuxiao Huang, Zhen Wang
- Abstract summary: Graph Contrastive Learning aims to learn low-dimensional graph representations in a self-supervised manner. A consensus has been reached that the effectiveness of GCLs depends on a large number of negative pairs. We propose a new method called GCL with Effective and Efficient Negative samples, E2Neg.
- Score: 17.51838571216239
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Contrastive Learning (GCL) aims to learn low-dimensional graph representations in a self-supervised manner, primarily through instance discrimination, which involves manually mining positive and negative pairs from graphs, increasing the similarity of positive pairs while decreasing that of negative pairs. Drawing from the success of Contrastive Learning (CL) in other domains, a consensus has been reached that the effectiveness of GCLs depends on a large number of negative pairs. As a result, despite the significant computational overhead, GCLs typically leverage as many negative node pairs as possible to improve model performance. However, given that nodes within a graph are interconnected, we argue that nodes cannot be treated as independent instances. Therefore, we challenge this consensus: does employing more negative nodes lead to a more effective GCL model? To answer this, we explore the role of negative nodes in the commonly used InfoNCE loss for GCL and observe that: (1) counterintuitively, a large number of negative nodes can actually hinder the model's ability to distinguish nodes with different semantics; (2) a small number of high-quality, non-topologically coupled negative nodes is sufficient to enhance the discriminability of representations. Based on these findings, we propose a new method called GCL with Effective and Efficient Negative samples, E2Neg, which learns discriminative representations using only a very small set of representative negative samples. E2Neg significantly reduces computational overhead and speeds up model training. We demonstrate the effectiveness and efficiency of E2Neg across multiple datasets compared to other GCL methods.
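The following is a minimal PyTorch sketch (not the authors' code) of the two contrastive objectives discussed above: a GRACE-style InfoNCE loss that contrasts every anchor node against all other nodes in both views, and a lighter variant that contrasts each anchor against only a small set of representative negatives, in the spirit of E2Neg's finding that a few high-quality negatives can suffice. The function names, the temperature value, and the random choice of `neg_idx` are illustrative assumptions; E2Neg's actual negative-selection strategy is described in the paper.

```python
import torch
import torch.nn.functional as F

def infonce_all_negatives(z1, z2, tau=0.5):
    """GRACE-style InfoNCE: for each anchor in view 1, the same node in view 2
    is the positive, and every other node in both views acts as a negative."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim_cross = torch.exp(z1 @ z2.t() / tau)   # N x N cross-view similarities
    sim_intra = torch.exp(z1 @ z1.t() / tau)   # N x N intra-view similarities
    pos = sim_cross.diag()
    # denominator: positive + cross-view negatives + intra-view negatives (excluding self)
    denom = sim_cross.sum(dim=1) + sim_intra.sum(dim=1) - sim_intra.diag()
    return -torch.log(pos / denom).mean()

def infonce_few_negatives(z1, z2, neg_idx, tau=0.5):
    """Variant contrasting each anchor against only K representative negatives
    indexed by `neg_idx` (here chosen at random purely for illustration),
    instead of all N - 1 other nodes."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    pos = torch.exp((z1 * z2).sum(dim=1) / tau)               # N positive scores
    negs = torch.exp(z1 @ z2[neg_idx].t() / tau).sum(dim=1)   # N x K negative scores, summed
    return -torch.log(pos / (pos + negs)).mean()

# Toy usage: 1,000 nodes, 64-dim embeddings, 16 representative negatives.
z1, z2 = torch.randn(1000, 64), torch.randn(1000, 64)
neg_idx = torch.randperm(1000)[:16]
print(infonce_all_negatives(z1, z2).item())
print(infonce_few_negatives(z1, z2, neg_idx).item())
```

The full-negative version materialises two N x N similarity matrices per step, whereas the few-negative variant only needs an N x K matrix, which illustrates where the computational savings of a small representative negative set come from.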
Related papers
- Bootstrap Latents of Nodes and Neighbors for Graph Self-Supervised Learning [27.278097015083343]
Contrastive learning requires negative samples to prevent model collapse and learn discriminative representations.
We introduce a cross-attention module to predict the supportiveness score of a neighbor with respect to the anchor node.
Our method mitigates class collision from negative and noisy positive samples, concurrently enhancing intra-class compactness.
arXiv Detail & Related papers (2024-08-09T14:17:52Z) - Topology Reorganized Graph Contrastive Learning with Mitigating Semantic Drift [28.83750578838018]
Graph contrastive learning (GCL) is an effective paradigm for node representation learning in graphs.
To increase the diversity of the contrastive views, we propose two simple and effective global topological augmentations to complement current GCL methods.
arXiv Detail & Related papers (2024-07-23T13:55:33Z) - Graph Ranking Contrastive Learning: A Extremely Simple yet Efficient Method [17.760628718072144]
InfoNCE uses augmentation techniques to obtain two views, where a node in one view acts as the anchor, the corresponding node in the other view serves as the positive sample, and all other nodes are regarded as negative samples.
The goal is to minimize the distance between the anchor node and positive samples and maximize the distance to negative samples.
Due to the lack of label information during training, InfoNCE inevitably treats samples from the same class as negative samples, leading to the issue of false negative samples.
We propose GraphRank, a simple yet efficient graph contrastive learning method that addresses the problem of false negative samples.
arXiv Detail & Related papers (2023-10-23T03:15:57Z) - Pseudo Contrastive Learning for Graph-based Semi-supervised Learning [67.37572762925836]
Pseudo Labeling is a technique used to improve the performance of Graph Neural Networks (GNNs).
We propose a general framework for GNNs, termed Pseudo Contrastive Learning (PCL).
arXiv Detail & Related papers (2023-02-19T10:34:08Z) - Affinity Uncertainty-based Hard Negative Mining in Graph Contrastive Learning [27.728860211993368]
Hard negative mining has proven effective in enhancing self-supervised contrastive learning (CL) on diverse data types.
This article proposes a novel approach that builds a discriminative model on collective affinity information to mine hard negatives in graph data.
Experiments on ten graph datasets show that our approach consistently enhances different state-of-the-art (SOTA) GCL methods in both graph and node classification tasks.
arXiv Detail & Related papers (2023-01-31T00:18:03Z) - STERLING: Synergistic Representation Learning on Bipartite Graphs [78.86064828220613]
A fundamental challenge of bipartite graph representation learning is how to extract node embeddings.
Most recent bipartite graph SSL methods are based on contrastive learning which learns embeddings by discriminating positive and negative node pairs.
We introduce a novel synergistic representation learning model (STERLING) to learn node embeddings without negative node pairs.
arXiv Detail & Related papers (2023-01-25T03:21:42Z) - Localized Contrastive Learning on Graphs [110.54606263711385]
We introduce a simple yet effective contrastive model named Localized Graph Contrastive Learning (Local-GCL).
In spite of its simplicity, Local-GCL achieves quite competitive performance in self-supervised node representation learning tasks on graphs with various scales and properties.
arXiv Detail & Related papers (2022-12-08T23:36:00Z) - Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination [87.07410882094966]
Graph contrastive learning (GCL) alleviates the heavy reliance on label information for graph representation learning (GRL).
We introduce a new learning paradigm for self-supervised GRL, namely, Group Discrimination (GD).
Instead of similarity computation, GGD directly discriminates two groups of summarised node instances with a simple binary cross-entropy loss (a minimal sketch of this idea appears after this list).
In addition, GGD requires much fewer training epochs to obtain competitive performance compared with GCL methods on large-scale datasets.
arXiv Detail & Related papers (2022-06-03T12:32:47Z) - Debiased Graph Contrastive Learning [27.560217866753938]
We propose a novel and effective method to estimate the probability that each negative sample is a true negative.
Debiased Graph Contrastive Learning (DGCL) outperforms or matches previous unsupervised state-of-the-art results on several benchmarks.
arXiv Detail & Related papers (2021-10-05T13:15:59Z) - Investigating the Role of Negatives in Contrastive Representation Learning [59.30700308648194]
Noise contrastive learning is a popular technique for unsupervised representation learning.
We focus on disambiguating the role of one of these parameters: the number of negative examples.
We find that the results broadly agree with our theory, while our vision experiments are murkier, with performance sometimes even being insensitive to the number of negatives.
arXiv Detail & Related papers (2021-06-18T06:44:16Z) - Prototypical Graph Contrastive Learning [141.30842113683775]
We propose a Prototypical Graph Contrastive Learning (PGCL) approach to mitigate the critical sampling bias issue.
Specifically, PGCL models the underlying semantic structure of the graph data via clustering semantically similar graphs into the same group, and simultaneously encourages the clustering consistency for different augmentations of the same graph.
For a query, PGCL further reweights its negative samples based on the distance between their prototypes (cluster centroids) and the query prototype.
arXiv Detail & Related papers (2021-06-17T16:45:31Z) - Contrastive Attraction and Contrastive Repulsion for Representation Learning [131.72147978462348]
Contrastive learning (CL) methods learn data representations in a self-supervised manner, where the encoder contrasts each positive sample against multiple negative samples.
Recent CL methods have achieved promising results when pretrained on large-scale datasets, such as ImageNet.
We propose a doubly CL strategy that separately compares positive and negative samples within their own groups, and then proceeds with a contrast between positive and negative groups.
arXiv Detail & Related papers (2021-05-08T17:25:08Z)
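The group-discrimination objective mentioned in the GGD entry above replaces pairwise similarity computation with a binary classification between two groups of node summaries. The snippet below is a minimal, hedged PyTorch sketch of that idea, not the GGD authors' implementation; the per-node summary (a sum over embedding dimensions) and the randomly generated "corrupted" view are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def group_discrimination_loss(h_real, h_corrupt):
    """Sketch of a GGD-style objective: summarise each node embedding into a
    single score and classify which group (original vs. corrupted view) it came
    from with binary cross-entropy, instead of computing pairwise similarities."""
    s_real = h_real.sum(dim=1)        # per-node summary score, original view
    s_corrupt = h_corrupt.sum(dim=1)  # per-node summary score, corrupted view
    logits = torch.cat([s_real, s_corrupt])
    labels = torch.cat([torch.ones_like(s_real), torch.zeros_like(s_corrupt)])
    return F.binary_cross_entropy_with_logits(logits, labels)

# Toy usage: 1,000 nodes with 64-dim embeddings from an original and a corrupted view.
h_real, h_corrupt = torch.randn(1000, 64), torch.randn(1000, 64)
print(group_discrimination_loss(h_real, h_corrupt).item())
```

Because the loss operates on one scalar per node rather than an N x N similarity matrix, its cost grows linearly with the number of nodes, which matches the efficiency argument made for group discrimination above.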