Learning Graph Node Embeddings by Smooth Pair Sampling
- URL: http://arxiv.org/abs/2501.12884v1
- Date: Wed, 22 Jan 2025 13:51:33 GMT
- Title: Learning Graph Node Embeddings by Smooth Pair Sampling
- Authors: Konstantin Kutzkov
- Abstract summary: Random walk-based node embedding algorithms have attracted a lot of attention due to their scalability and ease of implementation.
Previous research has focused on different walk strategies, optimization objectives, and embedding learning models.
Inspired by observations on real data, we take a different approach and propose a new regularization technique.
- Score: 5.167069404528051
- License:
- Abstract: Random walk-based node embedding algorithms have attracted a lot of attention due to their scalability and ease of implementation. Previous research has focused on different walk strategies, optimization objectives, and embedding learning models. Inspired by observations on real data, we take a different approach and propose a new regularization technique. More precisely, the frequencies of node pairs generated by the skip-gram model on random walk node sequences follow a highly skewed distribution which causes learning to be dominated by a fraction of the pairs. We address the issue by designing an efficient sampling procedure that generates node pairs according to their smoothed frequency. Theoretical and experimental results demonstrate the advantages of our approach.
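The idea in the abstract can be illustrated with a small sketch. It is not the paper's procedure: the abstract does not define "smoothed frequency", so the code below assumes it means raising each pair count to a power `beta` in (0, 1], and it naively materializes all pair counts rather than using an efficient sampler. The function names, the toy graph, and the parameter `beta` are all hypothetical.

```python
import random
from collections import Counter


def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes on an adjacency dict."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk


def skipgram_pairs(walk, window):
    """Yield (center, context) pairs within `window` positions of each other."""
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield (center, walk[j])


def smoothed_probs(pair_counts, beta):
    """Map each pair to count**beta, normalized to sum to 1.

    ASSUMPTION: power smoothing is one plausible reading of "smoothed
    frequency"; beta = 1 keeps the raw frequencies, smaller beta flattens them.
    """
    weights = {pair: count ** beta for pair, count in pair_counts.items()}
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()}


def sample_pairs(pair_counts, beta, k, rng):
    """Draw k training pairs according to the smoothed distribution."""
    probs = smoothed_probs(pair_counts, beta)
    pairs = list(probs)
    return rng.choices(pairs, weights=[probs[p] for p in pairs], k=k)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy graph: node 0 is a hub, node 4 hangs off node 3.
    adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3]}
    counts = Counter()
    for start in adj:
        for _ in range(20):
            counts.update(skipgram_pairs(random_walk(adj, start, 10, rng), 2))
    raw = smoothed_probs(counts, 1.0)
    flat = smoothed_probs(counts, 0.5)
    top = max(counts, key=counts.get)
    print("most frequent pair:", top, "raw p:", raw[top], "smoothed p:", flat[top])
    print("sample:", sample_pairs(counts, 0.5, 5, rng))
```

With `beta` below 1 the most frequent pairs lose probability mass to the rare ones, which is the skew-reducing effect the abstract describes; how the paper achieves this efficiently is not covered by this sketch.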
Related papers
- Sparse Decomposition of Graph Neural Networks [20.768412002413843]
We propose an approach to reduce the number of nodes that are included during aggregation.
We achieve this through a sparse decomposition, learning to approximate node representations using a weighted sum of linearly transformed features.
We demonstrate via extensive experiments that our method outperforms other baselines designed for inference speedup.
arXiv Detail & Related papers (2024-10-25T17:52:16Z) - Ensemble Quadratic Assignment Network for Graph Matching [52.20001802006391]
Graph matching is a commonly used technique in computer vision and pattern recognition.
Recent data-driven approaches have improved the graph matching accuracy remarkably.
We propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
arXiv Detail & Related papers (2024-03-11T06:34:05Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - LoNe Sampler: Graph node embeddings by coordinated local neighborhood sampling [0.7614628596146599]
Local graph neighborhood sampling is a fundamental computational problem that is at the heart of algorithms for node representation learning.
We present LoNe Sampler, a suite of algorithms for generating discrete node embeddings by Local Neighborhood Sampling.
arXiv Detail & Related papers (2022-11-28T08:04:26Z) - Calibrate and Debias Layer-wise Sampling for Graph Convolutional Networks [39.56471534442315]
This paper revisits the approach from a matrix approximation perspective.
We propose a new principle for constructing sampling probabilities and an efficient debiasing algorithm.
Improvements are demonstrated by extensive analyses of estimation variance and experiments on common benchmarks.
arXiv Detail & Related papers (2022-06-01T15:52:06Z) - Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z) - Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z) - Learning with Neighbor Consistency for Noisy Labels [69.83857578836769]
We present a method for learning from noisy labels that leverages similarities between training examples in feature space.
We evaluate our method on datasets with both synthetic (CIFAR-10, CIFAR-100) and realistic (mini-WebVision, Clothing1M, mini-ImageNet-Red) noise.
arXiv Detail & Related papers (2022-02-04T15:46:27Z) - Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z) - Learning Representations using Spectral-Biased Random Walks on Graphs [18.369974607582584]
We study how much a probabilistic bias in this process affects the quality of the nodes picked by the process.
We succinctly capture this neighborhood as a probability measure based on the spectrum of the node's neighborhood subgraph represented as a normalized Laplacian matrix.
We empirically evaluate our approach against several state-of-the-art node embedding techniques on a wide variety of real-world datasets.
arXiv Detail & Related papers (2020-05-19T20:42:43Z) - Investigating Extensions to Random Walk Based Graph Embedding [0.3867052484157571]
We propose a novel extension to random walk based graph embedding, which removes a percentage of least frequent nodes from the walks at different levels.
Through this removal, more distant nodes come to reside in the close neighborhood of a node, so their connection is represented explicitly.
The results indicate that extensions to random walk based methods (including our own) improve the predictive performance only slightly, if at all.
arXiv Detail & Related papers (2020-02-17T21:14:02Z)