Asymptotics of Network Embeddings Learned via Subsampling
- URL: http://arxiv.org/abs/2107.02363v4
- Date: Wed, 17 May 2023 15:18:53 GMT
- Title: Asymptotics of Network Embeddings Learned via Subsampling
- Authors: Andrew Davison and Morgane Austern
- Abstract summary: We encapsulate representation methods that use a subsampling approach, such as node2vec, into a single unifying framework.
This provides a theoretical foundation to understand what the embedding vectors represent and how well these methods perform on downstream tasks.
Notably, we observe that typically used loss functions may lead to shortcomings, such as a lack of Fisher consistency.
- Score: 4.23373349945751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Network data are ubiquitous in modern machine learning, with tasks of
interest including node classification, node clustering and link prediction. A
frequent approach begins by learning a Euclidean embedding of the network, to
which algorithms developed for vector-valued data are applied. For large
networks, embeddings are learned using stochastic gradient methods where the
sub-sampling scheme can be freely chosen. Despite the strong empirical
performance of such methods, they are not well understood theoretically. Our
work encapsulates representation methods using a subsampling approach, such as
node2vec, into a single unifying framework. We prove, under the assumption that
the graph is exchangeable, that the distribution of the learned embedding
vectors asymptotically decouples. Moreover, we characterize the asymptotic
distribution and provide rates of convergence, in terms of the latent
parameters, which include the choice of loss function and the embedding
dimension. This provides a theoretical foundation to understand what the
embedding vectors represent and how well these methods perform on downstream
tasks. Notably, we observe that typically used loss functions may lead to
shortcomings, such as a lack of Fisher consistency.
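To make the setting concrete, the sketch below implements one member of the family of methods the paper analyses: embedding vectors fitted by stochastic gradient descent on a node2vec-style logistic loss, where each gradient step uses a freshly subsampled set of positive (co-occurring) and negative (random) node pairs. The sampler, function names and hyperparameters are illustrative choices, not the authors' implementation; in particular the walk here is a simple first-order random walk and negatives are drawn uniformly rather than from a degree-based distribution.

```python
# Minimal sketch (not the authors' code) of a subsampling-based embedding method
# of the kind covered by the paper's framework: node2vec-style positive pairs
# from random walks, uniformly drawn negatives, and a logistic loss on inner
# products of embedding vectors, minimised by stochastic gradient descent.
import numpy as np

rng = np.random.default_rng(0)

def sample_pairs(adj, walk_len=10, window=2, n_neg=5):
    """Subsample positive (co-occurring on a walk) and negative (random) node pairs."""
    n = adj.shape[0]
    walk = [int(rng.integers(n))]
    for _ in range(walk_len - 1):
        nbrs = np.flatnonzero(adj[walk[-1]])
        if nbrs.size == 0:
            break
        walk.append(int(rng.choice(nbrs)))
    pos = [(walk[i], walk[j])
           for i in range(len(walk))
           for j in range(max(0, i - window), min(len(walk), i + window + 1))
           if i != j]
    neg = [(u, int(rng.integers(n))) for u, _ in pos for _ in range(n_neg)]
    return pos, neg

def sgd_embed(adj, dim=16, steps=2000, lr=0.025):
    """Fit one embedding vector per node by SGD on the subsampled logistic loss."""
    n = adj.shape[0]
    omega = 0.1 * rng.standard_normal((n, dim))
    for _ in range(steps):
        pos, neg = sample_pairs(adj)
        for (u, v), y in [(p, 1.0) for p in pos] + [(q, 0.0) for q in neg]:
            pr = 1.0 / (1.0 + np.exp(-omega[u] @ omega[v]))   # sigmoid of inner product
            gu, gv = (pr - y) * omega[v], (pr - y) * omega[u]  # logistic-loss gradients
            omega[u] -= lr * gu
            omega[v] -= lr * gv
    return omega
```

Given a dense 0/1 adjacency matrix `adj`, calling `sgd_embed(adj)` returns one embedding row per node. Under the paper's exchangeability assumption, it is the rows of such a matrix whose joint distribution asymptotically decouples, with a limiting distribution governed by the latent parameters, the choice of loss and the embedding dimension `dim`.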
Related papers
- Learning A Disentangling Representation For PU Learning [18.94726971543125]
We propose to learn a neural network-based data representation using a loss function that projects the unlabeled data into two clusters.
We conduct experiments on simulated PU data that demonstrate the improved performance of our proposed method compared to the current state-of-the-art approaches.
arXiv Detail & Related papers (2023-10-05T18:33:32Z) - Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z) - Federated Representation Learning via Maximal Coding Rate Reduction [109.26332878050374]
We propose a methodology to learn low-dimensional representations from a dataset that is distributed among several clients.
Our proposed method, which we refer to as FLOW, utilizes MCR2 as the objective of choice, resulting in representations that are both between-class discriminative and within-class compressible; a sketch of the MCR2 objective is given after this list.
arXiv Detail & Related papers (2022-10-01T15:43:51Z) - On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that the benefits of large learning rates can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
arXiv Detail & Related papers (2022-02-28T13:01:04Z) - Asymptotics of $\ell_2$ Regularized Network Embeddings [0.0]
A common approach to solving tasks, such as node classification or link prediction, begins by learning a Euclidean embedding of the nodes of the network.
For unsupervised random walk methods such as DeepWalk and node2vec, adding an $\ell_2$ penalty on the embedding vectors to the loss leads to improved downstream task performance (a minimal sketch of this penalty is given after this list).
arXiv Detail & Related papers (2022-01-05T16:33:14Z) - Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic rather than deterministic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z) - Scaling Up Graph Neural Networks Via Graph Coarsening [18.176326897605225]
Scalability of graph neural networks (GNNs) is one of the major challenges in machine learning.
In this paper, we propose to use graph coarsening for scalable training of GNNs.
We show that, by simply applying off-the-shelf coarsening methods, we can reduce the number of nodes by up to a factor of ten without a noticeable drop in classification accuracy.
arXiv Detail & Related papers (2021-06-09T15:46:17Z) - Embedding Propagation: Smoother Manifold for Few-Shot Classification [131.81692677836202]
We propose to use embedding propagation as an unsupervised non-parametric regularizer for manifold smoothing in few-shot classification.
We empirically show that embedding propagation yields a smoother embedding manifold.
We show that embedding propagation consistently improves the accuracy of the models in multiple semi-supervised learning scenarios by up to 16 percentage points.
arXiv Detail & Related papers (2020-03-09T13:51:09Z) - Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning [86.45526827323954]
Weakly-supervised semantic segmentation is a challenging task as no pixel-wise label information is provided for training.
We propose an iterative algorithm to learn such pairwise relations.
We show that the proposed algorithm performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2020-02-19T10:32:03Z) - Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks
Trained with the Logistic Loss [0.0]
Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks.
We analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations.
arXiv Detail & Related papers (2020-02-11T15:42:09Z)
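As noted in the federated representation learning entry above, here is a minimal numpy sketch of the maximal coding rate reduction (MCR2) objective as it is usually stated: expand the coding rate of the whole batch while compressing it within each class. The federated machinery of FLOW (client updates and aggregation) is omitted, and the function names and the value of eps are illustrative.

```python
# Illustrative numpy sketch of the MCR2 objective referenced in the FLOW entry
# above; the federated training loop itself is omitted.
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z) = 1/2 * logdet(I + d / (m * eps^2) * Z Z^T), with Z of shape (d, m)."""
    d, m = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (m * eps**2)) * (Z @ Z.T))
    return 0.5 * logdet

def mcr2(Z, labels, eps=0.5):
    """Coding rate of the whole batch minus the class-conditional coding rates."""
    m = Z.shape[1]
    delta = coding_rate(Z, eps)
    for c in np.unique(labels):
        Zc = Z[:, labels == c]                               # columns belonging to class c
        delta -= (Zc.shape[1] / m) * coding_rate(Zc, eps)
    return delta  # representations are learned by maximising this quantity
```

Maximising `mcr2` over an encoder's outputs pushes different classes apart (large overall coding rate) while keeping each class compact (small within-class coding rate), which is the between-class discriminative / within-class compressible trade-off described in the entry.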
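For the $\ell_2$ regularized network embeddings entry, the change relative to the subsampling sketch given after the abstract is small: an $\ell_2$ penalty $\lambda \lVert \omega_v \rVert^2$ on the embedding vectors acts as weight decay on the rows touched by each subsampled pair. The sketch below reuses `np`, `rng` and `sample_pairs` from that earlier block and makes the same illustrative assumptions; it is not the paper's code.

```python
# Same subsampled logistic loss as sgd_embed above, now with an l2 penalty
# lam * ||omega_v||^2 applied as weight decay on the rows touched by each pair.
def sgd_embed_l2(adj, dim=16, steps=2000, lr=0.025, lam=1e-3):
    n = adj.shape[0]
    omega = 0.1 * rng.standard_normal((n, dim))
    for _ in range(steps):
        pos, neg = sample_pairs(adj)
        for (u, v), y in [(p, 1.0) for p in pos] + [(q, 0.0) for q in neg]:
            pr = 1.0 / (1.0 + np.exp(-omega[u] @ omega[v]))
            gu = (pr - y) * omega[v] + 2.0 * lam * omega[u]  # loss gradient + l2 term
            gv = (pr - y) * omega[u] + 2.0 * lam * omega[v]
            omega[u] -= lr * gu
            omega[v] -= lr * gv
    return omega
```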