Related papers: Optimal Time Complexity Algorithms for Computing General Random Walk Graph Kernels on Sparse Graphs

URL: http://arxiv.org/abs/2410.10368v2
Date: Tue, 15 Oct 2024 14:12:43 GMT
Title: Optimal Time Complexity Algorithms for Computing General Random Walk Graph Kernels on Sparse Graphs
Authors: Krzysztof Choromanski, Isaac Reid, Arijit Sehanobish, Avinava Dubey,
Abstract summary: We present the first linear time complexity randomized algorithms for unbiased approximation of general random walk kernels (RWKs) for sparse graphs. Our method is up to $\mathbf{27\times}$ faster than its counterparts for efficient computation on large graphs.
Score: 14.049529046098607
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present the first linear time complexity randomized algorithms for unbiased approximation of the celebrated family of general random walk kernels (RWKs) for sparse graphs. This includes both labelled and unlabelled instances. The previous fastest methods for general RWKs were of cubic time complexity and not applicable to labelled graphs. Our method samples dependent random walks to compute novel graph embeddings in $\mathbb{R}^d$ whose dot product is equal to the true RWK in expectation. It does so without instantiating the direct product graph in memory, meaning we can scale to massive datasets that cannot be stored on a single machine. We derive exponential concentration bounds to prove that our estimator is sharp, and show that the ability to approximate general RWKs (rather than just special cases) unlocks efficient implicit graph kernel learning. Our method is up to $\mathbf{27\times}$ faster than its counterparts for efficient computation on large graphs and scales to graphs $\mathbf{128\times}$ bigger than the largest examples amenable to brute-force computation.
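For orientation, the brute-force baseline the abstract contrasts against can be sketched as follows. This is a minimal illustration, not the paper's sampling algorithm: it assumes the textbook geometric random walk kernel on unlabelled graphs with uniform start and stop weights, truncated at a maximum walk length, $k(G, H) \approx \sum_{k=0}^{K} \lambda^k \, \mathbf{1}^\top (A_G \otimes A_H)^k \mathbf{1}$. All function names are hypothetical. The second variant uses the identity $(A \otimes B)^k = A^k \otimes B^k$ to avoid building the product graph; that shortcut holds only for this special case, not for general RWKs.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]


def walk_counts(adj, max_len):
    """Return [1^T A^k 1 for k = 0..max_len] using repeated mat-vec products."""
    vec = [1.0] * len(adj)
    counts = [sum(vec)]
    for _ in range(max_len):
        vec = [sum(row[j] * vec[j] for j in range(len(vec))) for row in adj]
        counts.append(sum(vec))
    return counts


def rwk_product_graph(adj_g, adj_h, lam, max_len):
    """Truncated geometric RWK computed on the explicit direct product graph."""
    counts = walk_counts(kron(adj_g, adj_h), max_len)
    return sum(lam ** k * c for k, c in enumerate(counts))


def rwk_factorised(adj_g, adj_h, lam, max_len):
    """Same value without the product graph (unlabelled, uniform weights only)."""
    cg = walk_counts(adj_g, max_len)
    ch = walk_counts(adj_h, max_len)
    return sum(lam ** k * x * y for k, (x, y) in enumerate(zip(cg, ch)))


triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
k_brute = rwk_product_graph(triangle, path, lam=0.1, max_len=4)
k_fact = rwk_factorised(triangle, path, lam=0.1, max_len=4)
```

The explicit variant touches an $n_G n_H \times n_G n_H$ matrix, which is the memory cost the abstract says the proposed method avoids.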

Related papers

Fast online node labeling with graph subsampling [4.259367043722417]
Graph-based methods, such as node prediction, aim for computational efficiency regardless of graph size. In this paper, we consider an online subsampled APPR method, where messages are intentionally dropped at random. We use tools from graph sparsifiers and matrix linear algebra to give approximation bounds on the graph's spectral properties.
arXiv Detail & Related papers (2025-03-21T00:13:16Z)
Efficient Graph Matching for Correlated Stochastic Block Models [7.320365821066744]
We study learning problems on correlated block models with two balanced communities. Our main result gives the first efficient algorithm for graph matching in this setting. We extend this to an efficient algorithm for exact graph matching whenever this is information-theoretically possible.
arXiv Detail & Related papers (2024-12-03T18:36:45Z)
A Differentially Private Clustering Algorithm for Well-Clustered Graphs [6.523602840064548]
We provide an efficient $(\epsilon, \delta)$-DP algorithm tailored specifically for such graphs. Our algorithm works for well-clustered graphs with $k$ nearly-balanced clusters.
arXiv Detail & Related papers (2024-03-21T11:57:16Z)
Fast and Simple Spectral Clustering in Theory and Practice [7.070726553564701]
Spectral clustering is a popular and effective algorithm designed to find $k$ clusters in a graph $G$. We present a simple spectral clustering algorithm based on a vertex embedding with $O(\log(k))$ vectors computed by the power method. We evaluate the new algorithm on several synthetic and real-world datasets, finding that it is significantly faster than alternative clustering algorithms, while producing results with approximately the same clustering accuracy.
arXiv Detail & Related papers (2023-10-17T02:31:57Z)
General Graph Random Features [42.75616308187867]
We propose a novel random walk-based algorithm for unbiased estimation of arbitrary functions of a weighted adjacency matrix. Our algorithm enjoys subquadratic time complexity with respect to the number of nodes, overcoming the notoriously prohibitive cubic scaling of exact graph kernel evaluation.
arXiv Detail & Related papers (2023-10-07T15:47:31Z)
AnchorGAE: General Data Clustering via $O(n)$ Bipartite Graph Convolution [79.44066256794187]
We show how to convert a non-graph dataset into a graph by introducing the generative graph model, which is used to build graph convolution networks (GCNs). A bipartite graph constructed by anchors is updated dynamically to exploit the high-level information behind data. We theoretically prove that the simple update will lead to degeneration and a specific strategy is accordingly designed.
arXiv Detail & Related papers (2021-11-12T07:08:13Z)
Fast Graph Kernel with Optical Random Features [17.403133838762447]
The graphlet kernel suffers from a high computation cost due to the isomorphism test it includes. We propose to leverage kernel random features within the graphlet framework, and establish a theoretical link with a mean kernel metric.
arXiv Detail & Related papers (2020-10-16T09:43:47Z)
Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We show that HSDMPG can attain an $\mathcal{O}\big(1/\sqrt{n}\big)$ which is at the order of excess error on a learning model. For loss factors, we prove that HSDMPG can attain an $\mathcal{O}\big(1/\sqrt{n}\big)$ which is at the order of excess error on a learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z)
Online Dense Subgraph Discovery via Blurred-Graph Feedback [87.9850024070244]
We introduce a novel learning problem for dense subgraph discovery. We first propose an edge-time algorithm that obtains a nearly-optimal solution with high probability. We then design a more scalable algorithm with a theoretical guarantee.
arXiv Detail & Related papers (2020-06-24T11:37:33Z)
Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems. Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections. Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z)
Fast Graph Attention Networks Using Effective Resistance Based Graph Sparsification [70.50751397870972]
FastGAT is a method to make attention based GNNs lightweight by using spectral sparsification to generate an optimal pruning of the input graph. We experimentally evaluate FastGAT on several large real world graph datasets for node classification tasks.
arXiv Detail & Related papers (2020-06-15T22:07:54Z)
The Quantum Approximate Optimization Algorithm Needs to See the Whole Graph: A Typical Case [6.810856082577402]
The quantum circuit has p applications of a unitary operator that respects the locality of the graph. We focus on finding big independent sets in random graphs with dn/2 edges keeping d fixed and n large.
arXiv Detail & Related papers (2020-04-20T00:48:02Z)
Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs. We propose an approximative framework to such non-trivial ERGs that result in dyadic independence (i.e., edge independent) distributions. Our methods are scalable to sparse graphs consisting of millions of nodes.
arXiv Detail & Related papers (2020-02-14T11:42:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.