Accurate and Fast Estimation of Temporal Motifs using Path Sampling
- URL: http://arxiv.org/abs/2409.08975v2
- Date: Sun, 06 Oct 2024 18:27:55 GMT
- Title: Accurate and Fast Estimation of Temporal Motifs using Path Sampling
- Authors: Yunjie Pan, Omkar Bhalerao, C. Seshadhri, Nishil Talati,
- Abstract summary: Counting the number of small subgraphs, called motifs, is a fundamental problem in social network analysis and graph mining.
We propose an algorithm, TEACUPS, that addresses this problem using a novel technique of temporal path sampling.
For a Bitcoin graph with hundreds of millions of edges, TEACUPS runs in less than 1 minute, while the exact counting algorithm takes more than a day.
- Score: 5.114632024458956
- License:
- Abstract: Counting the number of small subgraphs, called motifs, is a fundamental problem in social network analysis and graph mining. Many real-world networks are directed and temporal, where edges have timestamps. Motif counting in directed, temporal graphs is especially challenging because there are a plethora of different kinds of patterns. Temporal motif counts reveal much richer information and there is a need for scalable algorithms for motif counting. A major challenge in counting is that there can be trillions of temporal motif matches even with a graph with only millions of vertices. Both the motifs and the input graphs can have multiple edges between two vertices, leading to a combinatorial explosion problem. Counting temporal motifs involving just four vertices is not feasible with current state-of-the-art algorithms. We design an algorithm, TEACUPS, that addresses this problem using a novel technique of temporal path sampling. We combine a path sampling method with carefully designed temporal data structures, to propose an efficient approximate algorithm for temporal motif counting. TEACUPS is an unbiased estimator with provable concentration behavior, which can be used to bound the estimation error. For a Bitcoin graph with hundreds of millions of edges, TEACUPS runs in less than 1 minute, while the exact counting algorithm takes more than a day. We empirically demonstrate the accuracy of TEACUPS on large datasets, showing an average of 30$\times$ speedup (up to 2000$\times$ speedup) compared to existing GPU-based exact counting methods while preserving high count estimation accuracy.
Related papers
- Representation Learning for Frequent Subgraph Mining [64.32430554934021]
Subgraph Pattern Miner (SPMiner) is a novel neural approach for finding frequent subgraphs in a large target graph.
For 5- and 6-node motifs, SPMiner can almost perfectly identify the most frequent motifs while being 100x faster than exact enumeration methods.
SPMiner can also reliably identify frequent 10-node motifs, which is well beyond the size limit of exact enumeration approaches.
arXiv Detail & Related papers (2024-02-22T08:11:22Z) - Scalable Pattern Matching in Computation Graphs [0.0]
Graph rewriting is a popular tool for optimisation and modification of graph expressions.
We propose a new solution to pattern matching in port graphs.
Our algorithm offers a 20x speedup over current implementations on a dataset of 10'000 real world patterns.
arXiv Detail & Related papers (2024-02-20T15:02:24Z) - Learning Graph Search Heuristics [48.83557172525969]
We present PHIL (Path Heuristic with Imitation Learning), a novel neural architecture and a training algorithm for discovering graph search and navigations from data.
Our function learns graph embeddings useful for inferring node distances, runs in constant time independent of graph sizes, and can be easily incorporated in an algorithm such as A* at test time.
Experiments show that PHIL reduces the number of explored nodes compared to state-of-the-art methods on benchmark datasets by 58.5% on average.
arXiv Detail & Related papers (2022-12-07T22:28:00Z) - Scalable Motif Counting for Large-scale Temporal Graphs [25.90869257290865]
We propose a scalable parallel framework for exactly counting temporal motifs in large-scale temporal graphs.
Based on the proposed counting algorithms, we design a hierarchical parallel framework that features both inter- and intra-node parallel strategies.
Experiments on sixteen real-world temporal graph datasets demonstrate the superiority and capability of our proposed framework.
arXiv Detail & Related papers (2022-04-20T05:41:38Z) - odeN: Simultaneous Approximation of Multiple Motif Counts in Large
Temporal Networks [7.025709586759655]
One of the main complications in studying temporal motifs is the large number of motifs that can be built.
We propose odeN, a sampling-based algorithm that provides an accurate approximation of all the counts of the motifs.
arXiv Detail & Related papers (2021-08-19T15:06:05Z) - Sketch-Based Anomaly Detection in Streaming Graphs [89.52200264469364]
Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges and subgraphs in an online manner?
Our method is the first streaming approach that incorporates dense subgraph search to detect graph anomalies in constant memory and time.
arXiv Detail & Related papers (2021-06-08T16:10:36Z) - Local Algorithms for Estimating Effective Resistance [26.54556748340991]
In this work, we design several emphlocal algorithms for estimating effective resistances on massive graphs.
Our main algorithm approximates the effective resistance between any vertices pair $s,t$ with an arbitrarily small additive error.
We perform an extensive empirical study on several benchmark datasets, validating our algorithms.
arXiv Detail & Related papers (2021-06-07T10:08:12Z) - Learnable Graph Matching: Incorporating Graph Partitioning with Deep
Feature Learning for Multiple Object Tracking [58.30147362745852]
Data association across frames is at the core of Multiple Object Tracking (MOT) task.
Existing methods mostly ignore the context information among tracklets and intra-frame detections.
We propose a novel learnable graph matching method to address these issues.
arXiv Detail & Related papers (2021-03-30T08:58:45Z) - Scalable Graph Neural Networks via Bidirectional Propagation [89.70835710988395]
Graph Neural Networks (GNN) is an emerging field for learning on non-Euclidean data.
This paper presents GBP, a scalable GNN that utilizes a localized bidirectional propagation process from both the feature vectors and the training/testing nodes.
An empirical study demonstrates that GBP achieves state-of-the-art performance with significantly less training/testing time.
arXiv Detail & Related papers (2020-10-29T08:55:33Z) - Efficient Sampling Algorithms for Approximate Temporal Motif Counting
(Extended Version) [24.33313864327473]
We propose a generic edge sampling (ES) algorithm for estimating the number of instances of any temporal motif.
We also devise an improved EWS algorithm that hybridizes edge sampling with wedge sampling for counting temporal motifs with 3 vertices and 3 edges.
Our algorithms have higher efficiency, better accuracy, and greater scalability than the state-of-the-art sampling method for temporal motif counting.
arXiv Detail & Related papers (2020-07-28T07:15:25Z) - Online Dense Subgraph Discovery via Blurred-Graph Feedback [87.9850024070244]
We introduce a novel learning problem for dense subgraph discovery.
We first propose a edge-time algorithm that obtains a nearly-optimal solution with high probability.
We then design a more scalable algorithm with a theoretical guarantee.
arXiv Detail & Related papers (2020-06-24T11:37:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.