Reinforcement Learning Enhanced Weighted Sampling for Accurate Subgraph
Counting on Fully Dynamic Graph Streams
- URL: http://arxiv.org/abs/2211.06793v1
- Date: Sun, 13 Nov 2022 03:01:34 GMT
- Title: Reinforcement Learning Enhanced Weighted Sampling for Accurate Subgraph
Counting on Fully Dynamic Graph Streams
- Authors: Kaixin Wang, Cheng Long, Da Yan, Jie Zhang, H. V. Jagadish
- Abstract summary: We propose a weighted sampling algorithm called WSD for estimating the subgraph count in a fully dynamic graph stream.
We determine the weights of edges in a data-driven fashion, using a novel method based on reinforcement learning.
- Score: 35.943447765433774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the popularity of graph data increases, there is a growing need to count
the occurrences of subgraph patterns of interest, for a variety of
applications. Many graphs are massive in scale and also fully dynamic (with
insertions and deletions of edges), rendering exact computation of these counts
to be infeasible. Common practice is, instead, to use a small set of edges as a
sample to estimate the counts. Existing sampling algorithms for fully dynamic
graphs sample the edges with uniform probability. In this paper, we show that
we can do much better if we sample edges based on their individual properties.
Specifically, we propose a weighted sampling algorithm called WSD for
estimating the subgraph count in a fully dynamic graph stream, which samples
the edges based on their weights that indicate their importance and reflect
their properties. We determine the weights of edges in a data-driven fashion,
using a novel method based on reinforcement learning. We conduct extensive
experiments to verify that our technique can produce estimates with smaller
errors while often running faster compared with existing algorithms.
Related papers
- Sparse Training of Discrete Diffusion Models for Graph Generation [45.103518022696996]
We introduce SparseDiff, a novel diffusion model based on the observation that almost all large graphs are sparse.
By selecting a subset of edges, SparseDiff effectively leverages sparse graph representations both during the noising process and within the denoising network.
Our model demonstrates state-of-the-art performance across multiple metrics on both small and large datasets.
arXiv Detail & Related papers (2023-11-03T16:50:26Z) - Sampling Enclosing Subgraphs for Link Prediction [2.1270496914042996]
Link prediction is a fundamental problem for graph-structured data computation.
This paper presents a scalable solution, that we call ScaLed, which utilizes sparse enclosing subgraphs to make predictions.
By leveraging the smaller sampled subgraph, ScaLed can scale to larger graphs with much less overhead while maintaining high accuracy.
arXiv Detail & Related papers (2022-06-23T22:48:44Z) - Ordered Subgraph Aggregation Networks [19.18478955240166]
Subgraph-enhanced graph neural networks (GNNs) have emerged, provably boosting the expressive power of standard (message-passing) GNNs.
Here, we introduce a theoretical framework and extend the known expressivity results of subgraph-enhanced GNNs.
We show that increasing subgraph size always increases the expressive power and develop a better understanding of their limitations.
arXiv Detail & Related papers (2022-06-22T15:19:34Z) - Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have been shown powerful capacity at modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z) - Sketch-Based Anomaly Detection in Streaming Graphs [89.52200264469364]
Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges and subgraphs in an online manner?
Our method is the first streaming approach that incorporates dense subgraph search to detect graph anomalies in constant memory and time.
arXiv Detail & Related papers (2021-06-08T16:10:36Z) - Cherry-Picking Gradients: Learning Low-Rank Embeddings of Visual Data
via Differentiable Cross-Approximation [53.95297550117153]
We propose an end-to-end trainable framework that processes large-scale visual data tensors by looking emphat a fraction of their entries only.
The proposed approach is particularly useful for large-scale multidimensional grid data, and for tasks that require context over a large receptive field.
arXiv Detail & Related papers (2021-05-29T08:39:57Z) - EGL++: Extending Expected Gradient Length to Active Learning for Human
Pose Estimation [2.0305676256390934]
State of the art human pose estimation models rely on large quantities of labelled data for robust performance.
EGL++ is a novel algorithm that extends expected gradient length to tasks where discrete labels are not available.
arXiv Detail & Related papers (2021-04-19T17:56:59Z) - Online Dense Subgraph Discovery via Blurred-Graph Feedback [87.9850024070244]
We introduce a novel learning problem for dense subgraph discovery.
We first propose a edge-time algorithm that obtains a nearly-optimal solution with high probability.
We then design a more scalable algorithm with a theoretical guarantee.
arXiv Detail & Related papers (2020-06-24T11:37:33Z) - Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolution Networks (GCNs)
These sampling algorithms are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT)
arXiv Detail & Related papers (2020-06-10T12:48:37Z) - Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs.
We propose an approximative framework to such non-trivial ERGs that result in dyadic independence (i.e., edge independent) distributions.
Our methods are scalable to sparse graphs consisting of millions of nodes.
arXiv Detail & Related papers (2020-02-14T11:42:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.