Jaccard-constrained dense subgraph discovery
- URL: http://arxiv.org/abs/2308.15936v1
- Date: Wed, 30 Aug 2023 10:33:02 GMT
- Title: Jaccard-constrained dense subgraph discovery
- Authors: Chamalee Wickrama Arachchi and Nikolaj Tatti
- Abstract summary: We search for dense subgraphs that have large pairwise Jaccard similarity coefficients.
We show experimentally that our algorithms are efficient, they can find ground truth in synthetic datasets and provide interpretable results from real-world datasets.
- Score: 2.4511852052357628
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Finding dense subgraphs is a core problem in graph mining with many
applications in diverse domains. At the same time many real-world networks vary
over time, that is, the dataset can be represented as a sequence of graph
snapshots. Hence, it is natural to consider the question of finding dense
subgraphs in a temporal network that are allowed to vary over time to a certain
degree. In this paper, we search for dense subgraphs that have large pairwise
Jaccard similarity coefficients. More formally, given a set of graph snapshots
and a weight $\lambda$, we find a collection of dense subgraphs such that the
sum of densities of the induced subgraphs plus the sum of Jaccard indices,
weighted by $\lambda$, is maximized. We prove that this problem is NP-hard. To
discover dense subgraphs with good objective value, we present an iterative
algorithm which runs in $\mathcal{O}(n^2k^2 + m \log n + k^3 n)$ time per
single iteration, and a greedy algorithm which runs in $\mathcal{O}(n^2k^2 + m
\log n + k^3 n)$ time, where $k$ is the length of the graph sequence and $n$
and $m$ denote number of nodes and total number of edges respectively. We show
experimentally that our algorithms are efficient, they can find ground truth in
synthetic datasets and provide interpretable results from real-world datasets.
Finally, we present a case study that shows the usefulness of our problem.
Related papers
- On the Unlikelihood of D-Separation [69.62839677485087]
We provide analytic evidence that on large graphs, d-separation is a rare phenomenon, even when guaranteed to exist.
For the PC Algorithm, while it is known that its worst-case guarantees fail on non-sparse graphs, we show that the same is true for the average case.
For UniformSGS, while it is known that the running time is exponential for existing edges, we show that in the average case, that is the expected running time for most non-existing edges as well.
arXiv Detail & Related papers (2023-03-10T00:11:18Z) - Efficient Graph Field Integrators Meet Point Clouds [59.27295475120132]
We present two new classes of algorithms for efficient field integration on graphs encoding point clouds.
The first class, SeparatorFactorization(SF), leverages the bounded genus of point cloud mesh graphs, while the second class, RFDiffusion(RFD), uses popular epsilon-nearest-neighbor graph representations for point clouds.
arXiv Detail & Related papers (2023-02-02T08:33:36Z) - Correlation Clustering Algorithm for Dynamic Complete Signed Graphs: An
Index-based Approach [9.13755431537592]
In this paper, we reduce the complexity of approximating the correlation clustering problem from $O(mtimesleft( 2+ alpha (G) right)+n)$ to $O(m+n)$ for any given value of $varepsilon$ for a complete signed graph.
Our approach gives the same output as the original algorithm and makes it possible to implement the algorithm in a full dynamic setting.
arXiv Detail & Related papers (2023-01-01T10:57:36Z) - Fast Graph Sampling for Short Video Summarization using Gershgorin Disc
Alignment [52.577757919003844]
We study the problem of efficiently summarizing a short video into several paragraphs, leveraging recent progress in fast graph sampling.
Experimental results show that our algorithm achieves comparable video summarization as state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2021-10-21T18:43:00Z) - Random Subgraph Detection Using Queries [29.192695995340653]
The planted densest subgraph detection problem refers to the task of testing whether in a given (random) graph there is a subgraph that is unusually dense.
In this paper, we consider a natural variant of the above problem, where one can only observe a relatively small part of the graph using adaptive edge queries.
For this model, we determine the number of queries necessary and sufficient (accompanied with a quasi-polynomial optimal algorithm) for detecting the presence of the planted subgraph.
arXiv Detail & Related papers (2021-10-02T07:41:17Z) - Sketch-Based Anomaly Detection in Streaming Graphs [89.52200264469364]
Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges and subgraphs in an online manner?
Our method is the first streaming approach that incorporates dense subgraph search to detect graph anomalies in constant memory and time.
arXiv Detail & Related papers (2021-06-08T16:10:36Z) - The Generalized Mean Densest Subgraph Problem [30.33731479053404]
We introduce a new family of dense subgraph objectives, parameterized by a single parameter $p$.
Our objective captures both the standard densest subgraph problem and the maximum $k$-core as special cases.
A major contribution of our work is analyzing the performance of different types of peeling algorithms for dense subgraphs both in theory and practice.
arXiv Detail & Related papers (2021-06-02T02:58:35Z) - Random Graph Matching with Improved Noise Robustness [2.294014185517203]
We propose a new algorithm for graph matching under probabilistic models.
Our algorithm recovers the underlying matching with high probability when $alpha le 1 / (log log n)C$.
This improves the condition $alpha le 1 / (log n)C$ achieved in previous work.
arXiv Detail & Related papers (2021-01-28T02:39:27Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Online Dense Subgraph Discovery via Blurred-Graph Feedback [87.9850024070244]
We introduce a novel learning problem for dense subgraph discovery.
We first propose a edge-time algorithm that obtains a nearly-optimal solution with high probability.
We then design a more scalable algorithm with a theoretical guarantee.
arXiv Detail & Related papers (2020-06-24T11:37:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.