Efficient Top-k s-Biplexes Search over Large Bipartite Graphs
- URL: http://arxiv.org/abs/2409.18473v1
- Date: Fri, 27 Sep 2024 06:23:29 GMT
- Title: Efficient Top-k s-Biplexes Search over Large Bipartite Graphs
- Authors: Zhenxiang Xu, Yiping Liu, Yi Zhou, Yimin Hao, Zhengren Wang,
- Abstract summary: In bipartite graph analysis, enumeration of $s$-biplexes is a fundamental problem.
In real-world data engineering, finding all $s-biplexes is neither necessary nor computationally affordable.
We propose MVBP, that breaks the simple $2n enumeration algorithm.
FastMVBP outperforms the benchmark algorithms by up to three orders of magnitude in several instances.
- Score: 4.507351209412066
- License:
- Abstract: In a bipartite graph, a subgraph is an $s$-biplex if each vertex of the subgraph is adjacent to all but at most $s$ vertices on the opposite set. The enumeration of $s$-biplexes from a given graph is a fundamental problem in bipartite graph analysis. However, in real-world data engineering, finding all $s$-biplexes is neither necessary nor computationally affordable. A more realistic problem is to identify some of the largest $s$-biplexes from the large input graph. We formulate the problem as the {\em top-$k$ $s$-biplex search (TBS) problem}, which aims to find the top-$k$ maximal $s$-biplexes with the most vertices, where $k$ is an input parameter. We prove that the TBS problem is NP-hard for any fixed $k\ge 1$. Then, we propose a branching algorithm, named MVBP, that breaks the simple $2^n$ enumeration algorithm. Furthermore, from a practical perspective, we investigate three techniques to improve the performance of MVBP: 2-hop decomposition, single-side bounds, and progressive search. Complexity analysis shows that the improved algorithm, named FastMVBP, has a running time $O^*(\gamma_s^{d_2})$, where $\gamma_s<2$, and $d_2$ is a parameter much smaller than the number of vertex in the sparse real-world graphs, e.g. $d_2$ is only $67$ in the AmazonRatings dataset which has more than $3$ million vertices. Finally, we conducted extensive experiments on eight real-world and synthetic datasets to demonstrate the empirical efficiency of the proposed algorithms. In particular, FastMVBP outperforms the benchmark algorithms by up to three orders of magnitude in several instances.
Related papers
- Graph Unfolding and Sampling for Transitory Video Summarization via Gershgorin Disc Alignment [48.137527345353625]
User-generated videos (UGVs) uploaded from mobile phones to social media sites like YouTube and TikTok are short and non-repetitive.
We summarize a transitory UGV into several discs in linear time via fast graph sampling based on Gershgorin disc alignment (GDA)
We show that our algorithm achieves comparable or better video summarization performance compared to state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2024-08-03T20:08:02Z) - The Computational Complexity of Finding Stationary Points in Non-Convex Optimization [53.86485757442486]
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of non-order but smooth objective functions is a computational problem.
We show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization but the converse is impossible.
arXiv Detail & Related papers (2023-10-13T14:52:46Z) - Fast Maximum $k$-Plex Algorithms Parameterized by Small Degeneracy Gaps [30.06993761032835]
The maximum $k$-plex problem is important but computationally challenging in applications such as graph mining and community detection.
We present an exact algorithm parameterized by $g_k(G)$, which has the worst-case running time in the size of the input graph and exponential in $g_k(G)$.
We further extend our discussion to an even smaller parameter $cg_k(G)$, the gap between the community-degeneracy bound and the size of the maximum $k$-plex.
arXiv Detail & Related papers (2023-06-23T01:28:24Z) - Correlation Clustering Algorithm for Dynamic Complete Signed Graphs: An
Index-based Approach [9.13755431537592]
In this paper, we reduce the complexity of approximating the correlation clustering problem from $O(mtimesleft( 2+ alpha (G) right)+n)$ to $O(m+n)$ for any given value of $varepsilon$ for a complete signed graph.
Our approach gives the same output as the original algorithm and makes it possible to implement the algorithm in a full dynamic setting.
arXiv Detail & Related papers (2023-01-01T10:57:36Z) - Listing Maximal k-Plexes in Large Real-World Graphs [21.79007529359561]
Listing dense subgraphs in large graphs plays a key task in varieties of network analysis applications like community detection.
In this paper, we continue the research line of listing all maximal $k$-plexes and maximal $k$-plexes of prescribed size.
Our first contribution is algorithm ListPlex that lists all maximal $k$-plexes in $O*(gammaD)$ time for each constant $k$, where $gamma$ is a value related to $k$ but strictly smaller than 2, and $D$ is the degeneracy of the graph that is far less
arXiv Detail & Related papers (2022-02-17T16:25:56Z) - Fast Graph Sampling for Short Video Summarization using Gershgorin Disc
Alignment [52.577757919003844]
We study the problem of efficiently summarizing a short video into several paragraphs, leveraging recent progress in fast graph sampling.
Experimental results show that our algorithm achieves comparable video summarization as state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2021-10-21T18:43:00Z) - Classical algorithms for Forrelation [2.624902795082451]
We study the forrelation problem: given a pair of $n$-bit Boolean functions $f$ and $g$, estimate the correlation between $f$ and the Fourier transform of $g$.
This problem is known to provide the largest possible quantum speedup in terms of its query complexity.
We show that the graph-based forrelation problem can be solved on a classical computer in time $O(n)$ for any bipartite graph.
arXiv Detail & Related papers (2021-02-13T17:25:41Z) - Hybrid Stochastic-Deterministic Minibatch Proximal Gradient:
Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We show that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model.
For loss factors, we prove that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Fixed-Support Wasserstein Barycenters: Computational Hardness and Fast
Algorithm [100.11971836788437]
We study the fixed-support Wasserstein barycenter problem (FS-WBP)
We develop a provably fast textitdeterministic variant of the celebrated iterative Bregman projection (IBP) algorithm, named textscFastIBP.
arXiv Detail & Related papers (2020-02-12T03:40:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.