Implicit models, latent compression, intrinsic biases, and cheap lunches
in community detection
- URL: http://arxiv.org/abs/2210.09186v7
- Date: Tue, 7 Nov 2023 21:14:31 GMT
- Title: Implicit models, latent compression, intrinsic biases, and cheap lunches
in community detection
- Authors: Tiago P. Peixoto, Alec Kirkley
- Abstract summary: Community detection aims to partition a network into clusters of nodes to summarize its large-scale structure.
Some community detection methods are inferential, explicitly deriving the clustering objective through a probabilistic generative model.
Other methods are descriptive, dividing a network according to an objective motivated by a particular application.
We present a solution that associates any community detection objective, inferential or descriptive, with its corresponding implicit network generative model.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The task of community detection, which aims to partition a network into
clusters of nodes to summarize its large-scale structure, has spawned the
development of many competing algorithms with varying objectives. Some
community detection methods are inferential, explicitly deriving the clustering
objective through a probabilistic generative model, while other methods are
descriptive, dividing a network according to an objective motivated by a
particular application, making it challenging to compare these methods on the
same scale. Here we present a solution to this problem that associates any
community detection objective, inferential or descriptive, with its
corresponding implicit network generative model. This allows us to compute the
description length of a network and its partition under arbitrary objectives,
providing a principled measure to compare the performance of different
algorithms without the need for "ground truth" labels. Our approach also gives
access to instances of the community detection problem that are optimal to any
given algorithm, and in this way reveals intrinsic biases in popular
descriptive methods, explaining their tendency to overfit. Using our framework,
we compare a number of community detection methods on artificial networks, and
on a corpus of over 500 structurally diverse empirical networks. We find that
more expressive community detection methods exhibit consistently superior
compression performance on structured data instances, without having degraded
performance on a minority of situations where more specialized algorithms
perform optimally. Our results undermine the implications of the "no free
lunch" theorem for community detection, both conceptually and in practice,
since it is confined to unstructured data instances, unlike relevant community
detection problems which are structured by requirement.
Related papers
- Enhancing Community Detection in Networks: A Comparative Analysis of Local Metrics and Hierarchical Algorithms [49.1574468325115]
This study employs the same method to evaluate the relevance of using local similarity metrics for community detection.
The efficacy of these metrics was evaluated by applying the base algorithm to several real networks with varying community sizes.
arXiv Detail & Related papers (2024-08-17T02:17:09Z) - Sifting out communities in large sparse networks [2.666294200266662]
We introduce an intuitive objective function for quantifying the quality of clustering results in large sparse networks.
We utilize a two-step method for identifying communities which is especially well-suited for this domain.
We identify complex genetic interactions in large-scale networks comprised of tens of thousands of nodes.
arXiv Detail & Related papers (2024-05-01T18:57:41Z) - Semi-supervised Community Detection via Structural Similarity Metrics [0.0]
We study a semi-supervised community detection problem in which the objective is to estimate the community label of a new node.
We propose an algorithm that computes a structural similarity metric' between the new node and each of the $K$ communities.
Our findings highlight, to the best of our knowledge, the first semi-supervised community detection algorithm that offers theoretical guarantees.
arXiv Detail & Related papers (2023-06-01T19:02:50Z) - Bayesian community detection for networks with covariates [16.230648949593153]
"Community detection" has arguably received the most attention in the scientific community.
We propose a block model with a co-dependent random partition prior.
Our model has the ability to learn the number of the communities via posterior inference without having to assume it to be known.
arXiv Detail & Related papers (2022-03-04T01:58:35Z) - Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z) - Learning to Detect Critical Nodes in Sparse Graphs via Feature Importance Awareness [53.351863569314794]
The critical node problem (CNP) aims to find a set of critical nodes from a network whose deletion maximally degrades the pairwise connectivity of the residual network.
This work proposes a feature importance-aware graph attention network for node representation.
It combines it with dueling double deep Q-network to create an end-to-end algorithm to solve CNP for the first time.
arXiv Detail & Related papers (2021-12-03T14:23:05Z) - Descriptive vs. inferential community detection in networks: pitfalls,
myths, and half-truths [0.0]
We argue that inferential methods are more typically aligned with clearer scientific questions, yield more robust results, and should be in many cases preferred.
We attempt to dispel some myths and half-truths often believed when community detection is employed in practice, in an effort to improve both the use of such methods as well as the interpretation of their results.
arXiv Detail & Related papers (2021-11-30T23:57:51Z) - Unsupervised Domain-adaptive Hash for Networks [81.49184987430333]
Domain-adaptive hash learning has enjoyed considerable success in the computer vision community.
We develop an unsupervised domain-adaptive hash learning method for networks, dubbed UDAH.
arXiv Detail & Related papers (2021-08-20T12:09:38Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - On the use of local structural properties for improving the efficiency
of hierarchical community detection methods [77.34726150561087]
We study how local structural network properties can be used as proxies to improve the efficiency of hierarchical community detection.
We also check the performance impact of network prunings as an ancillary tactic to make hierarchical community detection more efficient.
arXiv Detail & Related papers (2020-09-15T00:16:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.