Descriptive vs. inferential community detection in networks: pitfalls,
myths, and half-truths
- URL: http://arxiv.org/abs/2112.00183v7
- Date: Thu, 6 Jul 2023 12:20:05 GMT
- Title: Descriptive vs. inferential community detection in networks: pitfalls,
myths, and half-truths
- Authors: Tiago P. Peixoto
- Abstract summary: We argue that inferential methods are more typically aligned with clearer scientific questions, yield more robust results, and should be in many cases preferred.
We attempt to dispel some myths and half-truths often believed when community detection is employed in practice, in an effort to improve both the use of such methods as well as the interpretation of their results.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Community detection is one of the most important methodological fields of
network science, and one which has attracted a significant amount of attention
over the past decades. This area deals with the automated division of a network
into fundamental building blocks, with the objective of providing a summary of
its large-scale structure. Despite its importance and widespread adoption,
there is a noticeable gap between what is arguably the state-of-the-art and the
methods that are actually used in practice in a variety of fields. Here we
attempt to address this discrepancy by dividing existing methods according to
whether they have a "descriptive" or an "inferential" goal. While descriptive
methods find patterns in networks based on context-dependent notions of
community structure, inferential methods articulate generative models, and
attempt to fit them to data. In this way, they are able to provide insights
into the mechanisms of network formation, and separate structure from
randomness in a manner supported by statistical evidence. We review how
employing descriptive methods with inferential aims is riddled with pitfalls
and misleading answers, and thus should be in general avoided. We argue that
inferential methods are more typically aligned with clearer scientific
questions, yield more robust results, and should be in many cases preferred. We
attempt to dispel some myths and half-truths often believed when community
detection is employed in practice, in an effort to improve both the use of such
methods as well as the interpretation of their results.
Related papers
- Out-of-Distribution Detection on Graphs: A Survey [58.47395497985277]
Graph out-of-distribution (GOOD) detection focuses on identifying graph data that deviates from the distribution seen during training.
We categorize existing methods into four types: enhancement-based, reconstruction-based, information propagation-based, and classification-based approaches.
We discuss practical applications and theoretical foundations, highlighting the unique challenges posed by graph data.
arXiv Detail & Related papers (2025-02-12T04:07:12Z) - NLP Verification: Towards a General Methodology for Certifying Robustness [9.897538432223714]
Machine Learning (ML) has exhibited substantial success in the field of Natural Language Processing (NLP)
As these systems are increasingly integrated into real-world applications, ensuring their safety and reliability becomes a primary concern.
We propose a general methodology to analyse the effect of the embedding gap, a problem that refers to the discrepancy between verification of geometric subspaces and the semantic meaning of sentences.
arXiv Detail & Related papers (2024-03-15T09:43:52Z) - DARE: Towards Robust Text Explanations in Biomedical and Healthcare
Applications [54.93807822347193]
We show how to adapt attribution robustness estimation methods to a given domain, so as to take into account domain-specific plausibility.
Next, we provide two methods, adversarial training and FAR training, to mitigate the brittleness characterized by DARE.
Finally, we empirically validate our methods with extensive experiments on three established biomedical benchmarks.
arXiv Detail & Related papers (2023-07-05T08:11:40Z) - From Patches to Objects: Exploiting Spatial Reasoning for Better Visual
Representations [2.363388546004777]
We propose a novel auxiliary pretraining method that is based on spatial reasoning.
Our proposed method takes advantage of a more flexible formulation of contrastive learning by introducing spatial reasoning as an auxiliary task for discriminative self-supervised methods.
arXiv Detail & Related papers (2023-05-21T07:46:46Z) - Implicit models, latent compression, intrinsic biases, and cheap lunches
in community detection [0.0]
Community detection aims to partition a network into clusters of nodes to summarize its large-scale structure.
Some community detection methods are inferential, explicitly deriving the clustering objective through a probabilistic generative model.
Other methods are descriptive, dividing a network according to an objective motivated by a particular application.
We present a solution that associates any community detection objective, inferential or descriptive, with its corresponding implicit network generative model.
arXiv Detail & Related papers (2022-10-17T15:38:41Z) - Time to Focus: A Comprehensive Benchmark Using Time Series Attribution
Methods [4.9449660544238085]
The paper focuses on time series analysis and benchmark several state-of-the-art attribution methods.
The presented experiments involve gradient-based and perturbation-based attribution methods.
The findings accentuate that choosing the best-suited attribution method is strongly correlated with the desired use case.
arXiv Detail & Related papers (2022-02-08T10:06:13Z) - Unsupervised Domain-adaptive Hash for Networks [81.49184987430333]
Domain-adaptive hash learning has enjoyed considerable success in the computer vision community.
We develop an unsupervised domain-adaptive hash learning method for networks, dubbed UDAH.
arXiv Detail & Related papers (2021-08-20T12:09:38Z) - Triggering Failures: Out-Of-Distribution detection by learning from
local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet associated with a dedicated training scheme based on Local Adversarial Attacks (LAA)
We show it obtains top performances both in speed and accuracy when compared to ten recent methods of the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z) - A Comprehensive Survey on Community Detection with Deep Learning [93.40332347374712]
A community reveals the features and connections of its members that are different from those in other communities in a network.
This survey devises and proposes a new taxonomy covering different categories of the state-of-the-art methods.
The main category, i.e., deep neural networks, is further divided into convolutional networks, graph attention networks, generative adversarial networks and autoencoders.
arXiv Detail & Related papers (2021-05-26T14:37:07Z) - An Effective Baseline for Robustness to Distributional Shift [5.627346969563955]
Refraining from confidently predicting when faced with categories of inputs different from those seen during training is an important requirement for the safe deployment of deep learning systems.
We present a simple, but highly effective approach to deal with out-of-distribution detection that uses the principle of abstention.
arXiv Detail & Related papers (2021-05-15T00:46:11Z) - A Survey of Community Detection Approaches: From Statistical Modeling to
Deep Learning [95.27249880156256]
We develop and present a unified architecture of network community-finding methods.
We introduce a new taxonomy that divides the existing methods into two categories, namely probabilistic graphical model and deep learning.
We conclude with discussions of the challenges of the field and suggestions of possible directions for future research.
arXiv Detail & Related papers (2021-01-03T02:32:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.