Robust Hierarchical Clustering for Directed Networks: An Axiomatic
Approach
- URL: http://arxiv.org/abs/2108.07247v1
- Date: Mon, 16 Aug 2021 17:28:21 GMT
- Title: Robust Hierarchical Clustering for Directed Networks: An Axiomatic
Approach
- Authors: Gunnar Carlsson, Facundo Mémoli, Santiago Segarra
- Abstract summary: We provide a complete taxonomic characterization of robust hierarchical clustering methods for directed networks.
We introduce three practical properties associated with robustness in hierarchical clustering: linear scale preservation, stability, and excisiveness.
We also address the implementation of our methods and describe an application to real data.
- Score: 13.406858660972551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We provide a complete taxonomic characterization of robust hierarchical
clustering methods for directed networks following an axiomatic approach. We
begin by introducing three practical properties associated with the notion of
robustness in hierarchical clustering: linear scale preservation, stability,
and excisiveness. Linear scale preservation enforces imperviousness to changes
in units of measure, whereas stability ensures that a bounded perturbation in
the input network entails a bounded perturbation in the clustering output.
Excisiveness refers to the local consistency of the clustering outcome.
Algorithmically, excisiveness implies that we can reduce computational
complexity by only clustering a subset of our data while theoretically
guaranteeing that the same hierarchical outcome would be observed when
clustering the whole dataset. In parallel to these three properties, we
introduce the concept of representability, a generative model for describing
clustering methods through the specification of their action on a collection of
networks. Our main result is to leverage this generative model to give a
precise characterization of all robust -- i.e., excisive, linear scale
preserving, and stable -- hierarchical clustering methods for directed
networks. We also address the implementation of our methods and describe an
application to real data.
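To make the three properties concrete, here is a minimal numerical sketch. It assumes reciprocal clustering (symmetrize by the max, then take min-max chain costs), one instance of the family of methods studied in the authors' earlier work on asymmetric networks; the toy network and resolution choice are illustrative only.
```python
import numpy as np

def reciprocal_ultrametric(A):
    """Reciprocal clustering of a directed network: symmetrize the
    dissimilarity matrix by the max, then take min-max chain costs
    (the subdominant ultrametric, via a Floyd-Warshall pass)."""
    U = np.maximum(A, A.T)
    for k in range(len(U)):
        U = np.minimum(U, np.maximum(U[:, [k]], U[[k], :]))
    np.fill_diagonal(U, 0.0)
    return U

rng = np.random.default_rng(0)
A = rng.uniform(1.0, 10.0, size=(8, 8))   # toy asymmetric dissimilarities
np.fill_diagonal(A, 0.0)
U = reciprocal_ultrametric(A)

# Linear scale preservation: rescaling all dissimilarities by alpha
# rescales every merge resolution in the output by alpha.
alpha = 3.7
assert np.allclose(reciprocal_ultrametric(alpha * A), alpha * U)

# Excisiveness: re-clustering the sub-network spanned by one cluster
# reproduces the original dendrogram restricted to that cluster.
delta = np.sort(U[0])[4]           # a resolution with a nontrivial block
B = np.flatnonzero(U[0] <= delta)  # the cluster containing node 0
assert np.allclose(reciprocal_ultrametric(A[np.ix_(B, B)]), U[np.ix_(B, B)])
```
The second assertion is the algorithmic payoff described in the abstract: the sub-network spanned by a cluster can be clustered in isolation without changing the hierarchical outcome.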
Related papers
- CoHiRF: A Scalable and Interpretable Clustering Framework for High-Dimensional Data [0.30723404270319693]
We propose Consensus Hierarchical Random Feature (CoHiRF), a novel clustering method designed to address the challenges of clustering high-dimensional data.
CoHiRF leverages random feature selection to mitigate noise and dimensionality effects, repeatedly applies K-Means clustering in reduced feature spaces, and combines results through a unanimous consensus criterion.
CoHiRF is computationally efficient, with a running time comparable to K-Means; it scales to massive datasets and exhibits robust performance against state-of-the-art methods such as SC-SRGF, HDBSCAN, and OPTICS.
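A rough sketch of that pipeline as described, assuming scikit-learn; the repeat count, subset size, and cluster count are illustrative parameters, not CoHiRF's actual interface:
```python
import numpy as np
from sklearn.cluster import KMeans

def cohirf_like(X, n_repeats=10, n_feats=5, k=4, seed=0):
    """Sketch: repeated K-Means on random feature subsets, merged by a
    unanimous-consensus rule (samples cluster together only if they
    received the same label in every repetition)."""
    rng = np.random.default_rng(seed)
    labels = np.empty((X.shape[0], n_repeats), dtype=int)
    for r in range(n_repeats):
        cols = rng.choice(X.shape[1], size=n_feats, replace=False)
        labels[:, r] = KMeans(n_clusters=k, n_init=10,
                              random_state=r).fit_predict(X[:, cols])
    # Identical label sequences across all repetitions -> one consensus cluster.
    return np.unique(labels, axis=0, return_inverse=True)[1].ravel()
```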
arXiv Detail & Related papers (2025-02-01T09:38:44Z)
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
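The summary is high level; as a generic stand-in for "manifold learning plus K-means" (not the paper's actual framework), a Laplacian-eigenmap embedding followed by K-Means would look like:
```python
import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
# Laplacian-eigenmap embedding of the k-NN graph, then K-Means on top.
Z = SpectralEmbedding(n_components=2, n_neighbors=10).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)
```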
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship between real and generated samples.
Second, we design a self-supervised metric learning scheme to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z)
- Unfolding ADMM for Enhanced Subspace Clustering of Hyperspectral Images [43.152314090830174]
We introduce an innovative clustering architecture for hyperspectral images (HSI) by unfolding an iterative solver based on the Alternating Direction Method of Multipliers (ADMM) for sparse subspace clustering.
Our approach captures the structural characteristics of HSI data well by employing the k-nearest-neighbors algorithm as part of a structure-preservation module.
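For orientation, the classical (non-unfolded) ADMM iteration for sparse subspace clustering that such an architecture unrolls can be sketched as follows; the penalty weights and iteration count are illustrative assumptions:
```python
import numpy as np

def ssc_admm(X, lam=10.0, rho=1.0, n_iter=200):
    """Plain ADMM for sparse subspace clustering:
    min ||C||_1 + lam/2 ||X - XC||_F^2  s.t.  diag(C) = 0,
    where the columns of X are the data points."""
    n = X.shape[1]
    XtX = X.T @ X
    lhs = lam * XtX + rho * np.eye(n)     # fixed linear system for the A-step
    C = np.zeros((n, n))
    U = np.zeros((n, n))                  # scaled dual variable
    for _ in range(n_iter):
        A = np.linalg.solve(lhs, lam * XtX + rho * (C - U))
        C = np.sign(A + U) * np.maximum(np.abs(A + U) - 1.0 / rho, 0.0)
        np.fill_diagonal(C, 0.0)          # forbid self-representation
        U += A - C
    return np.abs(C) + np.abs(C).T        # symmetric affinity for a spectral step
```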
arXiv Detail & Related papers (2024-04-10T15:51:46Z)
- Deep Embedding Clustering Driven by Sample Stability [16.53706617383543]
We propose a deep embedding clustering algorithm driven by sample stability (DECS).
Specifically, we start by constructing the initial feature space with an autoencoder and then learn the cluster-oriented embedding feature constrained by sample stability.
The experimental results on five datasets illustrate that the proposed method achieves superior performance compared to state-of-the-art clustering approaches.
arXiv Detail & Related papers (2024-01-29T09:19:49Z)
- A Generalized Framework for Predictive Clustering and Optimization [18.06697544912383]
Clustering is a powerful and extensively used data science tool.
In this article, we define a generalized optimization framework for predictive clustering.
We also present a joint optimization strategy that exploits mixed-integer linear programming (MILP) for global optimization.
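As a toy illustration of MILP-based clustering (a p-median formulation, not the paper's predictive-clustering objective), assuming the PuLP package:
```python
import numpy as np
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, PULP_CBC_CMD

def milp_medoids(D, k):
    """p-median MILP: open k medoids and assign every point to an open
    medoid so that the total assignment distance is minimized.
    D is an n x n pairwise-distance matrix."""
    n = len(D)
    prob = LpProblem("p_median", LpMinimize)
    y = LpVariable.dicts("open", range(n), cat="Binary")
    x = LpVariable.dicts("assign", (range(n), range(n)), cat="Binary")
    prob += lpSum(D[i][j] * x[i][j] for i in range(n) for j in range(n))
    for i in range(n):
        prob += lpSum(x[i][j] for j in range(n)) == 1  # each point assigned once
        for j in range(n):
            prob += x[i][j] <= y[j]                    # only to open medoids
    prob += lpSum(y[j] for j in range(n)) == k
    prob.solve(PULP_CBC_CMD(msg=0))
    return [j for j in range(n) if y[j].value() == 1]
```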
arXiv Detail & Related papers (2023-05-07T19:56:51Z)
- Perfect Spectral Clustering with Discrete Covariates [68.8204255655161]
We propose a spectral algorithm that achieves perfect clustering with high probability on a class of large, sparse networks.
Our method is the first to offer a guarantee of consistent latent structure recovery using spectral clustering.
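The covariate handling is the paper's contribution and is not reproduced here; for orientation, plain adjacency spectral clustering of a symmetric network is:
```python
import numpy as np
from sklearn.cluster import KMeans

def adjacency_spectral_clustering(A, k):
    """Embed nodes with the k leading eigenvectors of the (symmetric)
    adjacency matrix, then run K-Means on the rows."""
    vals, vecs = np.linalg.eigh(A)
    lead = np.argsort(np.abs(vals))[::-1][:k]   # eigenvalues by magnitude
    return KMeans(n_clusters=k, n_init=10,
                  random_state=0).fit_predict(vecs[:, lead])
```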
arXiv Detail & Related papers (2022-05-17T01:41:06Z)
- Learning the Precise Feature for Cluster Assignment [39.320210567860485]
We propose a framework which integrates representation learning and clustering into a single pipeline for the first time.
The proposed framework exploits the powerful ability of recently developed generative models for learning intrinsic features.
Experimental results show that the performance of the proposed method is superior, or at least comparable, to that of state-of-the-art methods.
arXiv Detail & Related papers (2021-06-11T04:08:54Z)
- Towards Uncovering the Intrinsic Data Structures for Unsupervised Domain Adaptation using Structurally Regularized Deep Clustering [119.88565565454378]
Unsupervised domain adaptation (UDA) aims to learn classification models that make predictions for unlabeled data on a target domain.
We propose a hybrid model of Structurally Regularized Deep Clustering, which integrates the regularized discriminative clustering of target data with a generative one.
Our proposed H-SRDC outperforms all the existing methods under both the inductive and transductive settings.
arXiv Detail & Related papers (2020-12-08T08:52:00Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
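For contrast, the standard quadratic-cost agglomerative baseline that such methods aim to scale beyond is a few lines of SciPy:
```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Z = linkage(X, method="average")   # O(n^2) cost: fine here, not at billions
labels = fcluster(Z, t=4, criterion="maxclust")
```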
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
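For reference, a minimal kernel k-means, the procedure the model is said to match, can be sketched as follows; the kernel matrix is assumed precomputed by the caller:
```python
import numpy as np

def kernel_kmeans(K, k, n_iter=50, seed=0):
    """Lloyd-style kernel k-means on a precomputed kernel matrix K,
    using ||phi(x_i) - mu_c||^2 expanded in kernel evaluations."""
    n = len(K)
    labels = np.random.default_rng(seed).integers(0, k, n)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue  # empty cluster: leave its distances at infinity
            dist[:, c] = (np.diag(K) - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new = dist.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels
```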
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all generated summaries) and is not responsible for any consequences of its use.