Algorithm-Agnostic Explainability for Unsupervised Clustering
- URL: http://arxiv.org/abs/2105.08053v1
- Date: Mon, 17 May 2021 17:58:55 GMT
- Title: Algorithm-Agnostic Explainability for Unsupervised Clustering
- Authors: Charles A. Ellis, Mohammad S.E. Sendi, Sergey M. Plis, Robyn L.
Miller, and Vince D. Calhoun
- Abstract summary: We present two novel algorithm-agnostic explainability methods, global permutation percent change (G2PC) feature importance and local perturbation percent change (L2PC) feature importance.
We demonstrate the utility of the methods for explaining five popular clustering algorithms on low-dimensional, ground-truth synthetic datasets.
- Score: 19.375627480270627
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Supervised machine learning explainability has greatly expanded in recent
years. However, the field of unsupervised clustering explainability has lagged
behind. Here we demonstrate, to the best of our knowledge for the first time,
how model-agnostic methods for supervised machine learning explainability can
be adapted to provide algorithm-agnostic unsupervised clustering
explainability. We present two novel algorithm-agnostic explainability methods,
global permutation percent change (G2PC) feature importance and local
perturbation percent change (L2PC) feature importance, that can provide insight
into many clustering methods on a global level by identifying the relative
importance of features to a clustering algorithm and on a local level by
identifying the relative importance of features to the clustering of individual
samples. We demonstrate the utility of the methods for explaining five popular
clustering algorithms on low-dimensional, ground-truth synthetic datasets and
on high-dimensional functional network connectivity (FNC) data extracted from a
resting state functional magnetic resonance imaging (rs-fMRI) dataset of 151
subjects with schizophrenia (SZ) and 160 healthy controls (HC). Our proposed
explainability methods robustly identify the relative importance of features
across multiple clustering methods and could facilitate new insights into many
applications. We hope that this study will greatly accelerate the development
of the field of clustering explainability.
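The abstract's G2PC idea can be sketched as follows: cluster the data, then for each feature permute its values across samples and measure the percent of samples whose cluster assignment changes. This is a hypothetical illustration based only on the abstract, not the authors' released code; the clustering model (`KMeans`), dataset, and repeat count are all assumptions for demonstration.

```python
# Hypothetical G2PC-style sketch (assumption: follows the abstract's idea of
# permuting each feature and measuring the percent change in cluster
# assignments; the paper's exact procedure may differ).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic two-cluster data in which only feature 0 is informative.
X = np.vstack([
    np.column_stack([rng.normal(-3, 1, 200), rng.normal(0, 1, 200)]),
    np.column_stack([rng.normal(3, 1, 200), rng.normal(0, 1, 200)]),
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
baseline = km.predict(X)

def g2pc(model, X, labels, n_repeats=20, rng=rng):
    """Global permutation percent change: for each feature, permute it
    across all samples and report the mean fraction of cluster
    assignments that change relative to the baseline clustering."""
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        changes = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            changes.append(np.mean(model.predict(Xp) != labels))
        importances[j] = np.mean(changes)
    return importances

imp = g2pc(km, X, baseline)
# The informative feature 0 should show a far larger percent change
# than the noise feature 1.
```

The local L2PC variant described in the abstract would apply the same percent-change logic per sample, repeatedly perturbing one feature of a single sample and tracking how often that sample's assignment flips.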
Related papers
- Counterfactual Explanations for Clustering Models [11.40145394568897]
Clustering algorithms rely on complex optimisation processes that may be difficult to comprehend.
We propose a new, model-agnostic technique for explaining clustering algorithms with counterfactual statements.
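One simple way to realize a counterfactual statement for a clustering model is to find the smallest change to a sample that flips its cluster assignment. This is a generic hypothetical sketch, not the cited paper's technique; the line search toward the target centroid and the `KMeans` model are illustrative assumptions.

```python
# Hypothetical counterfactual sketch for a clustering model (assumption:
# not the cited paper's method) — move a sample along the segment toward
# the target cluster's centroid until its assignment flips.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-3, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

def counterfactual(model, x, target):
    """Line-search from x toward the target cluster's centroid for the
    smallest step alpha at which the predicted cluster becomes target."""
    c = model.cluster_centers_[target]
    for alpha in np.linspace(0.0, 1.0, 101):
        x_cf = (1 - alpha) * x + alpha * c
        if model.predict(x_cf.reshape(1, -1))[0] == target:
            return x_cf, alpha
    return c, 1.0

x = X[0]                               # a sample from one cluster
src = km.predict(x.reshape(1, -1))[0]  # its current assignment
x_cf, alpha = counterfactual(km, x, target=1 - src)
# x_cf is the counterfactual; alpha measures how far x had to move.
```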
arXiv Detail & Related papers (2024-09-19T10:05:58Z)
- NeurCAM: Interpretable Neural Clustering via Additive Models [3.4437947384641037]
Interpretable clustering algorithms aim to group similar data points while explaining the obtained groups.
We introduce the Neural Clustering Additive Model (NeurCAM), a novel approach to the interpretable clustering problem.
Our approach significantly outperforms other interpretable clustering approaches when clustering on text data.
arXiv Detail & Related papers (2024-08-23T20:32:57Z)
- Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering method based on semi-non-negative tensor factorization (Semi-NTF).
Our model directly considers the between-view relationship and exploits the between-view complementary information.
In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- A Modular Framework for Centrality and Clustering in Complex Networks [0.6423239719448168]
In this paper, we study two important such network analysis techniques, namely, centrality and clustering.
An information-flow based model is adopted for clustering, which itself builds upon an information theoretic measure for computing centrality.
Our clustering naturally inherits the flexibility to accommodate edge directionality, as well as different interpretations and interplay between edge weights and node degrees.
arXiv Detail & Related papers (2021-11-23T03:01:29Z)
- Deep Attention-guided Graph Clustering with Dual Self-supervision [49.040136530379094]
We propose a novel method, namely deep attention-guided graph clustering with dual self-supervision (DAGC).
We develop a dual self-supervision solution consisting of a soft self-supervision strategy with a triplet Kullback-Leibler divergence loss and a hard self-supervision strategy with a pseudo supervision loss.
Our method consistently outperforms state-of-the-art methods on six benchmark datasets.
arXiv Detail & Related papers (2021-11-10T06:53:03Z)
- Fast and Interpretable Consensus Clustering via Minipatch Learning [0.0]
We develop IMPACC: Interpretable MiniPatch Adaptive Consensus Clustering.
We develop adaptive sampling schemes for observations, which result in both improved reliability and computational savings.
Results show that our approach yields more accurate and interpretable cluster solutions.
arXiv Detail & Related papers (2021-10-05T22:39:28Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- A semi-supervised sparse K-Means algorithm [3.04585143845864]
An unsupervised sparse clustering method can be employed in order to detect the subgroup of features necessary for clustering.
A semi-supervised method can use the labelled data to create constraints and enhance the clustering solution.
We show that the algorithm matches the high performance of other semi-supervised algorithms while preserving the ability to distinguish informative from uninformative features.
arXiv Detail & Related papers (2020-03-16T02:05:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.