Explainable Clustering via Exemplars: Complexity and Efficient
Approximation Algorithms
- URL: http://arxiv.org/abs/2209.09670v1
- Date: Tue, 20 Sep 2022 12:09:51 GMT
- Title: Explainable Clustering via Exemplars: Complexity and Efficient
Approximation Algorithms
- Authors: Ian Davidson, Michael Livanos, Antoine Gourru, Peter Walker, Julien
Velcin and S. S. Ravi
- Abstract summary: We propose an explainable-by-design clustering approach that finds clusters and exemplars to explain each cluster.
The use of exemplars for understanding is supported by the exemplar-based school of concept definition in psychology.
We show that finding a small set of exemplars to explain even a single cluster is computationally intractable.
- Score: 30.369731369945296
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Explainable AI (XAI) is an important developing area but remains relatively
understudied for clustering. We propose an explainable-by-design clustering
approach that not only finds clusters but also exemplars to explain each
cluster. The use of exemplars for understanding is supported by the
exemplar-based school of concept definition in psychology. We show that finding
a small set of exemplars to explain even a single cluster is computationally
intractable; hence, the overall problem is challenging. We develop an
approximation algorithm that provides provable performance guarantees with
respect to clustering quality as well as the number of exemplars used. This
basic algorithm explains all the instances in every cluster, whilst another
approximation algorithm uses a bounded number of exemplars to allow simpler
explanations and provably covers a large fraction of all the instances.
Experimental results show that our work is useful in domains involving
difficult-to-understand deep embeddings of images and text.
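The covering flavor of the problem suggests a natural set-cover-style greedy heuristic: repeatedly pick the candidate exemplar that explains the most still-unexplained instances. The sketch below is our own illustration, not necessarily the paper's algorithm; the function name and the fixed distance threshold `radius` are assumptions. Classical set-cover analysis gives such greedy covers a logarithmic approximation factor.

```python
import numpy as np

def greedy_exemplars(X, radius):
    """Pick exemplars for one cluster so that every instance lies within
    `radius` of some chosen exemplar (set-cover-style greedy heuristic).

    X: (n, d) array of embeddings for the instances of a single cluster.
    radius: distance within which an instance counts as "explained".
    Returns a list of row indices of the chosen exemplars.
    """
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    coverage = dists <= radius          # coverage[i, j]: exemplar i explains j
    uncovered = np.ones(len(X), dtype=bool)
    exemplars = []
    while uncovered.any():
        # Candidate that explains the most still-unexplained instances.
        gains = (coverage & uncovered).sum(axis=1)
        best = int(np.argmax(gains))
        exemplars.append(best)
        uncovered &= ~coverage[best]    # mark newly explained instances
    return exemplars
```

Running this per cluster on the embedding vectors yields a full explanation of every instance; stopping after a fixed budget of picks and reporting the covered fraction mirrors, in spirit, the bounded-exemplar variant described above.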
Related papers
- Stable Cluster Discrimination for Deep Clustering [7.175082696240088]
Deep clustering can optimize representations of instances (i.e., representation learning) and explore the inherent data distribution.
The coupled objective implies a trivial solution in which all instances collapse to uniform features.
In this work, we first show that the prevalent discrimination task in supervised learning is unstable for one-stage clustering.
A novel stable cluster discrimination (SeCu) task is proposed, and a new hardness-aware clustering criterion can be obtained accordingly.
arXiv Detail & Related papers (2023-11-24T06:43:26Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a single framework.
To provide feedback for these actions, a clustering-oriented reward function is proposed that enhances cohesion within clusters and separation between different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
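For intuition only, a cohesion-plus-separation reward of that general shape can be sketched over raw feature vectors. This toy version (names and distance-based scoring are our assumptions) is not the paper's reward, which operates on learned graph embeddings inside a reinforcement learning loop.

```python
import numpy as np

def clustering_reward(X, labels):
    """Toy reward: average intra-cluster cohesion plus inter-centroid
    separation. Purely illustrative; the paper's reward is defined over
    learned graph embeddings inside an RL loop."""
    clusters = np.unique(labels)
    centroids = {c: X[labels == c].mean(axis=0) for c in clusters}
    reward = 0.0
    for c, mu in centroids.items():
        members = X[labels == c]
        cohesion = -np.linalg.norm(members - mu, axis=1).mean()
        others = [m for k, m in centroids.items() if k != c]
        separation = np.mean([np.linalg.norm(mu - m) for m in others]) if others else 0.0
        reward += cohesion + separation
    return reward / len(clusters)
```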
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Differentially-Private Clustering of Easy Instances [67.04951703461657]
In differentially private clustering, the goal is to identify $k$ cluster centers without disclosing information about individual data points.
We provide implementable differentially private clustering algorithms that offer utility when the data is "easy".
We propose a framework that allows us to apply non-private clustering algorithms to these easy instances and privately combine the results.
arXiv Detail & Related papers (2021-12-29T08:13:56Z)
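As a toy illustration of the "run a non-private clusterer, then privatize the output" pattern (emphatically not the paper's mechanism), one can perturb k-means centroids with Laplace noise. The per-centroid sensitivity estimate below is deliberately crude and ignores, for instance, the privacy of the cluster sizes themselves.

```python
import numpy as np
from sklearn.cluster import KMeans

def noisy_kmeans_centroids(X, k, epsilon, bound, seed=0):
    """Toy sketch: cluster non-privately, then release centroids with
    Laplace noise. NOT the paper's algorithm; the sensitivity bound used
    here is a crude simplification for illustration only.

    Assumes every coordinate of X lies in [-bound, bound] (enforced by clipping).
    """
    X = np.clip(X, -bound, bound)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    rng = np.random.default_rng(seed)
    noisy = []
    for c in range(k):
        n_c = max(int((km.labels_ == c).sum()), 1)
        # Crude bound: changing one point moves a centroid coordinate
        # by roughly at most 2 * bound / n_c.
        scale = 2 * bound / (n_c * epsilon)
        noisy.append(km.cluster_centers_[c] + rng.laplace(0.0, scale, size=X.shape[1]))
    return np.array(noisy)
```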
- KnAC: an approach for enhancing cluster analysis with background knowledge and explanations [0.20999222360659603]
We present Knowledge Augmented Clustering (KnAC), whose main goal is to confront expert-based labelling with automated clustering.
KnAC can serve as an augmentation of an arbitrary clustering algorithm, making the approach robust and model-agnostic.
arXiv Detail & Related papers (2021-12-16T10:13:47Z)
- Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z)
- Weighted Sparse Subspace Representation: A Unified Framework for Subspace Clustering, Constrained Clustering, and Active Learning [0.3553493344868413]
We first propose a novel spectral-based subspace clustering algorithm that seeks to represent each point as a sparse convex combination of a few nearby points.
We then extend the algorithm to constrained clustering and active learning settings.
Our motivation for developing such a framework stems from the fact that, typically, either a small amount of labelled data is available in advance or it is possible to label some points at a cost.
arXiv Detail & Related papers (2021-06-08T13:39:43Z)
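A minimal sketch of the self-expressive step is below, assuming a nonnegative lasso over each point's nearest neighbors stands in for the paper's convex-combination program (the true formulation additionally constrains weights to sum to one and weights terms by distance; names and parameter values are illustrative).

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.neighbors import NearestNeighbors

def sparse_neighbor_affinity(X, n_neighbors=10, alpha=0.1):
    """Code each point as a sparse nonnegative combination of its nearest
    neighbors, then symmetrize the weights into an affinity matrix suitable
    for spectral clustering. Parameter values are illustrative."""
    n = len(X)
    _, idx = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X).kneighbors(X)
    W = np.zeros((n, n))
    for i in range(n):
        neigh = idx[i, 1:]                       # drop the point itself
        model = Lasso(alpha=alpha, positive=True, max_iter=5000)
        model.fit(X[neigh].T, X[i])              # x_i ~ sum_j w_j * x_j
        W[i, neigh] = model.coef_
    return 0.5 * (W + W.T)
```

From there, `sklearn.cluster.SpectralClustering(affinity="precomputed")` on the symmetrized matrix would complete a spectral pipeline of this kind.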
- Graph Contrastive Clustering [131.67881457114316]
We propose a novel graph contrastive learning framework, which is then applied to the clustering task to obtain the Graph Contrastive Clustering (GCC) method.
Specifically, on the one hand, a graph-Laplacian-based contrastive loss is proposed to learn more discriminative and clustering-friendly features.
On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
arXiv Detail & Related papers (2021-04-03T15:32:49Z)
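To make the flavor concrete, here is a rough numpy sketch of a graph-based contrastive objective in which graph neighbors act as positives in a softmax over cosine similarities; GCC's actual Laplacian-based feature loss and assignment-level loss differ in detail, and all names here are our own.

```python
import numpy as np

def graph_contrastive_loss(Z, A, temperature=0.5):
    """Rough sketch: graph neighbors serve as positives in a softmax over
    cosine similarities. Illustrative only, not GCC's exact losses.

    Z: (n, d) feature matrix; A: (n, n) 0/1 adjacency with zero diagonal.
    """
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)      # unit-norm features
    sim = Z @ Z.T / temperature
    np.fill_diagonal(sim, -np.inf)                        # exclude self-pairs
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    np.fill_diagonal(log_prob, 0.0)                       # avoid 0 * -inf
    pos = A / np.clip(A.sum(axis=1, keepdims=True), 1, None)
    return float(-(pos * log_prob).sum(axis=1).mean())
```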
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Simple and Scalable Sparse k-means Clustering via Feature Ranking [14.839931533868176]
We propose a novel framework for sparse k-means clustering that is intuitive, simple to implement, and competitive with state-of-the-art algorithms.
Our core method readily generalizes to several task-specific algorithms, such as clustering on subsets of attributes and in partially observed data settings.
arXiv Detail & Related papers (2020-02-20T02:41:02Z)
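One plausible reading of a feature-ranking approach to sparse k-means (our sketch; the paper's scoring and iteration scheme may well differ) is: cluster once, score each feature by its between-cluster variance, keep the top features, and re-cluster on that subset.

```python
import numpy as np
from sklearn.cluster import KMeans

def sparse_kmeans_by_ranking(X, k, n_features, seed=0):
    """Loose sketch of feature-ranked sparse k-means: cluster, score each
    feature by between-cluster variance, keep the top-scoring features,
    and re-cluster. Scoring and iteration details are assumptions."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    counts = np.bincount(km.labels_, minlength=k)
    # Between-cluster variance of each feature around the global mean.
    scores = (counts[:, None] * (km.cluster_centers_ - X.mean(axis=0)) ** 2).sum(axis=0)
    keep = np.argsort(scores)[::-1][:n_features]
    final = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X[:, keep])
    return keep, final.labels_
```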
- Efficient Algorithms for Generating Provably Near-Optimal Cluster Descriptors for Explainability [36.11663695534294]
We study the problem of making clusters more interpretable by extending a recent approach of constructing succinct representations for clusters.
We develop approximation algorithms with provable performance guarantees for the problem.
We also show applications to explain clusters from datasets, including clusters of genomic sequences that represent different threat levels.
arXiv Detail & Related papers (2020-02-06T19:49:54Z)
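For concreteness, a succinct descriptor in this line of work can be thought of as a small set of tags that together cover every instance in a cluster; the sketch below is a greedy heuristic for that covering step (the paper adds further constraints, such as keeping descriptors of different clusters disjoint, and proves guarantees not reproduced here).

```python
def greedy_descriptor(instances):
    """Greedily build a succinct descriptor: a small set of tags such that
    every instance in the cluster contains at least one chosen tag.

    instances: list of sets of tags, one set per instance in the cluster.
    """
    descriptor, uncovered = set(), list(range(len(instances)))
    while uncovered:
        candidates = {t for i in uncovered for t in instances[i]}
        if not candidates:          # an instance with no tags cannot be covered
            break
        # Tag present in the most still-uncovered instances.
        best = max(candidates, key=lambda t: sum(t in instances[i] for i in uncovered))
        descriptor.add(best)
        uncovered = [i for i in uncovered if best not in instances[i]]
    return descriptor
```

For example, `greedy_descriptor([{"kinase", "membrane"}, {"kinase"}, {"membrane", "nuclear"}])` returns a small covering set such as `{"kinase", "membrane"}`.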
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.