Interpretable Clustering via Multi-Polytope Machines
- URL: http://arxiv.org/abs/2112.05653v1
- Date: Fri, 10 Dec 2021 16:36:32 GMT
- Title: Interpretable Clustering via Multi-Polytope Machines
- Authors: Connor Lawless, Jayant Kalagnanam, Lam M. Nguyen, Dzung Phan, Chandra Reddy
- Abstract summary: We propose a novel approach for interpretable clustering that both clusters data points and constructs polytopes around the discovered clusters to explain them.
We benchmark our approach on a suite of synthetic and real-world clustering problems, where our algorithm outperforms state-of-the-art interpretable and non-interpretable clustering algorithms.
- Score: 12.69310440882225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clustering is a popular unsupervised learning tool often used to discover
groups within a larger population such as customer segments, or patient
subtypes. However, despite its use as a tool for subgroup discovery and
description, few state-of-the-art algorithms provide any rationale or
description behind the clusters found. We propose a novel approach for
interpretable clustering that both clusters data points and constructs
polytopes around the discovered clusters to explain them. Our framework allows
for additional constraints on the polytopes, including ensuring that the
hyperplanes constructing the polytope are axis-parallel or sparse with integer
coefficients. We formulate the problem of constructing clusters via polytopes
as a Mixed-Integer Non-Linear Program (MINLP). To solve our formulation we
propose a two-phase approach where we first initialize clusters and polytopes
using alternating minimization, and then use coordinate descent to boost
clustering performance. We benchmark our approach on a suite of synthetic and
real-world clustering problems, where our algorithm outperforms
state-of-the-art interpretable and non-interpretable clustering algorithms.
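The idea of describing each cluster by an axis-parallel polytope can be illustrated with a small sketch. This is not the paper's MINLP formulation or its two-phase solver: it swaps in plain k-means (itself an alternating minimization over assignments and centers) and then fits a bounding box around each resulting cluster as a post-hoc description. All function names and the toy data are hypothetical, chosen only for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic init: start at points[0], then repeatedly add the
    point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers

def kmeans(points, k, iters=50):
    """Plain k-means: alternate between assigning points to the nearest
    center and recomputing each center as the mean of its members."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fit_box(members):
    """Axis-parallel polytope for one cluster: a (low, high) bound per feature.
    Assumes the cluster has at least one member."""
    return [(min(col), max(col)) for col in zip(*members)]

def in_box(point, box):
    """True if the point satisfies every per-feature bound of the box."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(point, box))

# Toy data (made up for this sketch): two well-separated groups in 2D.
points = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0),
          (10.0, 10.0), (11.0, 10.5), (10.5, 11.0)]
k = 2
labels = kmeans(points, k)
boxes = [fit_box([p for p, l in zip(points, labels) if l == j]) for j in range(k)]

for j, box in enumerate(boxes):
    rule = " and ".join(f"{lo} <= x{d} <= {hi}" for d, (lo, hi) in enumerate(box))
    print(f"cluster {j}: {rule}")
```

Each box reads as a conjunction of per-feature bounds, which is the sense in which an axis-parallel polytope is interpretable. Because the boxes here are fitted after clustering, they may overlap on harder data; the abstract describes a formulation that constructs clusters and polytopes together, which this sketch does not attempt.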
Related papers
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Convex Clustering through MM: An Efficient Algorithm to Perform Hierarchical Clustering [1.0589208420411012]
We propose convex clustering through majorization-minimization (CCMM), an iterative algorithm that uses cluster fusions and a highly efficient updating scheme.
With a current desktop computer, CCMM efficiently solves convex clustering problems featuring over one million objects in seven-dimensional space.
arXiv Detail & Related papers (2022-11-03T15:07:51Z)
- Cluster Explanation via Polyhedral Descriptions [0.0]
Clustering is an unsupervised learning problem that aims to partition unlabelled data points into groups with similar features.
Traditional clustering algorithms provide limited insight into the groups they find as their main focus is accuracy and not the interpretability of the group assignments.
We introduce a new approach to explain clusters by constructing polyhedra around each cluster while minimizing either the complexity of the resulting polyhedra or the number of features used in the description.
arXiv Detail & Related papers (2022-10-17T07:26:44Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- Natural Hierarchical Cluster Analysis by Nearest Neighbors with Near-Linear Time Complexity [0.0]
We propose a nearest neighbor based clustering algorithm that results in a naturally defined hierarchy of clusters.
In contrast to the agglomerative and divisive hierarchical clustering algorithms, our approach is not dependent on the iterative working of the algorithm.
arXiv Detail & Related papers (2022-03-15T16:03:42Z)
- Fast and explainable clustering based on sorting [0.0]
We introduce a fast and explainable clustering method called CLASSIX.
The algorithm is controlled by two scalar parameters, namely a distance parameter for the aggregation and another parameter controlling the minimal cluster size.
Our experiments demonstrate that CLASSIX competes with state-of-the-art clustering algorithms.
arXiv Detail & Related papers (2022-02-03T08:24:21Z)
- Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z)
- Clustering Ensemble Meets Low-rank Tensor Approximation [50.21581880045667]
This paper explores the problem of clustering ensemble, which aims to combine multiple base clusterings to produce better performance than that of the individual one.
We propose a novel low-rank tensor approximation-based method to solve the problem from a global perspective.
Experimental results over 7 benchmark data sets show that the proposed model achieves a breakthrough in clustering performance, compared with 12 state-of-the-art methods.
arXiv Detail & Related papers (2020-12-16T13:01:37Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Unsupervised Multi-view Clustering by Squeezing Hybrid Knowledge from Cross View and Each View [68.88732535086338]
This paper proposes a new multi-view clustering method, low-rank subspace multi-view clustering based on adaptive graph regularization.
Experimental results for five widely used multi-view benchmarks show that our proposed algorithm surpasses other state-of-the-art methods by a clear margin.
arXiv Detail & Related papers (2020-08-23T08:25:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.