Deep Descriptive Clustering
- URL: http://arxiv.org/abs/2105.11549v1
- Date: Mon, 24 May 2021 21:40:16 GMT
- Title: Deep Descriptive Clustering
- Authors: Hongjing Zhang, Ian Davidson
- Abstract summary: This paper explores a novel setting for performing clustering on complex data while simultaneously generating explanations using interpretable tags.
We form good clusters by maximizing the mutual information between the empirical distribution of the inputs and the induced clustering labels.
Experimental results on public data demonstrate that our model outperforms competitive baselines in clustering performance.
- Score: 24.237000220172906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on explainable clustering allows describing clusters when the
features are interpretable. However, much modern machine learning focuses on
complex data such as images, text, and graphs where deep learning is used but
the raw features of data are not interpretable. This paper explores a novel
setting for performing clustering on complex data while simultaneously
generating explanations using interpretable tags. We propose deep descriptive
clustering that performs sub-symbolic representation learning on complex data
while generating explanations based on symbolic data. We form good clusters by
maximizing the mutual information between the empirical distribution of the
inputs and the induced clustering labels. We generate explanations by solving
an integer linear program that produces concise and orthogonal descriptions
for each cluster. Finally, we allow the explanation
to inform better clustering by proposing a novel pairwise loss with
self-generated constraints to maximize the clustering and explanation module's
consistency. Experimental results on public data demonstrate that our model
outperforms competitive baselines in clustering performance while offering
high-quality cluster-level explanations.
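The two losses described in the abstract map onto well-known forms. Below is a minimal PyTorch sketch, assuming soft cluster assignments (`logits`) from some encoder: the clustering term uses the standard batch estimate I(X; C) = H(p(C)) - E[H(p(C|x))], and the pairwise term treats `must_link`/`cannot_link` as hypothetical index pairs standing in for the paper's self-generated constraints. Neither is the authors' implementation.
```python
import torch
import torch.nn.functional as F

def mutual_information_loss(logits: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Negative mutual information between inputs and induced cluster labels.

    Uses the batch estimate I(X; C) = H(p(C)) - E_x[H(p(C|x))];
    minimizing the returned value maximizes the mutual information.
    """
    p_cond = F.softmax(logits, dim=1)   # p(C|x) per sample
    p_marg = p_cond.mean(dim=0)         # empirical marginal p(C) over the batch
    h_marg = -(p_marg * (p_marg + eps).log()).sum()
    h_cond = -(p_cond * (p_cond + eps).log()).sum(dim=1).mean()
    return h_cond - h_marg

def pairwise_consistency_loss(logits: torch.Tensor,
                              must_link: torch.Tensor,
                              cannot_link: torch.Tensor,
                              eps: float = 1e-8) -> torch.Tensor:
    """Pairwise loss over (hypothetical) self-generated constraints.

    must_link / cannot_link are (m, 2) tensors of sample indices; how the
    explanation module generates them is the paper's contribution and is
    not reproduced here.
    """
    p = F.softmax(logits, dim=1)

    def same_cluster_prob(pairs: torch.Tensor) -> torch.Tensor:
        # Probability that both samples of a pair land in the same cluster.
        return (p[pairs[:, 0]] * p[pairs[:, 1]]).sum(dim=1)

    # Push must-link pairs into the same cluster, cannot-link pairs apart.
    return (-(same_cluster_prob(must_link) + eps).log().mean()
            - (1.0 - same_cluster_prob(cannot_link) + eps).log().mean())
```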
Related papers
- NeurCAM: Interpretable Neural Clustering via Additive Models [3.4437947384641037]
Interpretable clustering algorithms aim to group similar data points while explaining the obtained groups.
We introduce the Neural Clustering Additive Model (NeurCAM), a novel approach to the interpretable clustering problem.
Our approach significantly outperforms other interpretable clustering approaches when clustering text data.
arXiv Detail & Related papers (2024-08-23T20:32:57Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our method, cluster-number determination and unsupervised representation learning are unified in a single framework.
To provide feedback for these actions, a clustering-oriented reward function is proposed that increases cohesion within each cluster and separation between different clusters.
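A toy cohesion-minus-separation reward in this spirit (an illustrative assumption, not the paper's actual reward function) might score a candidate partition like this:
```python
import torch
import torch.nn.functional as F

def clustering_reward(z: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Toy reward: within-cluster cohesion minus between-cluster similarity.

    z:      (n, d) node embeddings
    labels: (n,)   integer cluster assignments of a candidate partition
    """
    z = F.normalize(z, dim=1)
    ids = labels.unique()                                 # sorted cluster ids
    centroids = F.normalize(
        torch.stack([z[labels == k].mean(dim=0) for k in ids]), dim=1)
    # Cohesion: mean cosine similarity of each point to its own centroid.
    own = centroids[torch.searchsorted(ids, labels)]
    cohesion = (z * own).sum(dim=1).mean()
    # Separation penalty: mean similarity between distinct centroids.
    sim = centroids @ centroids.t()
    k = len(ids)
    separation = (sim.sum() - sim.diagonal().sum()) / max(k * (k - 1), 1)
    return cohesion - separation
```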
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Using Decision Trees for Interpretable Supervised Clustering [0.0]
Supervised clustering aims to form clusters of labelled data with high probability densities.
We are particularly interested in finding clusters of data of a given class and describing the clusters with a comprehensive set of rules.
arXiv Detail & Related papers (2023-07-16T17:12:45Z)
- Interpretable Deep Clustering for Tabular Data [7.972599673048582]
Clustering is a fundamental learning task widely used in data analysis.
We propose a new deep-learning framework that predicts interpretable cluster assignments at the instance and cluster levels.
We show that the proposed method can reliably predict cluster assignments in biological, text, image, and physics datasets.
arXiv Detail & Related papers (2023-06-07T21:08:09Z)
- Goal-Driven Explainable Clustering via Language Descriptions [50.980832345025334]
We propose a new task formulation, "Goal-Driven Clustering with Explanations" (GoalEx).
GoalEx represents both the goal and the explanations as free-form language descriptions.
Our method produces more accurate and goal-related explanations than prior methods.
arXiv Detail & Related papers (2023-05-23T07:05:50Z)
- Towards Practical Explainability with Cluster Descriptors [3.899688920770429]
We study the problem of making the clusters more explainable by investigating the cluster descriptors.
The goal is to find a representative set of tags for each cluster, referred to as the cluster descriptors, with the constraint that these descriptors are pairwise disjoint.
We propose a novel explainability model that strengthens previous models by keeping tags out of the optimal descriptors when they neither contribute to explainability nor sufficiently distinguish between clusters.
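Descriptor selection of this kind reduces to a small integer linear program. A hedged sketch using PuLP, where the tag-frequency objective and the `max_tags_per_cluster` bound are illustrative assumptions rather than either paper's exact formulation:
```python
import pulp

def select_descriptors(freq, max_tags_per_cluster=3):
    """Pick pairwise-disjoint tag sets, one per cluster.

    freq[k][t]: fraction of cluster k's objects carrying tag t.
    Returns {cluster: [tags]} maximizing total within-cluster tag coverage.
    """
    clusters = list(freq)
    tags = sorted({t for k in clusters for t in freq[k]})
    prob = pulp.LpProblem("cluster_descriptors", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("use", (clusters, tags), cat="Binary")

    # Objective: reward tags that are frequent in the cluster they describe.
    prob += pulp.lpSum(freq[k].get(t, 0.0) * x[k][t]
                       for k in clusters for t in tags)
    # Descriptors are concise ...
    for k in clusters:
        prob += pulp.lpSum(x[k][t] for t in tags) <= max_tags_per_cluster
    # ... and pairwise disjoint: a tag may describe at most one cluster.
    for t in tags:
        prob += pulp.lpSum(x[k][t] for k in clusters) <= 1

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {k: [t for t in tags if x[k][t].value() == 1] for k in clusters}
```
With, say, freq = {"c0": {"furry": 0.9, "indoor": 0.4}, "c1": {"indoor": 0.8, "metal": 0.7}}, the disjointness constraint forces "indoor" into at most one descriptor, so each cluster keeps only the tags that distinguish it.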
arXiv Detail & Related papers (2022-10-18T01:53:43Z)
- Cluster Explanation via Polyhedral Descriptions [0.0]
Clustering is an unsupervised learning problem that aims to partition unlabelled data points into groups with similar features.
Traditional clustering algorithms provide limited insight into the groups they find as their main focus is accuracy and not the interpretability of the group assignments.
We introduce a new approach to explain clusters by constructing polyhedra around each cluster while minimizing either the complexity of the resulting polyhedra or the number of features used in the description.
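In the axis-aligned special case, such a description is a bounding box over a greedily chosen feature subset; the sketch below illustrates the idea only (the paper optimizes general polyhedra, not just boxes), selecting the fewest interval constraints that cut away points from other clusters:
```python
import numpy as np

def box_description(X: np.ndarray, labels: np.ndarray, k: int):
    """Greedy axis-aligned stand-in for a polyhedral cluster description.

    Returns interval constraints (feature, low, high) that contain all of
    cluster k while excluding as many other points as possible, using as
    few features as possible.
    """
    inside, outside = X[labels == k], X[labels != k]
    lo, hi = inside.min(axis=0), inside.max(axis=0)
    chosen = []
    remaining = np.ones(len(outside), dtype=bool)   # outsiders not yet cut away
    while remaining.any():
        candidates = [j for j in range(X.shape[1]) if j not in chosen]
        if not candidates:
            break
        # Feature whose interval excludes the most remaining outside points.
        excluded = {j: (remaining & ((outside[:, j] < lo[j]) |
                                     (outside[:, j] > hi[j]))).sum()
                    for j in candidates}
        j = max(excluded, key=excluded.get)
        if excluded[j] == 0:
            break                                   # a box cannot separate the rest
        chosen.append(j)
        remaining &= (outside[:, j] >= lo[j]) & (outside[:, j] <= hi[j])
    return [(j, float(lo[j]), float(hi[j])) for j in chosen]
```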
arXiv Detail & Related papers (2022-10-17T07:26:44Z)
- Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC).
In SCAGC, a self-supervised contrastive loss is designed for node representation learning by leveraging inaccurate clustering labels.
For out-of-sample (OOS) nodes, SCAGC can directly calculate their clustering labels.
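Contrastive node-representation losses of this family typically follow the NT-Xent pattern; a generic sketch (SCAGC's actual loss additionally leverages the clustering labels, which is not reproduced here):
```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Generic NT-Xent contrastive loss over two views of the same nodes.

    z1, z2: (n, d) node representations from two graph augmentations.
    The i-th node in one view is the positive for the i-th node in the
    other; every other node in the batch acts as a negative.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau                      # (n, n) cross-view similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, targets)
```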
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, the resulting model (TCC) is trained end-to-end, requiring no alternating steps.
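Reparametrizing categorical assignment variables is commonly done with the Gumbel-softmax (straight-through) trick, which keeps the sampling step differentiable; the generic sketch below illustrates the mechanism, with shapes and the pooling step as assumptions rather than TCC's architecture:
```python
import torch
import torch.nn.functional as F

batch, n_clusters, dim = 8, 10, 64
logits = torch.randn(batch, n_clusters, requires_grad=True)   # assignment scores
features = torch.randn(batch, dim)                            # instance features

# Differentiable "sample" of a one-hot cluster assignment: hard=True yields
# one-hot vectors in the forward pass while gradients flow through the
# underlying softmax (straight-through estimator), so no alternating
# optimization over discrete assignments is needed.
assignment = F.gumbel_softmax(logits, tau=0.5, hard=True)

# Cluster-level representation: every instance routed to a cluster
# contributes to that cluster's unified vector.
cluster_repr = assignment.t() @ features                      # (n_clusters, dim)
```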
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
- Graph Contrastive Clustering [131.67881457114316]
We propose a novel graph contrastive learning framework, which is then applied to the clustering task, yielding the Graph Contrastive Clustering (GCC) method.
Specifically, on the one hand, the graph Laplacian based contrastive loss is proposed to learn more discriminative and clustering-friendly features.
On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
arXiv Detail & Related papers (2021-04-03T15:32:49Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and an adaptive-neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
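Self-expressiveness has a standard ridge-regularized closed form; a minimal NumPy sketch (the post-hoc zero-diagonal step is a common simplification, and none of this is the paper's exact formulation):
```python
import numpy as np

def self_expressive_affinity(X: np.ndarray, lam: float = 0.1) -> np.ndarray:
    """Global-structure affinity via self-expressiveness: X ≈ C X.

    Solves min_C ||X - C X||^2 + lam ||C||^2 in closed form, zeroes the
    diagonal to discourage trivial self-representation, and symmetrizes
    the result so it can feed a spectral or k-means step.
    """
    n = X.shape[0]
    G = X @ X.T                                   # sample Gram matrix
    C = np.linalg.solve(G + lam * np.eye(n), G)   # ridge solution (G+lam I)^-1 G
    np.fill_diagonal(C, 0.0)
    return np.abs(C + C.T) / 2.0
```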
arXiv Detail & Related papers (2020-08-31T08:41:20Z)