Deep Descriptive Clustering
- URL: http://arxiv.org/abs/2105.11549v1
- Date: Mon, 24 May 2021 21:40:16 GMT
- Title: Deep Descriptive Clustering
- Authors: Hongjing Zhang, Ian Davidson
- Abstract summary: This paper explores a novel setting for performing clustering on complex data while simultaneously generating explanations using interpretable tags.
We form good clusters by maximizing the mutual information between the empirical distribution of the inputs and the induced clustering labels.
Experimental results on public data demonstrate that our model outperforms competitive baselines in clustering performance.
- Score: 24.237000220172906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on explainable clustering allows describing clusters when the
features are interpretable. However, much modern machine learning focuses on
complex data such as images, text, and graphs where deep learning is used but
the raw features of data are not interpretable. This paper explores a novel
setting for performing clustering on complex data while simultaneously
generating explanations using interpretable tags. We propose deep descriptive
clustering that performs sub-symbolic representation learning on complex data
while generating explanations based on symbolic data. We form good clusters by
maximizing the mutual information between the empirical distribution of the
inputs and the induced clustering labels. We generate explanations by solving
an integer linear program that produces concise and orthogonal descriptions
for each cluster. Finally, we allow the explanation
to inform better clustering by proposing a novel pairwise loss with
self-generated constraints to maximize the clustering and explanation module's
consistency. Experimental results on public data demonstrate that our model
outperforms competitive baselines in clustering performance while offering
high-quality cluster-level explanations.
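The two losses described in the abstract map onto well-known forms. Below is a minimal PyTorch sketch, assuming soft cluster assignments (`logits`) from some encoder: the clustering term uses the standard batch estimate I(X; C) = H(p(C)) - E[H(p(C|x))], and the pairwise term treats `must_link`/`cannot_link` as hypothetical index pairs standing in for the paper's self-generated constraints. Neither is the authors' implementation.
```python
import torch
import torch.nn.functional as F

def mutual_information_loss(logits: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Negative mutual information between inputs and induced cluster labels.

    Uses the batch estimate I(X; C) = H(p(C)) - E_x[H(p(C|x))];
    minimizing the returned value maximizes the mutual information.
    """
    p_cond = F.softmax(logits, dim=1)   # p(C|x) per sample
    p_marg = p_cond.mean(dim=0)         # empirical marginal p(C) over the batch
    h_marg = -(p_marg * (p_marg + eps).log()).sum()
    h_cond = -(p_cond * (p_cond + eps).log()).sum(dim=1).mean()
    return h_cond - h_marg

def pairwise_consistency_loss(logits: torch.Tensor,
                              must_link: torch.Tensor,
                              cannot_link: torch.Tensor,
                              eps: float = 1e-8) -> torch.Tensor:
    """Pairwise loss over (hypothetical) self-generated constraints.

    must_link / cannot_link are (m, 2) tensors of sample indices; how the
    explanation module generates them is the paper's contribution and is
    not reproduced here.
    """
    p = F.softmax(logits, dim=1)

    def same_cluster_prob(pairs: torch.Tensor) -> torch.Tensor:
        # Probability that both samples of a pair land in the same cluster.
        return (p[pairs[:, 0]] * p[pairs[:, 1]]).sum(dim=1)

    # Push must-link pairs into the same cluster, cannot-link pairs apart.
    return (-(same_cluster_prob(must_link) + eps).log().mean()
            - (1.0 - same_cluster_prob(cannot_link) + eps).log().mean())
```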
Related papers
- NeurCAM: Interpretable Neural Clustering via Additive Models [3.4437947384641037]
Interpretable clustering algorithms aim to group similar data points while explaining the obtained groups.
We introduce the Neural Clustering Additive Model (NeurCAM), a novel approach to the interpretable clustering problem.
Our approach significantly outperforms other interpretable clustering approaches when clustering text data.
arXiv Detail & Related papers (2024-08-23T20:32:57Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our method, cluster-number determination and unsupervised representation learning are unified in a single framework.
To provide feedback for these actions, a clustering-oriented reward function is proposed that increases cohesion within each cluster and separation between different clusters.
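A toy cohesion-minus-separation reward in this spirit (an illustrative assumption, not the paper's actual reward function) might score a candidate partition like this:
```python
import torch
import torch.nn.functional as F

def clustering_reward(z: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Toy reward: within-cluster cohesion minus between-cluster similarity.

    z:      (n, d) node embeddings
    labels: (n,)   integer cluster assignments of a candidate partition
    """
    z = F.normalize(z, dim=1)
    ids = labels.unique()                                 # sorted cluster ids
    centroids = F.normalize(
        torch.stack([z[labels == k].mean(dim=0) for k in ids]), dim=1)
    # Cohesion: mean cosine similarity of each point to its own centroid.
    own = centroids[torch.searchsorted(ids, labels)]
    cohesion = (z * own).sum(dim=1).mean()
    # Separation penalty: mean similarity between distinct centroids.
    sim = centroids @ centroids.t()
    k = len(ids)
    separation = (sim.sum() - sim.diagonal().sum()) / max(k * (k - 1), 1)
    return cohesion - separation
```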
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Using Decision Trees for Interpretable Supervised Clustering [0.0]
Supervised clustering aims to form clusters of labelled data with high probability densities.
We are particularly interested in finding clusters of data of a given class and describing the clusters with a comprehensive set of rules.
arXiv Detail & Related papers (2023-07-16T17:12:45Z)
- Interpretable Deep Clustering for Tabular Data [7.972599673048582]
Clustering is a fundamental learning task widely used in data analysis.
We propose a new deep-learning framework that predicts interpretable cluster assignments at the instance and cluster levels.
We show that the proposed method can reliably predict cluster assignments in biological, text, image, and physics datasets.
arXiv Detail & Related papers (2023-06-07T21:08:09Z)
- Goal-Driven Explainable Clustering via Language Descriptions [50.980832345025334]
We propose a new task formulation, "Goal-Driven Clustering with Explanations" (GoalEx).
GoalEx represents both the goal and the explanations as free-form language descriptions.
Our method produces more accurate and goal-related explanations than prior methods.
arXiv Detail & Related papers (2023-05-23T07:05:50Z)
- Towards Practical Explainability with Cluster Descriptors [3.899688920770429]
We study the problem of making the clusters more explainable by investigating the cluster descriptors.
The goal is to find a representative set of tags for each cluster, referred to as the cluster descriptors, with the constraint that these descriptors are pairwise disjoint.
We propose a novel explainability model that strengthens previous models by keeping tags out of the optimal descriptors when they neither contribute to explainability nor sufficiently distinguish between clusters.
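Descriptor selection of this kind reduces to a small integer linear program. A hedged sketch using PuLP, where the tag-frequency objective and the `max_tags_per_cluster` bound are illustrative assumptions rather than either paper's exact formulation:
```python
import pulp

def select_descriptors(freq, max_tags_per_cluster=3):
    """Pick pairwise-disjoint tag sets, one per cluster.

    freq[k][t]: fraction of cluster k's objects carrying tag t.
    Returns {cluster: [tags]} maximizing total within-cluster tag coverage.
    """
    clusters = list(freq)
    tags = sorted({t for k in clusters for t in freq[k]})
    prob = pulp.LpProblem("cluster_descriptors", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("use", (clusters, tags), cat="Binary")

    # Objective: reward tags that are frequent in the cluster they describe.
    prob += pulp.lpSum(freq[k].get(t, 0.0) * x[k][t]
                       for k in clusters for t in tags)
    # Descriptors are concise ...
    for k in clusters:
        prob += pulp.lpSum(x[k][t] for t in tags) <= max_tags_per_cluster
    # ... and pairwise disjoint: a tag may describe at most one cluster.
    for t in tags:
        prob += pulp.lpSum(x[k][t] for k in clusters) <= 1

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {k: [t for t in tags if x[k][t].value() == 1] for k in clusters}
```
With, say, freq = {"c0": {"furry": 0.9, "indoor": 0.4}, "c1": {"indoor": 0.8, "metal": 0.7}}, the disjointness constraint forces "indoor" into at most one descriptor, so each cluster keeps only the tags that distinguish it.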
arXiv Detail & Related papers (2022-10-18T01:53:43Z)
- Cluster Explanation via Polyhedral Descriptions [0.0]
Clustering is an unsupervised learning problem that aims to partition unlabelled data points into groups with similar features.
Traditional clustering algorithms provide limited insight into the groups they find as their main focus is accuracy and not the interpretability of the group assignments.
We introduce a new approach to explain clusters by constructing polyhedra around each cluster while minimizing either the complexity of the resulting polyhedra or the number of features used in the description.
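In the axis-aligned special case, such a description is a bounding box over a greedily chosen feature subset; the sketch below illustrates the idea only (the paper optimizes general polyhedra, not just boxes), selecting the fewest interval constraints that cut away points from other clusters:
```python
import numpy as np

def box_description(X: np.ndarray, labels: np.ndarray, k: int):
    """Greedy axis-aligned stand-in for a polyhedral cluster description.

    Returns interval constraints (feature, low, high) that contain all of
    cluster k while excluding as many other points as possible, using as
    few features as possible.
    """
    inside, outside = X[labels == k], X[labels != k]
    lo, hi = inside.min(axis=0), inside.max(axis=0)
    chosen = []
    remaining = np.ones(len(outside), dtype=bool)   # outsiders not yet cut away
    while remaining.any():
        candidates = [j for j in range(X.shape[1]) if j not in chosen]
        if not candidates:
            break
        # Feature whose interval excludes the most remaining outside points.
        excluded = {j: (remaining & ((outside[:, j] < lo[j]) |
                                     (outside[:, j] > hi[j]))).sum()
                    for j in candidates}
        j = max(excluded, key=excluded.get)
        if excluded[j] == 0:
            break                                   # a box cannot separate the rest
        chosen.append(j)
        remaining &= (outside[:, j] >= lo[j]) & (outside[:, j] <= hi[j])
    return [(j, float(lo[j]), float(hi[j])) for j in chosen]
```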
arXiv Detail & Related papers (2022-10-17T07:26:44Z)
- Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC).
In SCAGC, a self-supervised contrastive loss is designed for node representation learning by leveraging inaccurate clustering labels.
For out-of-sample (OOS) nodes, SCAGC can directly calculate their clustering labels.
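Contrastive node-representation losses of this family typically follow the NT-Xent pattern; a generic sketch (SCAGC's actual loss additionally leverages the clustering labels, which is not reproduced here):
```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Generic NT-Xent contrastive loss over two views of the same nodes.

    z1, z2: (n, d) node representations from two graph augmentations.
    The i-th node in one view is the positive for the i-th node in the
    other; every other node in the batch acts as a negative.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau                      # (n, n) cross-view similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, targets)
```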
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, the resulting model (TCC) is trained end-to-end, requiring no alternating steps.
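Reparametrizing categorical assignment variables is commonly done with the Gumbel-softmax (straight-through) trick, which keeps the sampling step differentiable; the generic sketch below illustrates the mechanism, with shapes and the pooling step as assumptions rather than TCC's architecture:
```python
import torch
import torch.nn.functional as F

batch, n_clusters, dim = 8, 10, 64
logits = torch.randn(batch, n_clusters, requires_grad=True)   # assignment scores
features = torch.randn(batch, dim)                            # instance features

# Differentiable "sample" of a one-hot cluster assignment: hard=True yields
# one-hot vectors in the forward pass while gradients flow through the
# underlying softmax (straight-through estimator), so no alternating
# optimization over discrete assignments is needed.
assignment = F.gumbel_softmax(logits, tau=0.5, hard=True)

# Cluster-level representation: every instance routed to a cluster
# contributes to that cluster's unified vector.
cluster_repr = assignment.t() @ features                      # (n_clusters, dim)
```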
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
- Graph Contrastive Clustering [131.67881457114316]
We propose a novel graph contrastive learning framework, which is then applied to the clustering task, yielding the Graph Contrastive Clustering (GCC) method.
Specifically, on the one hand, the graph Laplacian based contrastive loss is proposed to learn more discriminative and clustering-friendly features.
On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
arXiv Detail & Related papers (2021-04-03T15:32:49Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and an adaptive-neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
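Self-expressiveness has a standard ridge-regularized closed form; a minimal NumPy sketch (the post-hoc zero-diagonal step is a common simplification, and none of this is the paper's exact formulation):
```python
import numpy as np

def self_expressive_affinity(X: np.ndarray, lam: float = 0.1) -> np.ndarray:
    """Global-structure affinity via self-expressiveness: X ≈ C X.

    Solves min_C ||X - C X||^2 + lam ||C||^2 in closed form, zeroes the
    diagonal to discourage trivial self-representation, and symmetrizes
    the result so it can feed a spectral or k-means step.
    """
    n = X.shape[0]
    G = X @ X.T                                   # sample Gram matrix
    C = np.linalg.solve(G + lam * np.eye(n), G)   # ridge solution (G+lam I)^-1 G
    np.fill_diagonal(C, 0.0)
    return np.abs(C + C.T) / 2.0
```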
arXiv Detail & Related papers (2020-08-31T08:41:20Z)