From Logits to Hierarchies: Hierarchical Clustering made Simple
- URL: http://arxiv.org/abs/2410.07858v1
- Date: Thu, 10 Oct 2024 12:27:45 GMT
- Title: From Logits to Hierarchies: Hierarchical Clustering made Simple
- Authors: Emanuele Palumbo, Moritz Vandenhirtz, Alain Ryser, Imant Daunhawer, Julia E. Vogt
- Abstract summary: We show that a lightweight procedure implemented on top of pre-trained non-hierarchical clustering models outperforms models designed specifically for hierarchical clustering.
Our proposed approach is computationally efficient and applicable to any pre-trained clustering model that outputs logits, without requiring any fine-tuning.
- Score: 16.132657141993548
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The structure of many real-world datasets is intrinsically hierarchical, making the modeling of such hierarchies a critical objective in both unsupervised and supervised machine learning. Recently, novel approaches for hierarchical clustering with deep architectures have been proposed. In this work, we take a critical perspective on this line of research and demonstrate that many approaches exhibit major limitations when applied to realistic datasets, partly due to their high computational complexity. In particular, we show that a lightweight procedure implemented on top of pre-trained non-hierarchical clustering models outperforms models designed specifically for hierarchical clustering. Our proposed approach is computationally efficient and applicable to any pre-trained clustering model that outputs logits, without requiring any fine-tuning. To highlight the generality of our findings, we illustrate how our method can also be applied in a supervised setup, recovering meaningful hierarchies from a pre-trained ImageNet classifier.
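The abstract describes the procedure only at a high level, so the following is a minimal sketch of one plausible instantiation, assuming the hierarchy over the flat clusters is obtained by standard agglomerative clustering of per-cluster logit profiles; the function name and the cosine/average-linkage choices are illustrative assumptions, not the authors' exact method.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.special import softmax

def hierarchy_from_logits(logits: np.ndarray) -> np.ndarray:
    """Sketch: build a dendrogram over the K flat clusters of a pre-trained model.

    logits: (N, K) array of per-sample cluster logits from any pre-trained
    clustering model (or classifier); no fine-tuning is involved.
    NOTE: illustrative only, not the paper's exact procedure.
    """
    probs = softmax(logits, axis=1)      # (N, K) soft assignments
    hard = probs.argmax(axis=1)          # flat cluster labels

    # Represent each flat cluster by the average soft assignment of the samples
    # it wins (assumes every cluster is the argmax for at least one sample);
    # overlapping probability mass then acts as a similarity signal for merging.
    K = logits.shape[1]
    profiles = np.stack([probs[hard == k].mean(axis=0) for k in range(K)])

    # Standard agglomerative clustering on the K cluster profiles yields a tree
    # whose leaves are the original flat clusters.
    return linkage(profiles, method="average", metric="cosine")

# Usage: Z = hierarchy_from_logits(model_logits); dendrogram(Z)
```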
Related papers
- Exploiting Data Hierarchy as a New Modality for Contrastive Learning [0.0]
This work investigates how hierarchically structured data can help neural networks learn conceptual representations of cathedrals.
The underlying WikiScenes dataset provides a spatially organized hierarchical structure of cathedral components.
We propose a novel hierarchical contrastive training approach that leverages a triplet margin loss to represent the data's spatial hierarchy in the encoder's latent space.
arXiv Detail & Related papers (2024-01-06T21:47:49Z)
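As an illustration of the hierarchy-aware triplet objective described in the entry above, here is a minimal PyTorch sketch; the sampling convention (anchor and positive from the same hierarchy node, negative from a different branch) and the function name are assumptions, not the WikiScenes training pipeline.

```python
import torch
import torch.nn as nn

triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)

def hierarchical_triplet_step(encoder, anchor_img, positive_img, negative_img):
    """One illustrative step of a hierarchy-aware triplet loss.

    Assumption: anchor/positive are sampled from the same hierarchy node
    (e.g. the same cathedral component), negative from a different branch,
    so embeddings of nearby nodes end up closer than those of distant ones.
    """
    z_a = encoder(anchor_img)
    z_p = encoder(positive_img)
    z_n = encoder(negative_img)
    return triplet_loss(z_a, z_p, z_n)
```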
- Efficient Multi-View Graph Clustering with Local and Global Structure Preservation [59.49018175496533]
We propose a novel anchor-based multi-view graph clustering framework termed Efficient Multi-View Graph Clustering with Local and Global Structure Preservation (EMVGC-LG).
Specifically, EMVGC-LG jointly optimizes anchor construction and graph learning to enhance clustering quality.
In addition, EMVGC-LG inherits the linear complexity of existing AMVGC methods with respect to the number of samples.
arXiv Detail & Related papers (2023-08-31T12:12:30Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper, we offer a new perspective on the well-established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
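The merge criterion in the entry above is concrete enough to sketch: at each step, merge the pair of clusters with the largest average pairwise dot product. The naive, unoptimized implementation below is a didactic sketch, not the paper's implementation.

```python
import numpy as np

def average_dot_product_linkage(X: np.ndarray):
    """Agglomerative clustering that merges by maximum average dot product.

    X: (n, d) data matrix. Returns the sequence of merges as (cluster, cluster)
    pairs of index tuples over the current partition. Didactic sketch only.
    """
    G = X @ X.T                                   # pairwise dot products
    clusters = [[i] for i in range(len(X))]
    merges = []
    while len(clusters) > 1:
        best, pair = -np.inf, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # average dot product between members of the two clusters
                score = G[np.ix_(clusters[i], clusters[j])].mean()
                if score > best:
                    best, pair = score, (i, j)
        i, j = pair
        merges.append((tuple(clusters[i]), tuple(clusters[j])))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges
```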
- Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z)
- Generating Hierarchical Explanations on Text Classification Without Connecting Rules [14.624434065904232]
We argue that the connecting rule, as an additional prior, may undermine the ability to reflect the model's decision process faithfully.
We propose to generate hierarchical explanations without the connecting rule and introduce a framework for generating hierarchical clusters.
arXiv Detail & Related papers (2022-10-24T14:11:23Z)
- ExpertNet: A Symbiosis of Classification and Clustering [22.324813752423044]
ExpertNet uses novel training strategies to learn clustered latent representations and leverage them by effectively combining cluster-specific classifiers.
We demonstrate the superiority of ExpertNet over state-of-the-art methods on 6 large clinical datasets.
arXiv Detail & Related papers (2022-01-17T11:00:30Z)
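As a rough illustration of the pattern described in the ExpertNet entry above (cluster-specific classifiers combined via soft cluster assignments), here is a minimal PyTorch sketch; the module name, layer sizes, and gating scheme are assumptions, not the actual ExpertNet architecture.

```python
import torch
import torch.nn as nn

class ClusterGatedClassifier(nn.Module):
    """Soft cluster assignments gate a set of cluster-specific classifiers.

    Illustrative sketch only; not the ExpertNet implementation.
    """

    def __init__(self, in_dim: int, n_clusters: int, n_classes: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU())
        self.cluster_head = nn.Linear(64, n_clusters)   # soft cluster assignments
        self.experts = nn.ModuleList(
            [nn.Linear(64, n_classes) for _ in range(n_clusters)]
        )

    def forward(self, x):
        z = self.encoder(x)
        gamma = torch.softmax(self.cluster_head(z), dim=-1)                # (B, K)
        logits = torch.stack([head(z) for head in self.experts], dim=1)    # (B, K, C)
        # Combine cluster-specific predictions, weighted by the assignments.
        return (gamma.unsqueeze(-1) * logits).sum(dim=1)                   # (B, C)
```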
- Learning the Precise Feature for Cluster Assignment [39.320210567860485]
We propose a framework that integrates representation learning and clustering into a single pipeline for the first time.
The proposed framework exploits the ability of recently developed generative models to learn intrinsic features.
Experimental results show that the performance of the proposed method is superior to, or at least comparable with, state-of-the-art methods.
arXiv Detail & Related papers (2021-06-11T04:08:54Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and an adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
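The self-expressiveness idea in the entry above can be illustrated with a small ridge-regularized sketch, in which each sample is reconstructed from the other samples and the coefficients are symmetrized into a global affinity graph; this is an assumed baseline formulation, not the paper's optimization.

```python
import numpy as np

def self_expressive_affinity(X: np.ndarray, lam: float = 0.1) -> np.ndarray:
    """Global affinity via self-expressiveness: x_i ~= sum_j C[j, i] * x_j, C[i, i] = 0.

    Solves a ridge-regularized least-squares problem per sample and symmetrizes
    the coefficients into an affinity matrix. Illustrative sketch only.
    """
    n = X.shape[0]
    G = X @ X.T                                  # (n, n) Gram matrix
    C = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]    # exclude sample i itself
        A = G[np.ix_(idx, idx)] + lam * np.eye(n - 1)
        b = G[idx, i]
        C[idx, i] = np.linalg.solve(A, b)        # coefficients reconstructing x_i
    return 0.5 * (np.abs(C) + np.abs(C).T)       # symmetric affinity graph
```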
- Leveraging Class Hierarchies with Metric-Guided Prototype Learning [5.070542698701158]
In many classification tasks, the set of target classes can be organized into a hierarchy.
This structure induces a semantic distance between classes, which can be summarised in the form of a cost matrix.
We propose to model the hierarchical class structure by integrating this metric in the supervision of a prototypical network.
arXiv Detail & Related papers (2020-07-06T20:22:08Z)
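A minimal sketch of the general idea in the entry above, assuming classification by distance to learnable class prototypes plus a regularizer that aligns inter-prototype distances with the hierarchical cost matrix; the weighting and the exact form of the loss are assumptions, not the paper's objective.

```python
import torch
import torch.nn.functional as F

def metric_guided_loss(z, y, prototypes, cost_matrix, alpha=0.1):
    """Prototype classification with a hierarchy-aware regularizer (sketch).

    z: (B, d) embeddings, y: (B,) integer labels, prototypes: (C, d) learnable,
    cost_matrix: (C, C) semantic distances induced by the class hierarchy.
    """
    # Classify each sample by (negative) distance to every class prototype.
    dists = torch.cdist(z, prototypes)                  # (B, C)
    cls_loss = F.cross_entropy(-dists, y)

    # Encourage pairwise prototype distances to reproduce the cost matrix,
    # so the embedding geometry reflects the class hierarchy.
    proto_dists = torch.cdist(prototypes, prototypes)   # (C, C)
    metric_loss = F.mse_loss(proto_dists, cost_matrix)

    return cls_loss + alpha * metric_loss
```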
- Fair Hierarchical Clustering [92.03780518164108]
We define a notion of fairness that mitigates over-representation in traditional clustering.
We show that our algorithms can find a fair hierarchical clustering, with only a negligible loss in the objective.
arXiv Detail & Related papers (2020-06-18T01:05:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.