Hierarchical Linkage Clustering Beyond Binary Trees and Ultrametrics
- URL: http://arxiv.org/abs/2511.18056v1
- Date: Sat, 22 Nov 2025 13:20:34 GMT
- Title: Hierarchical Linkage Clustering Beyond Binary Trees and Ultrametrics
- Authors: Maximilien Dreveton, Matthias Grossglauser, Daichi Kuroda, Patrick Thiran,
- Abstract summary: This paper introduces the notion of a valid hierarchy and defines a partial order over the set of valid hierarchies.<n>We prove the existence of a finest valid hierarchy, that is, the hierarchy that encodes the maximum information consistent with the similarity structure of the data set.<n>We propose a simple two-step algorithm that first constructs a binary tree via a linkage method and then prunes it to enforce validity.
- Score: 11.198418635553976
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hierarchical clustering seeks to uncover nested structures in data by constructing a tree of clusters, where deeper levels reveal finer-grained relationships. Traditional methods, including linkage approaches, face three major limitations: (i) they always return a hierarchy, even if none exists, (ii) they are restricted to binary trees, even if the true hierarchy is non-binary, and (iii) they are highly sensitive to the choice of linkage function. In this paper, we address these issues by introducing the notion of a valid hierarchy and defining a partial order over the set of valid hierarchies. We prove the existence of a finest valid hierarchy, that is, the hierarchy that encodes the maximum information consistent with the similarity structure of the data set. In particular, the finest valid hierarchy is not constrained to binary structures and, when no hierarchical relationships exist, collapses to a star tree. We propose a simple two-step algorithm that first constructs a binary tree via a linkage method and then prunes it to enforce validity. We establish necessary and sufficient conditions on the linkage function under which this procedure exactly recovers the finest valid hierarchy, and we show that all linkage functions satisfying these conditions yield the same hierarchy after pruning. Notably, classical linkage rules such as single, complete, and average satisfy these conditions, whereas Ward's linkage fails to do so.
Related papers
- Learning Order Forest for Qualitative-Attribute Data Clustering [52.612779710298526]
This paper discovers a tree-like distance structure to flexibly represent the local order relationship among intra-attribute qualitative values.<n>A joint learning mechanism is proposed to iteratively obtain more appropriate tree structures and clusters.<n>Experiments demonstrate that the joint learning adapts the forest to the clustering task to yield accurate results.
arXiv Detail & Related papers (2026-03-03T07:49:50Z) - Provable Learning of Random Hierarchy Models and Hierarchical Shallow-to-Deep Chaining [58.69016084278948]
We consider a hierarchical context-free grammar introduced by arXiv:2307.02129 and conjectured to separate deep and shallow networks.<n>We prove that, under mild conditions, a deep convolutional network can be efficiently trained to learn this function class.
arXiv Detail & Related papers (2026-01-27T16:19:54Z) - Hierarchical Transformers for Unsupervised 3D Shape Abstraction [29.488408094629545]
We introduce HiT, a novel hierarchical neural field representation for 3D shapes.<n>HiT learns general hierarchies in a coarse-to-fine manner across different shape categories in an unsupervised setting.<n>We demonstrate its effectiveness through an unsupervised shape segmentation task over all 55 ShapeNet categories.
arXiv Detail & Related papers (2025-10-31T01:19:05Z) - The Geometry of Meaning: Perfect Spacetime Representations of Hierarchical Structures [0.0]
We show that there is a fast algorithm that embeds hierarchical structures in three-dimensional Minkowski spacetime.<n>Our results seem to indicate that all discrete data has a perfect geometrical representation that is three-dimensional.
arXiv Detail & Related papers (2025-05-07T20:41:06Z) - ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval [64.44265315244579]
We propose a tree-based method for organizing and representing reference documents at various granular levels.<n>Our method, called ReTreever, jointly learns a routing function per internal node of a binary tree such that query and reference documents are assigned to similar tree branches.<n>Our evaluations show that ReTreever generally preserves full representation accuracy.
arXiv Detail & Related papers (2025-02-11T21:35:13Z) - Integrating Hierarchical Semantic into Iterative Generation Model for Entailment Tree Explanation [7.5496857647335585]
We propose an architecture of integrating the Hierarchical Semantics of sentences under the framework of Controller-Generator (HiSCG) to explain answers.
The proposed method achieves comparable performance on all three settings of the EntailmentBank dataset.
arXiv Detail & Related papers (2024-09-26T11:46:58Z) - When Does Bottom-up Beat Top-down in Hierarchical Community Detection? [8.097200145973389]
Hierarchical clustering of networks consists in finding a tree of communities, such that lower levels of the hierarchy reveal finer-grained community structures.
Agglomerative ($textitbottom-up$) algorithms first identify the smallest community structure and then repeatedly merge the communities using a $textitlinkage$ method.
We establish theoretical guarantees for the recovery of the hierarchical tree and community structure of a Hierarchical Block Model by a bottom-up algorithm.
arXiv Detail & Related papers (2023-06-01T15:55:46Z) - Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z) - Use All The Labels: A Hierarchical Multi-Label Contrastive Learning
Framework [75.79736930414715]
We present a hierarchical multi-label representation learning framework that can leverage all available labels and preserve the hierarchical relationship between classes.
We introduce novel hierarchy preserving losses, which jointly apply a hierarchical penalty to the contrastive loss, and enforce the hierarchy constraint.
arXiv Detail & Related papers (2022-04-27T21:41:44Z) - Modeling Heterogeneous Hierarchies with Relation-specific Hyperbolic
Cones [64.75766944882389]
We present ConE (Cone Embedding), a KG embedding model that is able to simultaneously model multiple hierarchical as well as non-hierarchical relations in a knowledge graph.
In particular, ConE uses cone containment constraints in different subspaces of the hyperbolic embedding space to capture multiple heterogeneous hierarchies.
Our approach yields new state-of-the-art Hits@1 of 45.3% on WN18RR and 16.1% on DDB14 (0.231 MRR)
arXiv Detail & Related papers (2021-10-28T07:16:08Z) - Inducing a hierarchy for multi-class classification problems [11.58041597483471]
In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not.
In this paper, we investigate a class of methods that induce a hierarchy that can similarly improve classification performance over flat classifiers.
We demonstrate the effectiveness of the class of methods both for discovering a latent hierarchy and for improving accuracy in principled simulation settings and three real data applications.
arXiv Detail & Related papers (2021-02-20T05:40:42Z) - Hierarchical community structure in networks [1.9981375888949475]
We present a theoretical study on hierarchical community structure in networks.
We address the following questions: 1) How should we define a hierarchy of communities? 2) How do we determine if there is sufficient evidence of a hierarchical structure in a network?
arXiv Detail & Related papers (2020-09-15T16:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.