Related papers: Hierarchical clustering: visualization, feature importance and model selection

URL: http://arxiv.org/abs/2112.01372v1
Date: Tue, 30 Nov 2021 20:38:17 GMT
Title: Hierarchical clustering: visualization, feature importance and model selection
Authors: Luben M. C. Cabezas, Rafael Izbicki, Rafael B. Stern
Abstract summary: We propose methods for the analysis of hierarchical clustering that fully use the multi-resolution structure provided by a dendrogram. The key insight behind the proposed methods is to view a dendrogram as a phylogeny. Real and simulated datasets provide evidence that our proposed framework has desirable outcomes.
Score: 4.017760528208122
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose methods for the analysis of hierarchical clustering that fully use the multi-resolution structure provided by a dendrogram. Specifically, we propose a loss for choosing between clustering methods, a feature importance score and a graphical tool for visualizing the segmentation of features in a dendrogram. Current approaches to these tasks lead to loss of information since they require the user to generate a single partition of the instances by cutting the dendrogram at a specified level. Our proposed methods, instead, use the full structure of the dendrogram. The key insight behind the proposed methods is to view a dendrogram as a phylogeny. This analogy permits the assignment of a feature value to each internal node of a tree through ancestral state reconstruction. Real and simulated datasets provide evidence that our proposed framework has desirable outcomes. We provide an R package that implements our methods.
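The abstract's central idea, assigning a feature value to every internal node of a dendrogram, can be illustrated with a minimal Python sketch. It rests on loud assumptions: the function name `internal_node_values` is hypothetical, and the reconstruction rule used here (each internal node takes the leaf-count-weighted mean of its two children) is a simple stand-in, not the estimator from the paper or its R package. Only SciPy's `linkage` and `to_tree` are real library calls.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, to_tree


def internal_node_values(X, feature, method="average"):
    """Assign a value of one feature to every node of a dendrogram.

    Assumption (not the paper's method): an internal node takes the
    leaf-count-weighted mean of its two children, which equals the mean
    of the feature over all leaves below that node.
    """
    X = np.asarray(X, dtype=float)
    Z = linkage(X, method=method)
    # With rd=True, to_tree also returns a list in which nodes[i].id == i.
    # Ids 0..n-1 are leaves; ids n..2n-2 are merges in the order they occur,
    # so children always appear before their parent.
    root, nodes = to_tree(Z, rd=True)

    values = {}
    for node in nodes:
        if node.is_leaf():
            values[node.id] = float(X[node.id, feature])
        else:
            left, right = node.get_left(), node.get_right()
            weighted_sum = (left.get_count() * values[left.id]
                            + right.get_count() * values[right.id])
            values[node.id] = weighted_sum / node.get_count()
    return Z, root, values


if __name__ == "__main__":
    # Two well-separated pairs of points on a line (toy data).
    X = np.array([[0.0], [1.0], [10.0], [11.0]])
    Z, root, values = internal_node_values(X, feature=0)
    for node_id in sorted(values):
        print(node_id, values[node_id])
```

Because every node carries a value, a feature can be inspected at all resolutions of the tree rather than at a single cut, which is the kind of information the abstract says is lost when the dendrogram is reduced to one partition.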

Related papers

Graph-based Semi-supervised and Unsupervised Methods for Local Clustering [0.0]
Local clustering aims to identify specific substructures within a large graph without requiring full knowledge of the entire graph. We first propose a method for identifying specific local clusters when very little labeled data is given, which we term semi-supervised local clustering. We then extend this approach to the unsupervised setting, where no prior information on labels is available.
arXiv Detail & Related papers (2025-04-28T02:10:18Z)
ClusterGraph: a new tool for visualization and compression of multidimensional data [0.0]
This paper provides an additional layer on the output of any clustering algorithm. It provides information about the global layout of clusters, obtained from the considered clustering algorithm.
arXiv Detail & Related papers (2024-11-08T09:40:54Z)
Fast and Scalable Semi-Supervised Learning for Multi-View Subspace Clustering [13.638434337947302]
FSSMSC is a novel solution to the high computational complexity commonly found in existing approaches. The method generates a consensus anchor graph across all views, representing each data point as a sparse linear combination of chosen landmarks. The effectiveness and efficiency of FSSMSC are validated through extensive experiments on multiple benchmark datasets of varying scales.
arXiv Detail & Related papers (2024-08-11T06:54:00Z)
Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well-established agglomerative clustering algorithm, focusing on recovery of hierarchical structure. We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance. We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
GrannGAN: Graph annotation generative adversarial networks [72.66289932625742]
We consider the problem of modelling high-dimensional distributions and generating new examples of data with complex relational feature structure coherent with a graph skeleton. The model we propose tackles the problem of generating the data features constrained by the specific graph structure of each data point by splitting the task into two phases. In the first, it models the distribution of features associated with the nodes of the given graph; in the second, it complements the edge features conditionally on the node features.
arXiv Detail & Related papers (2022-12-01T11:49:07Z)
SHGNN: Structure-Aware Heterogeneous Graph Neural Network [77.78459918119536]
This paper proposes a novel Structure-Aware Heterogeneous Graph Neural Network (SHGNN) to address the above limitations. We first utilize a feature propagation module to capture the local structure information of intermediate nodes in the meta-path. Next, we use a tree-attention aggregator to incorporate the graph structure information into the aggregation module on the meta-path. Finally, we leverage a meta-path aggregator to fuse the information aggregated from different meta-paths.
arXiv Detail & Related papers (2021-12-12T14:18:18Z)
Effective and Efficient Graph Learning for Multi-view Clustering [173.8313827799077]
We propose an effective and efficient graph learning model for multi-view clustering. Our method exploits the similarity between graphs of different views by minimizing the tensor Schatten p-norm. Our proposed algorithm is time-economical, obtains stable results, and scales well with the data size.
arXiv Detail & Related papers (2021-08-15T13:14:28Z)
Towards Clustering-friendly Representations: Subspace Clustering via Graph Filtering [16.60975509085194]
We propose a graph filtering approach by which a smooth representation is achieved. Experiments on image and document clustering datasets demonstrate that our method improves upon state-of-the-art subspace clustering techniques. An ablation study shows that graph filtering can remove noise, preserve structure in the image, and increase the separability of classes.
arXiv Detail & Related papers (2021-06-18T02:21:36Z)
Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data. Our method uses the self-expressiveness of samples to capture the global structure and an adaptive neighbor approach to respect the local structure. Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
Graph Neural Networks with Composite Kernels [60.81504431653264]
We re-interpret node aggregation from the perspective of kernel weighting. We present a framework to consider feature similarity in an aggregation scheme. We propose feature aggregation as the composition of the original neighbor-based kernel and a learnable kernel to encode feature similarities in a feature space.
arXiv Detail & Related papers (2020-05-16T04:44:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.