Exemplars can Reciprocate Principal Components
- URL: http://arxiv.org/abs/2103.12069v1
- Date: Mon, 22 Mar 2021 12:46:29 GMT
- Title: Exemplars can Reciprocate Principal Components
- Authors: Kieran Greer
- Abstract summary: Category Trees is a clustering method that creates tree structures that branch on category type and not feature.
The theory is demonstrated using the Portugal Forest Fires dataset as a case study.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a clustering algorithm that is an extension of the
Category Trees algorithm. Category Trees is a clustering method that creates
tree structures that branch on category type and not feature. The development
in this paper is to consider a secondary order of clustering that is not the
category to which the data row belongs, but the tree, representing a single
classifier, that it is eventually clustered with. Each tree branches to store
subsets of other categories, but the rows in those subsets may also be related.
This paper is therefore concerned with looking at that second level of
clustering between the other category subsets, to try to determine if there is
any consistency over it. It is argued that Principal Components may be a
related and reciprocal type of structure, and there is an even bigger question
about the relation between exemplars and principal components, in general. The
theory is demonstrated using the Portugal Forest Fires dataset as a case study.
The distributed nature of that dataset can artificially create the tree
categories and the output criterion can also be determined in an automatic and
arbitrary way, leading to a flexible and dynamic clustering mechanism.
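The abstract raises the question of how exemplars relate to principal components. As a purely illustrative sketch (not the paper's Category Trees algorithm), the following toy code computes both for a small 2-D dataset: the exemplar as the medoid (the point minimizing total distance to the others) and the first principal component via power iteration on the covariance matrix. The dataset and helper names are invented for illustration.

```python
import math

# Toy 2-D dataset, roughly elongated along the diagonal y = x.
data = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 4.2), (5.0, 4.8)]

def medoid(points):
    """Exemplar: the point with minimum total distance to all others."""
    def total_dist(p):
        return sum(math.dist(p, q) for q in points)
    return min(points, key=total_dist)

def first_principal_component(points, iters=100):
    """Leading eigenvector of the 2x2 covariance matrix, by power iteration."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centred = [(x - mx, y - my) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centred) / n
    cxy = sum(x * y for x, y in centred) / n
    cyy = sum(y * y for _, y in centred) / n
    v = (1.0, 0.0)
    for _ in range(iters):
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(*w)
        v = (w[0] / norm, w[1] / norm)
    return v

print("exemplar:", medoid(data))
print("principal axis:", first_principal_component(data))
```

For this elongated cloud the exemplar is the central point and the principal axis points along the diagonal, which gives one concrete sense in which the two summaries of the data can agree.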
Related papers
- I Want 'Em All (At Once) -- Ultrametric Cluster Hierarchies [11.69714244591334]
We show that, for any reasonable hierarchy, one can optimally solve any center-based clustering objective over it.
We conclude by verifying the utility of our proposed techniques across datasets, hierarchies, and partitioning schemes.
arXiv Detail & Related papers (2025-02-19T18:03:52Z)
- ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval [64.44265315244579]
We propose a tree-based method for organizing and representing reference documents at various granular levels.
Our method, called ReTreever, jointly learns a routing function per internal node of a binary tree such that query and reference documents are assigned to similar tree branches.
Our evaluations show that ReTreever generally preserves full representation accuracy.
arXiv Detail & Related papers (2025-02-11T21:35:13Z)
- ABCDE: Application-Based Cluster Diff Evals [49.1574468325115]
It aims to be practical: items can carry application-specific importance values, human judgements are used frugally when determining which clustering is better, and metrics can be reported for arbitrary slices of items.
The approach to measuring the delta in clustering quality is novel: instead of constructing an expensive ground truth up front and evaluating each clustering against it, ABCDE samples questions for judgement on the basis of the actual diffs between the clusterings.
arXiv Detail & Related papers (2024-07-31T08:29:35Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
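The variant recommended in that abstract (merging by maximum average dot product rather than minimum distance) can be sketched in a few lines. This is a toy illustration under assumed details, not the authors' implementation; the dataset, function names, and stopping rule are invented for the example.

```python
# Agglomerative clustering where, at each step, the two clusters with the
# MAXIMUM average cross-cluster dot product are merged (not minimum distance).

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def avg_dot(c1, c2):
    """Average dot product over all cross-cluster point pairs."""
    return sum(dot(p, q) for p in c1 for q in c2) / (len(c1) * len(c2))

def agglomerate(points, n_clusters):
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the highest average dot product.
        i, j = max(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: avg_dot(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Two directions: points near (1, 0) versus points near (0, 1).
pts = [(1.0, 0.1), (0.9, 0.0), (0.1, 1.0), (0.0, 0.9)]
print(agglomerate(pts, 2))
```

Points aligned in the same direction have large mutual dot products, so the two "directions" in the toy data are recovered as the two clusters; in the full paper this merge rule is what allows recovery of generative hierarchical structure.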
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
- Controlling the False Split Rate in Tree-Based Aggregation [11.226095593522691]
We propose a hypothesis testing algorithm for tree-based aggregation.
We focus on two main examples of tree-based aggregation, one which involves aggregating means and the other which involves aggregating regression coefficients.
arXiv Detail & Related papers (2021-08-11T17:59:22Z)
- eTREE: Learning Tree-structured Embeddings [33.61635854505735]
Matrix factorization (MF) plays an important role in a wide range of machine learning and data mining models.
MF is commonly used to obtain item embeddings and feature representations.
We propose eTREE, a model that incorporates the tree structure to enhance the quality of the embeddings.
arXiv Detail & Related papers (2020-12-20T06:06:08Z)
- Exploring the Hierarchy in Relation Labels for Scene Graph Generation [75.88758055269948]
Experiments show that the proposed simple yet effective method can improve several state-of-the-art baselines by a large margin (up to 33% relative gain) in terms of Recall@50.
arXiv Detail & Related papers (2020-09-12T17:36:53Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
- Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes.
We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution.
Our method, termed as Forest R-CNN, can serve as a plug-and-play module being applied to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
- Deep Hierarchical Classification for Category Prediction in E-commerce System [16.6932395109085]
In e-commerce systems, category prediction is the task of automatically predicting the categories of given texts.
We propose a Deep Hierarchical Classification framework, which incorporates the multi-scale hierarchical information in neural networks.
We also define a novel combined loss function to punish hierarchical prediction losses.
arXiv Detail & Related papers (2020-05-14T02:29:14Z)
- Tree Index: A New Cluster Evaluation Technique [2.790947019327459]
We introduce a cluster evaluation technique called Tree Index.
The Tree Index finds margins among clusters to enable easy learning, without the complications of Minimum Description Length.
We show that, on the clustering results (obtained by various techniques) on a brain dataset, Tree Index discriminates between reasonable and non-sensible clusters.
arXiv Detail & Related papers (2020-03-24T13:41:12Z)
- Scalable Hierarchical Clustering with Tree Grafting [66.68869706310208]
Grinch is a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions.
Grinch is motivated by a new notion of separability for clustering with linkage functions.
arXiv Detail & Related papers (2019-12-31T20:56:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.