Related papers: eTREE: Learning Tree-structured Embeddings

URL: http://arxiv.org/abs/2012.10853v1
Date: Sun, 20 Dec 2020 06:06:08 GMT
Title: eTREE: Learning Tree-structured Embeddings
Authors: Faisal M. Almutairi, Yunlong Wang, Dong Wang, Emily Zhao, Nicholas D. Sidiropoulos
Abstract summary: Matrix factorization (MF) plays an important role in a wide range of machine learning and data mining models. MF is commonly used to obtain item embeddings and feature representations. We propose eTREE, a model that incorporates the tree structure to enhance the quality of the embeddings.
Score: 33.61635854505735
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Matrix factorization (MF) plays an important role in a wide range of machine learning and data mining models. MF is commonly used to obtain item embeddings and feature representations due to its ability to capture correlations and higher-order statistical dependencies across dimensions. In many applications, the categories of items exhibit a hierarchical tree structure. For instance, human diseases can be divided into coarse categories, e.g., bacterial and viral. These categories can be further divided into finer categories, e.g., viral infections can be respiratory, gastrointestinal, or exanthematous viral diseases. In e-commerce, products, movies, books, etc., are grouped into hierarchical categories, e.g., clothing items are divided by gender, then by type (formal, casual, etc.). While the tree structure and the categories of the different items may be known in some applications, they have to be learned together with the embeddings in many others. In this work, we propose eTREE, a model that incorporates the (usually ignored) tree structure to enhance the quality of the embeddings. We leverage the special uniqueness properties of Nonnegative MF (NMF) to prove identifiability of eTREE. The proposed model not only exploits the tree structure prior, but also learns the hierarchical clustering in an unsupervised data-driven fashion. We derive an efficient algorithmic solution and a scalable implementation of eTREE that exploits parallel computing, computation caching, and warm start strategies. We showcase the effectiveness of eTREE on real data from various application domains: healthcare, recommender systems, and education. We also demonstrate the meaningfulness of the tree obtained from eTREE by means of domain experts' interpretation.
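
To make the idea of a tree-structure prior on NMF embeddings concrete, here is a minimal sketch. It is not the eTREE algorithm from the paper: it assumes a known, one-level tree (each item has a fixed parent group), uses plain projected gradient descent, and pulls each item embedding toward the centroid of its group. The function name `tree_regularized_nmf` and all parameters (`parent`, `rank`, `lam`, `step`, `iters`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def tree_regularized_nmf(X, parent, rank=2, lam=0.1, step=0.01, iters=500, seed=0):
    """Illustrative sketch (not the paper's method): X ~ A @ B.T with A, B >= 0.

    Each row of B (an item embedding) is pulled toward the centroid of the
    items that share its parent group. The tree is assumed known and fixed.
    """
    parent = np.asarray(parent)
    rng = np.random.default_rng(seed)
    n_rows, n_items = X.shape
    A = rng.random((n_rows, rank))
    B = rng.random((n_items, rank))
    groups = np.unique(parent)
    idx = np.searchsorted(groups, parent)  # position of each item's group in S
    for _ in range(iters):
        # Parent embeddings: centroid of the children, held fixed for this step.
        S = np.vstack([B[parent == g].mean(axis=0) for g in groups])
        R = A @ B.T - X                       # reconstruction residual
        grad_A = R @ B
        grad_B = R.T @ A + lam * (B - S[idx])  # data fit + pull toward parent
        # Gradient step followed by projection onto the nonnegative orthant.
        A = np.maximum(A - step * grad_A, 0.0)
        B = np.maximum(B - step * grad_B, 0.0)
    return A, B

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((6, 4))          # toy nonnegative data: 6 rows, 4 items
    parent = [0, 0, 1, 1]           # items 0,1 share a parent; so do items 2,3
    A, B = tree_regularized_nmf(X, parent)
    print("reconstruction error:", np.linalg.norm(X - A @ B.T))
```

Setting `lam=0` reduces the sketch to ordinary projected-gradient NMF; larger values weight the pull toward the group centroid more heavily. Learning the tree itself, which the abstract describes, is outside the scope of this sketch.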

Related papers

Tree-based variational inference for Poisson log-normal models [47.82745603191512]
Hierarchical trees are often used to organize entities based on proximity criteria. Current count-data models do not leverage this structured information. We introduce the PLN-Tree model as an extension of the PLN model for modeling hierarchical count data.
arXiv Detail & Related papers (2024-06-25T08:24:35Z)
Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles [6.664930499708017]
The Shapley value (SV) is a concept in explainable artificial intelligence (XAI) research for quantifying additive feature attributions of predictions. We present TreeSHAP-IQ, an efficient method to compute any-order additive Shapley interactions for predictions of tree-based models.
arXiv Detail & Related papers (2024-01-22T16:08:41Z)
Effective and Efficient Federated Tree Learning on Hybrid Data [80.31870543351918]
We propose HybridTree, a novel federated learning approach that enables federated tree learning on hybrid data. We observe the existence of consistent split rules in trees and show that the knowledge of parties can be incorporated into the lower layers of a tree. Our experiments demonstrate that HybridTree can achieve comparable accuracy to the centralized setting with low computational and communication overhead.
arXiv Detail & Related papers (2023-10-18T10:28:29Z)
Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure. We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance. We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
Flexible Modeling and Multitask Learning using Differentiable Tree Ensembles [6.037383467521294]
We propose a flexible framework for learning tree ensembles to support arbitrary loss functions, missing responses, and multi-task learning. Our framework builds on differentiable tree ensembles, which can be trained using first-order methods. We show that our framework can lead to 100x more compact and 23% more expressive tree ensembles than those by popular toolkits.
arXiv Detail & Related papers (2022-05-19T17:30:49Z)
Learning Latent and Hierarchical Structures in Cognitive Diagnosis Models [3.4646560112467037]
A key component of Cognitive Diagnosis Models (CDMs) is a binary $Q$-matrix characterizing the dependence structure between the items and the latent attributes. This paper considers the problem of jointly learning these latent and hierarchical structures in CDMs from observed data. An efficient expectation-maximization algorithm and a latent structure recovery algorithm are developed.
arXiv Detail & Related papers (2021-04-05T20:33:02Z)
Exemplars can Reciprocate Principal Components [0.0]
Category Trees is a clustering method that creates tree structures that branch on category type and not feature. The theory is demonstrated using the Portugal Forest Fires dataset as a case study.
arXiv Detail & Related papers (2021-03-22T12:46:29Z)
Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data. We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
Hierarchical Graph Capsule Network [78.4325268572233]
We propose the hierarchical graph capsule network (HGCN), which can jointly learn node embeddings and extract graph hierarchies. To learn the hierarchical representation, HGCN characterizes the part-whole relationship between lower-level capsules (part) and higher-level capsules (whole).
arXiv Detail & Related papers (2020-12-16T04:13:26Z)
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes. We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution. Our method, termed Forest R-CNN, can serve as a plug-and-play module applicable to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
