Related papers: Robust estimation of tree structured models

Robust estimation of tree structured models

URL: http://arxiv.org/abs/2102.05472v1
Date: Wed, 10 Feb 2021 14:58:40 GMT
Title: Robust estimation of tree structured models
Authors: Marta Casanellas, Marina Garrote-L\'opez and Piotr Zwiernik
Abstract summary: We show that it is possible to recover trees from noisy binary data up to a small equivalence class of possible trees. We also provide a characterisation of when the Chow-Liu algorithm consistently learns the underlying tree from the noisy data.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Consider the problem of learning undirected graphical models on trees from corrupted data. Recently Katiyar et al. showed that it is possible to recover trees from noisy binary data up to a small equivalence class of possible trees. Their other paper on the Gaussian case follows a similar pattern. By framing this as a special phylogenetic recovery problem we largely generalize these two settings. Using the framework of linear latent tree models we discuss tree identifiability for binary data under a continuous corruption model. For the Ising and the Gaussian tree model we also provide a characterisation of when the Chow-Liu algorithm consistently learns the underlying tree from the noisy data.

Related papers

Experiments with Optimal Model Trees [2.8391355909797644]
We show that globally optimal model trees can achieve competitive accuracy with very small trees. We also compare to classic optimal and greedily grown decision trees, random forests, and support vector machines.
arXiv Detail & Related papers (2025-03-17T08:03:47Z)
Learning Decision Trees as Amortized Structure Inference [59.65621207449269]
We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data. We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks.
arXiv Detail & Related papers (2025-03-10T07:05:07Z)
Decision Trees for Interpretable Clusters in Mixture Models and Deep Representations [5.65604054654671]
We introduce the notion of an explainability-to-noise ratio for mixture models. We propose an algorithm that takes as input a mixture model and constructs a suitable tree in data-independent time. We prove upper and lower bounds on the error rate of the resulting decision tree.
arXiv Detail & Related papers (2024-11-03T14:00:20Z)
Learning Staged Trees from Incomplete Data [1.6327794667678908]
We introduce the first algorithms for staged trees that handle missingness within the learning of the model. A computational experiment showcases the performance of the novel learning algorithms.
arXiv Detail & Related papers (2024-05-28T16:00:23Z)
Forecasting with Hyper-Trees [50.72190208487953]
Hyper-Trees are designed to learn the parameters of time series models. By relating the parameters of a target time series model to features, Hyper-Trees also address the issue of parameter non-stationarity. In this novel approach, the trees first generate informative representations from the input features, which a shallow network then maps to the target model parameters.
arXiv Detail & Related papers (2024-05-13T15:22:15Z)
Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure. We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance. We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $rho$ is at least $0.8$. Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z)
Spectral Top-Down Recovery of Latent Tree Models [13.681975313065477]
Spectral Top-Down Recovery (STDR) is a divide-and-conquer approach for inference of large latent tree models. STDR's partitioning step is non-random. Instead, it is based on the Fiedler vector of a suitable Laplacian matrix related to the observed nodes. We prove that STDR is statistically consistent, and bound the number of samples required to accurately recover the tree with high probability.
arXiv Detail & Related papers (2021-02-26T02:47:42Z)
Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data. We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
SGA: A Robust Algorithm for Partial Recovery of Tree-Structured Graphical Models with Noisy Samples [75.32013242448151]
We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise. Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure. We propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
arXiv Detail & Related papers (2021-01-22T01:57:35Z)
Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects. We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions. Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve better or comparable performance than [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
Tensor Decompositions in Recursive Neural Networks for Tree-Structured Data [12.069862650316262]
We introduce two new aggregation functions to encode structural knowledge from tree-structured data. We test them on two tree classification tasks, showing the advantage of proposed models when tree outdegree increases.
arXiv Detail & Related papers (2020-06-18T15:40:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.