SGA: A Robust Algorithm for Partial Recovery of Tree-Structured
Graphical Models with Noisy Samples
- URL: http://arxiv.org/abs/2101.08917v1
- Date: Fri, 22 Jan 2021 01:57:35 GMT
- Title: SGA: A Robust Algorithm for Partial Recovery of Tree-Structured
Graphical Models with Noisy Samples
- Authors: Anshoo Tandon, Aldric H. J. Yuan, Vincent Y. F. Tan
- Abstract summary: We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise.
Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure.
We propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
- Score: 75.32013242448151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider learning Ising tree models when the observations from the nodes
are corrupted by independent but non-identically distributed noise with unknown
statistics. Katiyar et al. (2020) showed that although the exact tree structure
cannot be recovered, one can recover a partial tree structure; that is, a
structure belonging to the equivalence class containing the true tree. This
paper presents a systematic improvement of Katiyar et al. (2020). First, we
present a novel impossibility result by deriving a bound on the necessary
number of samples for partial recovery. Second, we derive a significantly
improved sample complexity result in which the dependence on the minimum
correlation $\rho_{\min}$ is $\rho_{\min}^{-8}$ instead of $\rho_{\min}^{-24}$.
Finally, we propose Symmetrized Geometric Averaging (SGA), a more statistically
robust algorithm for partial tree recovery. We provide error exponent analyses
and extensive numerical results on a variety of trees to show that the sample
complexity of SGA is significantly better than the algorithm of Katiyar et al.
(2020). SGA can be readily extended to Gaussian models and is shown via
numerical experiments to be similarly superior.
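To see why per-node noise with unknown statistics precludes exact recovery, note that independent sign-flip noise scales each pairwise correlation multiplicatively. The following sketch (illustrative only; the function name and setup are not from the paper) verifies this exactly for two zero-mean ±1 variables:

```python
import itertools

def noisy_correlation(rho, q1, q2):
    """Exact E[X'Y'] when X, Y in {-1,+1} have E[X]=E[Y]=0, E[XY]=rho,
    and each observation is flipped independently with probs q1, q2."""
    # Joint pmf of (X, Y) with zero means and correlation rho.
    p = {(x, y): (1 + x * y * rho) / 4 for x in (-1, 1) for y in (-1, 1)}
    exy = 0.0
    for (x, y), pxy in p.items():
        # Enumerate flip patterns: fx (fy) = 1 means X (Y) is sign-flipped.
        for fx, fy in itertools.product((0, 1), repeat=2):
            pf = (q1 if fx else 1 - q1) * (q2 if fy else 1 - q2)
            exy += pxy * pf * (x * (-1) ** fx) * (y * (-1) ** fy)
    return exy

# Observed correlation equals (1 - 2*q1) * (1 - 2*q2) * rho, e.g.
# noisy_correlation(0.6, 0.1, 0.2) gives 0.6 * 0.8 * 0.6 = 0.288.
```

Because the flip probabilities are unknown, these per-node scaling factors cannot be separated from the true edge correlations, which is why only the equivalence class of the tree is identifiable.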
Related papers
- Sharded Bayesian Additive Regression Trees [1.4213973379473654]
We introduce a randomization auxiliary variable and a sharding tree to decide the partitioning of the data.
By observing that the optimal design of a sharding tree can determine optimal sharding for sub-models on a product space, we introduce an intersection tree structure to completely specify both the sharding and modeling using only tree structures.
arXiv Detail & Related papers (2023-06-01T05:41:31Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
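The merge rule described above can be sketched in a few lines (a hypothetical reimplementation for illustration, not the authors' code; `agglomerate_by_dot` is an assumed name):

```python
import numpy as np

def agglomerate_by_dot(X):
    """Agglomerative clustering that merges the pair of clusters with the
    maximum average pairwise dot product (rather than minimum distance),
    returning the sequence of merges."""
    clusters = {i: [i] for i in range(len(X))}
    G = X @ X.T  # Gram matrix of all pairwise dot products
    merges = []
    while len(clusters) > 1:
        best, pair = -np.inf, None
        keys = list(clusters)
        for a in range(len(keys)):
            for b in range(a + 1, len(keys)):
                i, j = keys[a], keys[b]
                # Average dot product between the two clusters.
                s = G[np.ix_(clusters[i], clusters[j])].mean()
                if s > best:
                    best, pair = s, (i, j)
        i, j = pair
        merges.append((clusters[i][:], clusters[j][:]))
        clusters[i] = clusters[i] + clusters.pop(j)
    return merges
```

For example, rows pointing in similar directions are merged first regardless of their norms, which is the behavior the dot-product criterion buys over minimum-distance linkage.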
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
- Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z)
- Decentralized Learning of Tree-Structured Gaussian Graphical Models from Noisy Data [1.52292571922932]
This paper studies the decentralized learning of tree-structured Gaussian graphical models (GGMs) from noisy data.
GGMs are widely used to model complex networks such as gene regulatory networks and social networks.
The proposed decentralized learning uses the Chow-Liu algorithm for estimating the tree-structured GGM.
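For reference, the Chow-Liu step for the Gaussian case amounts to a maximum-weight spanning tree over pairwise dependences; since Gaussian mutual information is monotone in $|\rho|$, the absolute correlation serves as the edge weight. A minimal centralized sketch (an assumption, not the paper's decentralized implementation):

```python
import numpy as np

def chow_liu_tree(samples):
    """Chow-Liu skeleton for a Gaussian tree model: maximum-weight
    spanning tree with |correlation| edge weights (a valid proxy for
    Gaussian mutual information, which is monotone in |rho|)."""
    R = np.corrcoef(samples, rowvar=False)
    p = R.shape[0]
    edges = sorted(((abs(R[i, j]), i, j)
                    for i in range(p) for j in range(i + 1, p)), reverse=True)
    parent = list(range(p))

    def find(x):  # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, i, j in edges:  # Kruskal: add heaviest edge that makes no cycle
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

On samples drawn from a chain $X_0 \to X_1 \to X_2$, the recovered edges are $(0,1)$ and $(1,2)$, since the indirect correlation between $X_0$ and $X_2$ is the product of the edge correlations and hence smaller.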
arXiv Detail & Related papers (2021-09-22T10:41:18Z)
- Robustifying Algorithms of Learning Latent Trees with Vector Variables [92.18777020401484]
We present the sample complexities of Recursive Grouping (RG) and Chow-Liu Recursive Grouping (CLRG).
We robustify RG, CLRG, Neighbor Joining (NJ) and Spectral NJ (SNJ) by using the truncated inner product.
We derive the first known instance-dependent impossibility result for structure learning of latent trees.
arXiv Detail & Related papers (2021-06-02T01:37:52Z)
- Spectral Top-Down Recovery of Latent Tree Models [13.681975313065477]
Spectral Top-Down Recovery (STDR) is a divide-and-conquer approach for inference of large latent tree models.
STDR's partitioning step is non-random. Instead, it is based on the Fiedler vector of a suitable Laplacian matrix related to the observed nodes.
We prove that STDR is statistically consistent, and bound the number of samples required to accurately recover the tree with high probability.
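The Fiedler-vector split can be illustrated with a generic spectral-partitioning sketch (a toy under stated assumptions, not STDR itself): build the Laplacian of a similarity matrix over the observed nodes and split by the sign of the eigenvector with the second-smallest eigenvalue.

```python
import numpy as np

def fiedler_partition(W):
    """Split nodes into two groups by the sign of the Fiedler vector,
    i.e. the eigenvector of the (unnormalized) graph Laplacian
    associated with the second-smallest eigenvalue."""
    d = W.sum(axis=1)
    L = np.diag(d) - W           # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    fiedler = vecs[:, 1]
    return fiedler >= 0          # boolean group membership
```

On a graph made of two dense groups joined by one weak edge, the sign pattern of the Fiedler vector recovers the two groups, which is what makes it a natural non-random partitioning step.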
arXiv Detail & Related papers (2021-02-26T02:47:42Z)
- Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in a low-dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
- Exact Asymptotics for Learning Tree-Structured Graphical Models with Side Information: Noiseless and Noisy Samples [75.97774982432976]
We derive the exact asymptotics of learning an Ising tree-structured graphical model from independently drawn samples.
We extend our results to the scenario in which the samples are observed in random noise.
Our theoretical results demonstrate keen agreement with experimental results for sample sizes as small as a few hundred.
arXiv Detail & Related papers (2020-05-09T02:42:40Z)
- Spectral neighbor joining for reconstruction of latent tree models [5.229354894035374]
We develop Spectral Neighbor Joining (SNJ), a novel method to recover the structure of latent tree graphical models.
We prove that SNJ is consistent, and derive a sufficient condition for correct tree recovery from an estimated similarity matrix.
We illustrate via extensive simulations that in comparison to several other reconstruction methods, SNJ requires fewer samples to accurately recover trees with a large number of leaves or long edges.
arXiv Detail & Related papers (2020-02-28T05:13:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.