Estimating Higher-Order Mixed Memberships via the $\ell_{2,\infty}$
Tensor Perturbation Bound
- URL: http://arxiv.org/abs/2212.08642v3
- Date: Thu, 1 Feb 2024 15:37:11 GMT
- Title: Estimating Higher-Order Mixed Memberships via the $\ell_{2,\infty}$
Tensor Perturbation Bound
- Authors: Joshua Agterberg and Anru Zhang
- Abstract summary: We propose the tensor mixed-membership blockmodel, a generalization of the tensor blockmodel.
We establish the identifiability of our model and propose a computationally efficient estimation procedure.
We apply our methodology to real and simulated data, demonstrating some effects not identifiable from the model with discrete community memberships.
- Score: 8.521132000449766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Higher-order multiway data is ubiquitous in machine learning and statistics
and often exhibits community-like structures, where each component (node) along
each different mode has a community membership associated with it. In this
paper we propose the tensor mixed-membership blockmodel, a generalization of
the tensor blockmodel positing that memberships need not be discrete, but
instead are convex combinations of latent communities. We establish the
identifiability of our model and propose a computationally efficient estimation
procedure based on the higher-order orthogonal iteration algorithm (HOOI) for
tensor SVD composed with a simplex corner-finding algorithm. We then
demonstrate the consistency of our estimation procedure by providing a per-node
error bound, which showcases the effect of higher-order structures on
estimation accuracy. To prove our consistency result, we develop the
$\ell_{2,\infty}$ tensor perturbation bound for HOOI under independent,
heteroskedastic, subgaussian noise that may be of independent interest. Our
analysis uses a novel leave-one-out construction for the iterates, and our
bounds depend only on spectral properties of the underlying low-rank tensor
under nearly optimal signal-to-noise ratio conditions such that tensor SVD is
computationally feasible. Finally, we apply our methodology to real and
simulated data, demonstrating some effects not identifiable from the model with
discrete community memberships.
Related papers
- Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation(NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study it a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Subspace clustering in high-dimensions: Phase transitions \&
Statistical-to-Computational gap [24.073221004661427]
A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model.
We provide an exact characterization of the statistically optimal reconstruction error in this model in the high-dimensional regime with extensive sparsity.
arXiv Detail & Related papers (2022-05-26T17:47:35Z) - Nonparametric Sparse Tensor Factorization with Hierarchical Gamma
Processes [16.79618682556073]
We propose a nonparametric factorization approach for sparsely observed tensors.
We use hierarchical Gamma processes and Poisson random measures to construct a tensor-valued process.
For efficient inference, we use Dirichlet process properties over finite sample partitions, density transformations, and random features.
arXiv Detail & Related papers (2021-10-19T16:17:26Z) - Lattice partition recovery with dyadic CART [79.96359947166592]
We study piece-wise constant signals corrupted by additive Gaussian noise over a $d$-dimensional lattice.
Data of this form naturally arise in a host of applications, and the tasks of signal detection or testing, de-noising and estimation have been studied extensively in the statistical and signal processing literature.
In this paper we consider instead the problem of partition recovery, i.e.of estimating the partition of the lattice induced by the constancy regions of the unknown signal.
We prove that a DCART-based procedure consistently estimates the underlying partition at a rate of order $sigma2 k*
arXiv Detail & Related papers (2021-05-27T23:41:01Z) - A Bayesian Approach to Block-Term Tensor Decomposition Model Selection
and Computation [10.91885508254207]
The so-called block-term decomposition (BTD) tensor model, especially in its rank-$(L_r,L_r,1)$ version, has been recently receiving increasing attention.
The challenge of estimating the BTD model structure, namely the number of block terms and their individual ranks, has only recently started to attract significant attention.
A Bayesian approach is taken to addressing the problem of rank-$(L_r,L_r,1)$ BTD model selection and computation, based on the idea of imposing column sparsity emphjointly on the
arXiv Detail & Related papers (2021-01-08T09:37:21Z) - Stochastic Approximation for Online Tensorial Independent Component
Analysis [98.34292831923335]
Independent component analysis (ICA) has been a popular dimension reduction tool in statistical machine learning and signal processing.
In this paper, we present a by-product online tensorial algorithm that estimates for each independent component.
arXiv Detail & Related papers (2020-12-28T18:52:37Z) - Exact Clustering in Tensor Block Model: Statistical Optimality and
Computational Limit [10.8145995157397]
High-order clustering aims to identify heterogeneous substructure in multiway dataset.
Non- computation and nature of the problem poses significant challenges in both statistics and statistics.
arXiv Detail & Related papers (2020-12-18T00:48:27Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores)
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Probabilistic Circuits for Variational Inference in Discrete Graphical
Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating Evidence Lower Bound (ELBO)
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPN)
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is aweighted the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z) - Multi-View Spectral Clustering Tailored Tensor Low-Rank Representation [105.33409035876691]
This paper explores the problem of multi-view spectral clustering (MVSC) based on tensor low-rank modeling.
We design a novel structured tensor low-rank norm tailored to MVSC.
We show that the proposed method outperforms state-of-the-art methods to a significant extent.
arXiv Detail & Related papers (2020-04-30T11:52:12Z) - A Unified Framework for Coupled Tensor Completion [42.19293115131073]
Coupled tensor decomposition reveals the joint data structure by incorporating priori knowledge that come from the latent coupled factors.
The TR has powerful expression ability and achieves success in some multi-dimensional data processing applications.
The proposed method is validated on numerical experiments on synthetic data, and experimental results on real-world data demonstrate its superiority over the state-of-the-art methods in terms of recovery accuracy.
arXiv Detail & Related papers (2020-01-09T02:15:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.