Multiway Spherical Clustering via Degree-Corrected Tensor Block Models
- URL: http://arxiv.org/abs/2201.07401v1
- Date: Wed, 19 Jan 2022 03:40:22 GMT
- Title: Multiway Spherical Clustering via Degree-Corrected Tensor Block Models
- Authors: Jiaxin Hu, Miaoyan Wang
- Abstract summary: We develop a degree-corrected block model with estimation accuracy guarantees.
In particular, we demonstrate that an intrinsic statistical-to-computational gap emerges only for tensors of order three or greater.
The efficacy of our procedure is demonstrated through two data applications.
- Score: 8.147652597876862
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We consider the problem of multiway clustering in the presence of unknown
degree heterogeneity. Such data problems arise commonly in applications such as
recommendation systems, neuroimaging, community detection, and hypergraph
partitions in social networks. The allowance of degree heterogeneity provides
great flexibility in clustering models, but the extra complexity poses
significant challenges in both statistics and computation. Here, we develop a
degree-corrected tensor block model with estimation accuracy guarantees. We
present the phase transition of clustering performance based on the notion of
angle separability, and we characterize three signal-to-noise regimes
corresponding to different statistical-computational behaviors. In particular,
we demonstrate that an intrinsic statistical-to-computational gap emerges only
for tensors of order three or greater. Further, we develop an efficient
polynomial-time algorithm that provably achieves exact clustering under mild
signal conditions. The efficacy of our procedure is demonstrated through two
data applications, one on the human brain connectome project and another on the
Peru Legislation network dataset.
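To make the idea of angle separability concrete, here is a minimal sketch in Python. It is not the authors' algorithm. It assumes the usual degree-corrected form in which each entry of the noise-free mean tensor is a block value multiplied by one positive degree per index, and the names `dtbm_mean`, `row_cosines`, `core`, `labels`, and `degrees` are hypothetical. Under that assumption, two nodes in the same cluster have proportional rows in the mode-1 unfolding, so their cosine similarity is 1 whatever their degrees are, while nodes from different clusters are separated by a nonzero angle.

```python
import numpy as np


def dtbm_mean(core, labels, degrees):
    """Noise-free mean of an order-3 degree-corrected block tensor (assumed form).

    Entry (i, j, k) equals
    degrees[i] * degrees[j] * degrees[k] * core[labels[i], labels[j], labels[k]].
    """
    # Expand the small core tensor to node level: block[i, j, k] = core[l_i, l_j, l_k].
    block = core[np.ix_(labels, labels, labels)]
    # Outer product of the degree vector with itself along all three modes.
    scale = np.einsum("i,j,k->ijk", degrees, degrees, degrees)
    return scale * block


def row_cosines(tensor):
    """Cosine similarity between the rows of the mode-1 unfolding."""
    unfolded = tensor.reshape(tensor.shape[0], -1)
    unit = unfolded / np.linalg.norm(unfolded, axis=1, keepdims=True)
    return unit @ unit.T


rng = np.random.default_rng(0)
labels = np.array([0, 0, 0, 1, 1, 1])               # two clusters of three nodes
degrees = rng.uniform(0.5, 2.0, size=labels.size)   # heterogeneous positive degrees
core = np.array([[[3.0, 1.0], [1.0, 1.0]],
                 [[1.0, 1.0], [1.0, 3.0]]])          # illustrative block means

mean = dtbm_mean(core, labels, degrees)
cos = row_cosines(mean)
same = labels[:, None] == labels[None, :]

print("min cosine within clusters :", cos[same].min())
print("max cosine between clusters:", cos[~same].max())
```

Row norms in this toy tensor vary with the degrees, so clustering on magnitudes would be misleading, whereas the row directions depend only on cluster membership. Adding noise to `mean` would perturb the angles, which is where the signal-to-noise regimes described in the abstract come into play.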
Related papers
- The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z)
- Fundamental limits of community detection from multi-view data: multi-layer, dynamic and partially labeled block models [7.778975741303385]
We study community detection in multi-view data in modern network analysis.
We characterize the mutual information between the data and the latent parameters.
We introduce iterative algorithms based on Approximate Message Passing for community detection.
arXiv Detail & Related papers (2024-01-16T07:13:32Z)
- Distributionally Robust Model-based Reinforcement Learning with Large State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are the complex dynamical systems with large state spaces, the costly data acquisition processes, and the deviation of real-world dynamics from the training environment in deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z)
- Multilayer Multiset Neuronal Networks -- MMNNs [55.2480439325792]
The present work describes multilayer multiset neuronal networks incorporating two or more layers of coincidence similarity neurons.
The work also explores the utilization of counter-prototype points, which are assigned to the image regions to be avoided.
arXiv Detail & Related papers (2023-08-28T12:55:13Z)
- Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering method based on semi-non-negative tensor factorization (Semi-NTF).
Our model directly considers the between-view relationship and exploits the between-view complementary information.
In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z)
- Scalable Hierarchical Over-the-Air Federated Learning [3.8798345704175534]
This work introduces a new two-level learning method designed to handle both interference and device data heterogeneity.
We present a comprehensive mathematical approach to derive the convergence bound for the proposed algorithm.
Despite the interference and data heterogeneity, the proposed algorithm achieves high learning accuracy for a variety of parameters.
arXiv Detail & Related papers (2022-11-29T12:46:37Z)
- Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z)
- Linear Connectivity Reveals Generalization Strategies [54.947772002394736]
Some pairs of finetuned models have large barriers of increasing loss on the linear paths between them.
We find distinct clusters of models which are linearly connected on the test loss surface, but are disconnected from models outside the cluster.
Our work demonstrates how the geometry of the loss surface can guide models towards different functions.
arXiv Detail & Related papers (2022-05-24T23:43:02Z)
- Semi-Supervised Clustering of Sparse Graphs: Crossing the Information-Theoretic Threshold [3.6052935394000234]
The block model is a canonical random graph model for clustering and community detection on network-structured data.
No estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold.
We prove that with an arbitrary fraction of the labels revealed, detection is feasible throughout the parameter domain.
arXiv Detail & Related papers (2022-05-24T00:03:25Z)
- Scalable Regularised Joint Mixture Models [2.0686407686198263]
In many applications, data can be heterogeneous in the sense of spanning latent groups with different underlying distributions.
We propose an approach for heterogeneous data that allows joint learning of (i) explicit multivariate feature distributions, (ii) high-dimensional regression models and (iii) latent group labels.
The approach is demonstrably effective in high dimensions, combining data reduction for computational efficiency with a re-weighting scheme that retains key signals even when the number of features is large.
arXiv Detail & Related papers (2022-05-03T13:38:58Z)
- Exact Clustering in Tensor Block Model: Statistical Optimality and Computational Limit [10.8145995157397]
High-order clustering aims to identify heterogeneous substructures in multiway datasets.
The non-convex and discontinuous nature of the problem poses significant challenges in both statistics and computation.
arXiv Detail & Related papers (2020-12-18T00:48:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.