Latent class analysis for multi-layer categorical data
- URL: http://arxiv.org/abs/2408.05535v1
- Date: Sat, 10 Aug 2024 12:31:31 GMT
- Title: Latent class analysis for multi-layer categorical data
- Authors: Huan Qing
- Abstract summary: This paper considers a more general case, multi-layer categorical data with polytomous responses.
We present a novel statistical model, the multi-layer latent class model (multi-layer LCM).
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional categorical data, often collected in psychological tests and educational assessments, are typically single-layer and gathered only once. This paper considers a more general case, multi-layer categorical data with polytomous responses. To model such data, we present a novel statistical model, the multi-layer latent class model (multi-layer LCM). This model assumes that all layers share common subjects and items. To discover subjects' latent classes and other model parameters under this model, we develop three efficient spectral methods based on the sum of response matrices, the sum of Gram matrices, and the debiased sum of Gram matrices, respectively. Within the framework of multi-layer LCM, we demonstrate the estimation consistency of these methods under mild conditions regarding data sparsity. Our theoretical findings reveal two key insights: (1) increasing the number of layers can enhance the performance of the proposed methods, highlighting the advantages of considering multiple layers in latent class analysis; (2) the algorithm based on the debiased sum of Gram matrices usually performs best. Additionally, we propose an approach that combines the averaged modularity metric with our methods to determine the number of latent classes. Extensive experiments are conducted to support our theoretical results and show the effectiveness of our methods in learning latent classes and estimating the number of latent classes in multi-layer categorical data with polytomous responses.
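The three aggregation strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not give the algorithmic details, so everything below is an assumption made for illustration: each layer is taken to be an N x J response matrix (subjects by items), the debiasing step is assumed to zero the diagonal of the summed Gram matrices, the embedding uses the leading K left singular vectors, and a plain k-means with a deterministic farthest-point initialization recovers the classes. The function names `aggregate_layers`, `spectral_latent_classes`, and `_kmeans` are hypothetical and are not taken from the paper.

```python
import numpy as np


def aggregate_layers(responses, method="debiased_gram"):
    """Combine a list of N x J response matrices into one matrix.

    Assumed forms (not confirmed by the abstract):
      "sum"           -> sum of response matrices (N x J)
      "gram"          -> sum of Gram matrices R R^T (N x N)
      "debiased_gram" -> summed Gram matrices with the diagonal set to zero
    """
    layers = [np.asarray(R, dtype=float) for R in responses]
    if method == "sum":
        return sum(layers)
    gram = sum(R @ R.T for R in layers)
    if method == "gram":
        return gram
    if method == "debiased_gram":
        debiased = gram.copy()
        np.fill_diagonal(debiased, 0.0)  # assumed bias correction
        return debiased
    raise ValueError(f"unknown method: {method}")


def _kmeans(X, K, n_iter=100):
    """Plain Lloyd iterations with a deterministic farthest-point start."""
    centers = [X[0]]
    for _ in range(1, K):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(K)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def spectral_latent_classes(responses, K, method="debiased_gram"):
    """Estimate one latent class label per subject from multi-layer data."""
    aggregated = aggregate_layers(responses, method)
    # Leading K left singular vectors serve as the spectral embedding.
    U, _, _ = np.linalg.svd(aggregated, full_matrices=False)
    return _kmeans(U[:, :K], K)
```

Calling `spectral_latent_classes(layers, K=2, method="sum")` on a list of equally shaped response matrices returns an integer label per subject; the labels are only identified up to a permutation of the class indices.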
Related papers
- Generalized Grade-of-Membership Estimation for High-dimensional Locally Dependent Data [6.626575011678484]
Mixed membership models are widely used for analyzing survey responses and population genetics data.
Existing approaches, such as Bayesian MCMC inference, are not scalable and lack theoretical guarantees in high-dimensional settings.
We introduce a novel and simple approach that flattens the three-way quasi-tensor into a "fat" matrix and then performs a singular value decomposition of it to estimate parameters.
arXiv Detail & Related papers (2024-12-27T18:51:15Z) - Multi-layer matrix factorization for cancer subtyping using full and partial multi-omics dataset [3.110068567404913]
This paper introduces Multi-Layer Matrix Factorization (MLMF), a novel approach for cancer subtyping that employs multi-omics data clustering.
Experiments conducted on 10 multi-omics cancer datasets, both complete and with missing values, demonstrate that MLMF achieves results comparable to or surpassing those of several state-of-the-art approaches.
arXiv Detail & Related papers (2024-11-18T20:58:11Z) - Ensemble Methods for Sequence Classification with Hidden Markov Models [8.241486511994202]
We present a lightweight approach to sequence classification using Ensemble Methods for Hidden Markov Models (HMMs).
HMMs offer significant advantages in scenarios with imbalanced or smaller datasets due to their simplicity, interpretability, and efficiency.
Our ensemble-based scoring method enables the comparison of sequences of any length and improves performance on imbalanced datasets.
arXiv Detail & Related papers (2024-09-11T20:59:32Z) - Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with a Hessian matrix evaluated on a deep learning model.
Our method captures intricate patterns and relationships, enhancing classification performance.
arXiv Detail & Related papers (2024-02-14T16:10:42Z) - Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z) - Latent class analysis by regularized spectral clustering [0.0]
We propose two new algorithms to estimate a latent class model for categorical data.
Our algorithms are developed by using a newly defined regularized Laplacian matrix calculated from the response matrix.
We further apply our algorithms to real-world categorical data with promising results.
arXiv Detail & Related papers (2023-10-28T15:09:08Z) - Learning Hierarchical Features with Joint Latent Space Energy-Based Prior [44.4434704520236]
We study the fundamental problem of multi-layer generator models in learning hierarchical representations.
We propose a joint latent space EBM prior model with multi-layer latent variables for effective hierarchical representation learning.
arXiv Detail & Related papers (2023-10-14T15:44:14Z) - Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
arXiv Detail & Related papers (2022-11-03T08:18:27Z) - On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z) - Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable models (GLLVMs) generalize such factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z) - Unsupervised Multi-view Clustering by Squeezing Hybrid Knowledge from Cross View and Each View [68.88732535086338]
This paper proposes a new multi-view clustering method, low-rank subspace multi-view clustering based on adaptive graph regularization.
Experimental results for five widely used multi-view benchmarks show that our proposed algorithm surpasses other state-of-the-art methods by a clear margin.
arXiv Detail & Related papers (2020-08-23T08:25:06Z)