WGoM: A novel model for categorical data with weighted responses
- URL: http://arxiv.org/abs/2310.10989v1
- Date: Tue, 17 Oct 2023 04:23:31 GMT
- Title: WGoM: A novel model for categorical data with weighted responses
- Authors: Huan Qing
- Abstract summary: We introduce a novel model named the Weighted Grade of Membership (WGoM) model.
Compared with GoM, our WGoM relaxes GoM's distribution constraint on the generation of a response matrix.
We then propose an algorithm to estimate the latent mixed memberships and the other WGoM parameters.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Grade of Membership (GoM) model is a powerful tool for inferring latent
classes in categorical data, which enables subjects to belong to multiple
latent classes. However, its application is limited to categorical data with
nonnegative integer responses, making it inappropriate for datasets with
continuous or negative responses. To address this limitation, this paper
proposes a novel model named the Weighted Grade of Membership (WGoM) model.
Compared with GoM, our WGoM relaxes GoM's distribution constraint on the
generation of a response matrix and it is more general than GoM. We then
propose an algorithm to estimate the latent mixed memberships and the other
WGoM parameters. We derive the error bounds of the estimated parameters and
show that the algorithm is statistically consistent. The algorithmic
performance is validated in both synthetic and real-world datasets. The results
demonstrate that our algorithm is accurate and efficient, indicating its high
potential for practical applications. This paper makes a valuable contribution
to the literature by introducing a novel model that extends the applicability
of the GoM model and provides a more flexible framework for analyzing
categorical data with weighted responses.
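The abstract does not state the model equations, so the following is only a minimal sketch of the kind of data it describes: subjects with mixed memberships across latent classes, and responses that may be continuous or negative. It assumes, for illustration only, that each expected response is the membership-weighted average of per-class item parameters and that noise is Gaussian. The function name, parameter ranges, and noise choice are hypothetical and not taken from the paper.

```python
import random


def simulate_weighted_responses(n_subjects, n_items, n_classes,
                                noise_sd=1.0, seed=0):
    """Hypothetical sketch: mixed-membership data with real-valued responses."""
    rng = random.Random(seed)

    # Mixed memberships: each subject gets nonnegative weights over the
    # latent classes that sum to 1 (normalized exponential draws).
    memberships = []
    for _ in range(n_subjects):
        raw = [rng.expovariate(1.0) for _ in range(n_classes)]
        total = sum(raw)
        memberships.append([r / total for r in raw])

    # Item parameters: one real value per (item, class); negatives allowed.
    # The range [-2, 2] is an arbitrary illustrative assumption.
    item_params = [[rng.uniform(-2.0, 2.0) for _ in range(n_classes)]
                   for _ in range(n_items)]

    # Assumed structure: expected response of subject i on item j is the
    # membership-weighted average of that item's per-class parameters.
    expected = [[sum(memberships[i][k] * item_params[j][k]
                     for k in range(n_classes))
                 for j in range(n_items)]
                for i in range(n_subjects)]

    # Observed responses: Gaussian noise around the expectation. Any other
    # distribution with the same mean could be substituted here.
    responses = [[rng.gauss(expected[i][j], noise_sd)
                  for j in range(n_items)]
                 for i in range(n_subjects)]
    return memberships, item_params, expected, responses


if __name__ == "__main__":
    pi, theta, mean, r = simulate_weighted_responses(5, 4, 3)
    print("memberships of subject 0:", [round(x, 3) for x in pi[0]])
    print("responses of subject 0:  ", [round(x, 3) for x in r[0]])
```

Because the Gaussian draw can be swapped for another distribution with the same expectation, the sketch shows how relaxing the distributional constraint lets responses fall outside the nonnegative integers.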
Related papers
- Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM) as the smallest probability of disagreement of the predicted label.
arXiv Detail & Related papers (2024-01-18T08:12:23Z)
- Minimally Supervised Learning using Topological Projections in Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data, then assigns a minimal number of available labeled data points to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z)
- Latent class analysis with weighted responses [0.0]
We propose a novel generative model, the weighted latent class model (WLCM).
Our model allows data's response matrix to be generated from an arbitrary distribution with a latent class structure.
We investigate the identifiability of the model and propose an efficient algorithm for estimating the latent classes and other model parameters.
arXiv Detail & Related papers (2023-10-17T04:16:20Z)
- Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z)
- Evaluating Representations with Readout Model Switching [18.475866691786695]
In this paper, we propose to use the Minimum Description Length (MDL) principle to devise an evaluation metric.
We design a hybrid discrete and continuous-valued model space for the readout models and employ a switching strategy to combine their predictions.
The proposed metric can be efficiently computed with an online method and we present results for pre-trained vision encoders of various architectures.
arXiv Detail & Related papers (2023-02-19T14:08:01Z)
- Finding Materialized Models for Model Reuse [20.97918143614477]
Materialized model query aims to find the most appropriate materialized model as the initial model for model reuse.
We present MMQ, a source-data free, general, efficient, and effective materialized model query framework.
Experiments on a range of practical model reuse workloads demonstrate the effectiveness and efficiency of MMQ.
arXiv Detail & Related papers (2021-10-13T06:55:44Z)
- Nonparametric Functional Analysis of Generalized Linear Models Under Nonlinear Constraints [0.0]
This article introduces a novel nonparametric methodology for Generalized Linear Models.
It combines the strengths of the binary regression and latent variable formulations for categorical data.
It extends recently published parametric versions of the methodology and generalizes it.
arXiv Detail & Related papers (2021-10-11T04:49:59Z)
- Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- PermuteAttack: Counterfactual Explanation of Machine Learning Credit Scorecards [0.0]
This paper is a note on new directions and methodologies for validation and explanation of Machine Learning (ML) models employed for retail credit scoring in finance.
Our proposed framework draws motivation from the field of Artificial Intelligence (AI) security and adversarial ML.
arXiv Detail & Related papers (2020-08-24T00:05:13Z)