A Generalized Latent Factor Model Approach to Mixed-data Matrix
Completion with Entrywise Consistency
- URL: http://arxiv.org/abs/2211.09272v1
- Date: Thu, 17 Nov 2022 00:24:47 GMT
- Authors: Yunxiao Chen, Xiaoou Li
- Abstract summary: Matrix completion is a class of machine learning methods that concerns the prediction of missing entries in a partially observed matrix.
We formulate it as a low-rank matrix estimation problem under a general family of non-linear factor models.
We propose entrywise consistent estimators for estimating the low-rank matrix.
- Score: 3.299672391663527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Matrix completion is a class of machine learning methods that concerns the
prediction of missing entries in a partially observed matrix. This paper
studies matrix completion for mixed data, i.e., data involving mixed types of
variables (e.g., continuous, binary, ordinal). We formulate it as a low-rank
matrix estimation problem under a general family of non-linear factor models
and then propose entrywise consistent estimators for estimating the low-rank
matrix. Tight probabilistic error bounds are derived for the proposed
estimators. The proposed methods are evaluated by simulation studies and
real-data applications for collaborative filtering and large-scale educational
assessment.
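The low-rank formulation described in the abstract can be illustrated with a small sketch. This is not the authors' estimator: it is a plain gradient-descent fit of a rank-r matrix M = U V^T to the observed entries, assuming a unit-variance Gaussian model for continuous columns and a Bernoulli model with a logit link for binary columns. The function names (`complete_mixed`, `neg_log_lik`), the step-size scaling, and all default parameters are hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * x))


def neg_log_lik(M, Y, mask, is_binary):
    """Negative log-likelihood (up to constants) over observed entries.

    Continuous columns: unit-variance Gaussian with mean M.
    Binary columns: Bernoulli with success probability sigmoid(M).
    """
    gauss = 0.5 * (Y - M) ** 2
    bern = np.logaddexp(0.0, M) - Y * M
    per_entry = np.where(is_binary[None, :], bern, gauss)
    return float(np.sum(per_entry[mask]))


def complete_mixed(Y, mask, is_binary, rank=2, lr=0.05, n_iter=2000, seed=0):
    """Illustrative low-rank fit M = U @ V.T (assumed model, not the paper's method).

    Y         : (n, p) array; values where mask is False are ignored
    mask      : (n, p) boolean array, True where the entry is observed
    is_binary : (p,) boolean array, True for binary columns
    """
    rng = np.random.default_rng(seed)
    n, p = Y.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((p, rank))
    Y0 = np.where(mask, Y, 0.0)  # neutralise unobserved entries (e.g. NaN)
    for _ in range(n_iter):
        M = U @ V.T
        # Model mean: sigmoid(M) for binary columns, M for continuous ones.
        mean = np.where(is_binary[None, :], sigmoid(M), M)
        # Gradient of the negative log-likelihood with respect to M,
        # set to zero at unobserved entries.
        R = np.where(mask, mean - Y0, 0.0)
        # Chain rule through M = U @ V.T; the divisions rescale the step.
        grad_U = R @ V / p
        grad_V = R.T @ U / n
        U -= lr * grad_U
        V -= lr * grad_V
    return U @ V.T
```

The returned matrix holds the fitted natural parameters: entries in continuous columns are predicted values, and applying `sigmoid` to entries in binary columns gives predicted probabilities for the missing cells. Ordinal variables, which the abstract also mentions, are not handled by this sketch.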
Related papers
- Statistical Inference For Noisy Matrix Completion Incorporating Auxiliary Information [3.9748528039819977]
This paper investigates statistical inference for noisy matrix completion in a semi-supervised model.
We apply an iterative least squares (LS) estimation approach in our considered context.
We show that our method only needs a few iterations, and the resulting entry-wise estimators of the low-rank matrix and the coefficient matrix are guaranteed to have normal distributions.
arXiv Detail & Related papers (2024-03-22T01:06:36Z)
- Mixed Matrix Completion in Complex Survey Sampling under Heterogeneous Missingness [6.278498348219109]
We propose a fast and scalable estimation algorithm that achieves sublinear convergence.
The proposed method is applied to analyze the National Health and Nutrition Examination Survey data.
arXiv Detail & Related papers (2024-02-06T12:26:58Z)
- Adaptive Estimation of Graphical Models under Total Positivity [13.47131471222723]
We consider the problem of estimating (diagonally dominant) M-matrices as precision matrices in Gaussian graphical models.
We propose an adaptive multiple-stage estimation method that refines the estimate.
We develop a unified framework based on the gradient projection method to solve the regularized problem.
arXiv Detail & Related papers (2022-10-27T14:21:27Z)
- Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Adversarially-Trained Nonnegative Matrix Factorization [77.34726150561087]
We consider an adversarially-trained version of the nonnegative matrix factorization.
In our formulation, an attacker adds an arbitrary matrix of bounded norm to the given data matrix.
We design efficient algorithms inspired by adversarial training to optimize for dictionary and coefficient matrices.
arXiv Detail & Related papers (2021-04-10T13:13:17Z)
- Learning Mixtures of Low-Rank Models [89.39877968115833]
We study the problem of learning mixtures of low-rank models.
We develop an algorithm that is guaranteed to recover the unknown matrices with near-optimal sample complexity.
In addition, the proposed algorithm is provably stable against random noise.
arXiv Detail & Related papers (2020-09-23T17:53:48Z)
- Robust Low-rank Matrix Completion via an Alternating Manifold Proximal Gradient Continuation Method [47.80060761046752]
Robust low-rank matrix completion (RMC) has been studied extensively for computer vision, signal processing and machine learning applications.
This problem aims to decompose a partially observed matrix into the superposition of a low-rank matrix and a sparse matrix, where the sparse matrix captures the grossly corrupted entries of the matrix.
A widely used approach to tackle RMC is to consider a convex formulation, which minimizes the nuclear norm of the low-rank matrix (to promote low-rankness) and the l1 norm of the sparse matrix (to promote sparsity).
In this paper, motivated by some recent works on low-
arXiv Detail & Related papers (2020-08-18T04:46:22Z)
- Median Matrix Completion: from Embarrassment to Optimality [16.667260586938234]
We consider matrix completion with absolute deviation loss and obtain an estimator of the median matrix.
Despite several appealing properties of the median, the non-smooth absolute deviation loss leads to a computational challenge.
We propose a novel refinement step, which turns such inefficient estimators into a (near-)rate-optimal matrix completion procedure.
arXiv Detail & Related papers (2020-06-18T10:01:22Z)
- Robust Matrix Completion with Mixed Data Types [0.0]
We consider the problem of recovering a structured low rank matrix with partially observed entries with mixed data types.
Most approaches assume that there is only one underlying distribution and the low rank constraint is regularized by the matrix Schatten Norm.
We propose a computationally feasible statistical approach with strong recovery guarantees along with an algorithmic framework suited for parallelization to recover a low rank matrix with partially observed entries for mixed data types in one step.
arXiv Detail & Related papers (2020-05-25T21:35:10Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)