Uncertainty Quantification For Low-Rank Matrix Completion With
Heterogeneous and Sub-Exponential Noise
- URL: http://arxiv.org/abs/2110.12046v1
- Date: Fri, 22 Oct 2021 20:25:07 GMT
- Title: Uncertainty Quantification For Low-Rank Matrix Completion With
Heterogeneous and Sub-Exponential Noise
- Authors: Vivek F. Farias, Andrew A. Li, Tianyi Peng
- Abstract summary: The problem of low-rank matrix completion with heterogeneous and sub-exponential noise is particularly relevant to a number of applications in modern commerce.
Examples include panel sales data and data collected from web-commerce systems such as recommendation engines.
Here we characterize the distribution of estimated matrix entries when the observation noise is heterogeneous and sub-exponential and provide, as an application, explicit formulas for this distribution when observed entries are Poisson or binary distributed.
- Score: 2.793095554369281
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of low-rank matrix completion with heterogeneous and
sub-exponential (as opposed to homogeneous and Gaussian) noise is particularly
relevant to a number of applications in modern commerce. Examples include panel
sales data and data collected from web-commerce systems such as recommendation
engines. An important unresolved question for this problem is characterizing
the distribution of estimated matrix entries under common low-rank estimators.
Such a characterization is essential to any application that requires
quantification of uncertainty in these estimates and has heretofore only been
available under the assumption of homogeneous Gaussian noise. Here we
characterize the distribution of estimated matrix entries when the observation
noise is heterogeneous sub-exponential and provide, as an application, explicit
formulas for this distribution when observed entries are Poisson or Binary
distributed.
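To make the setting concrete, the following is a minimal toy sketch of entrywise estimation under heterogeneous Poisson noise: a rank-1 matrix of Poisson rates observed entrywise with probability p, estimated by inverse-propensity fill plus truncated SVD, with a plug-in heteroscedastic variance proxy. All names and parameters here are illustrative assumptions; this is not the paper's exact estimator or its distributional formulas.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: rank-1 matrix of Poisson rates, each entry
# observed independently with probability p. Poisson noise is heterogeneous:
# Var(Y_ij) = M_ij varies across entries.
n, r, p = 200, 1, 0.3
u = rng.uniform(2.0, 4.0, size=(n, 1))
v = rng.uniform(2.0, 4.0, size=(n, 1))
M = u @ v.T                                  # true rates

mask = rng.random((n, n)) < p                # Bernoulli(p) sampling pattern
Y = np.where(mask, rng.poisson(M), 0.0)      # observed counts, zeros elsewhere

# Inverse-propensity fill + rank-r truncated SVD (a standard spectral step).
U, s, Vt = np.linalg.svd(Y / p, full_matrices=False)
M_hat = (U[:, :r] * s[:r]) @ Vt[:r]

# Plug-in entrywise variance proxy for Poisson observations: Var ≈ M_hat / p.
# Each entry gets its own scale, reflecting the heteroscedastic noise.
se = np.sqrt(np.maximum(M_hat, 0.0) / p)

rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```

A rigorous entrywise confidence interval would require the kind of distributional characterization the paper develops; the plug-in `se` above only conveys the heteroscedastic scaling.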
Related papers
- Distributional Matrix Completion via Nearest Neighbors in the Wasserstein Space [8.971989179518216]
Given a sparsely observed matrix of empirical distributions, we seek to impute the true distributions associated with both observed and unobserved matrix entries.
We utilize tools from optimal transport to generalize the nearest neighbors method to the distributional setting.
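In one dimension, the Wasserstein-2 distance between equal-size empirical distributions reduces to matching sorted samples, and the barycenter of equally weighted distributions is the average of their quantile (sorted-sample) vectors. A minimal sketch of these two primitives (illustrative only, not the paper's full imputation method; function names are assumptions):

```python
import numpy as np

def w2_1d(x, y):
    # Wasserstein-2 distance between two 1-D empirical distributions with
    # equally many samples: the optimal coupling matches sorted samples.
    xs, ys = np.sort(x), np.sort(y)
    return float(np.sqrt(np.mean((xs - ys) ** 2)))

def w2_barycenter_1d(samples):
    # 1-D Wasserstein barycenter of equally weighted empirical
    # distributions: average the sorted-sample (quantile) vectors.
    return np.mean([np.sort(s) for s in samples], axis=0)
```

A "nearest neighbors in Wasserstein space" imputation would then average, via the barycenter, the distributions of the rows or columns closest under `w2_1d`.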
arXiv Detail & Related papers (2024-10-17T00:50:17Z)
- Negative Binomial Matrix Completion [5.5415918072761805]
Matrix completion focuses on recovering missing or incomplete information in matrices.
We introduce NB matrix completion by proposing a nuclear-norm regularized model that can be solved by proximal gradient descent.
In our experiments, we demonstrate that the NB model outperforms Poisson matrix completion in various noise and missing data settings on real data.
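Nuclear-norm regularized completion via proximal gradient descent has a closed-form prox step: singular value thresholding. The sketch below uses a squared-error data fit for brevity; the paper's model would replace it with the negative binomial log-likelihood. Function names and parameter values are illustrative assumptions.

```python
import numpy as np

def svt(X, tau):
    # Singular value thresholding: the prox operator of tau * nuclear norm.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete(Y, mask, lam=0.05, step=1.0, iters=1000):
    # Proximal gradient on 0.5 * ||mask * (X - Y)||_F^2 + lam * ||X||_*.
    # (Squared loss stands in for the NB log-likelihood; the masked-loss
    # gradient has Lipschitz constant 1, so step=1.0 is a valid step size.)
    X = np.zeros_like(Y, dtype=float)
    for _ in range(iters):
        X = svt(X - step * (mask * (X - Y)), step * lam)
    return X
```

On a small rank-1 matrix with a few entries held out, this recovers the missing entries to high accuracy because the nuclear norm favors the underlying low-rank structure.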
arXiv Detail & Related papers (2024-08-28T19:43:48Z)
- Intrinsic Bayesian Cramér-Rao Bound with an Application to Covariance Matrix Estimation [49.67011673289242]
This paper presents a new performance bound for estimation problems where the parameter to estimate lies in a smooth manifold.
The bound induces a geometry on the parameter manifold, together with an intrinsic notion of the estimation error measure.
arXiv Detail & Related papers (2023-11-08T15:17:13Z)
- Learning Linear Causal Representations from Interventions under General Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
arXiv Detail & Related papers (2023-06-04T02:32:12Z)
- Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
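The NCE idea can be illustrated on a toy problem: fit the log-normalizer of an unnormalized 1-D Gaussian model by logistic discrimination of data samples against samples from a known noise distribution. This is a minimal sketch under assumed toy settings, not the paper's method; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data ~ N(0, 1); noise ~ Uniform(-4, 4) with known density 1/8.
x = rng.normal(size=2000)
y = rng.uniform(-4.0, 4.0, size=2000)

def log_ratio(z, c):
    # log p_model(z) - log p_noise(z) for the unnormalized model
    # p_model(z) = exp(-z^2 / 2 + c), where c is the log-normalizer.
    return -0.5 * z**2 + c - np.log(1.0 / 8.0)

def nce_loss(c):
    # Logistic loss for classifying data (label 1) vs noise (label 0).
    return (np.mean(np.log1p(np.exp(-log_ratio(x, c))))
            + np.mean(np.log1p(np.exp(log_ratio(y, c)))))

# Minimize over c by grid search; the NCE optimum should land near the
# true log-normalizer log(1/sqrt(2*pi)) ≈ -0.919.
grid = np.linspace(-2.0, 0.0, 401)
c_hat = grid[np.argmin([nce_loss(c) for c in grid])]
```

The choice of noise distribution (uniform here) affects the estimator's efficiency, which is exactly the question these two related papers study.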
arXiv Detail & Related papers (2023-01-23T19:57:58Z)
- Sparse Nonnegative Tucker Decomposition and Completion under Noisy Observations [22.928734507082574]
We propose a sparse nonnegative Tucker decomposition and completion method for the recovery of underlying nonnegative data under noisy observations.
Our theoretical guarantees improve on those of existing tensor-based and matrix-based methods.
arXiv Detail & Related papers (2022-08-17T13:29:14Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from this assumption can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
arXiv Detail & Related papers (2022-01-28T10:01:37Z)
- Spectral clustering under degree heterogeneity: a case for the random walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
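A random-walk-Laplacian embedding can be computed by diagonalizing the symmetric normalized Laplacian (to which L_rw = I - D^{-1}A is similar) and mapping the eigenvectors back. A minimal sketch, illustrative only and not the paper's full analysis:

```python
import numpy as np

def rw_spectral_embedding(A, k):
    # Embedding from the random walk Laplacian L_rw = I - D^{-1} A.
    # L_rw is similar to L_sym = I - D^{-1/2} A D^{-1/2}, so diagonalize
    # the symmetric matrix and map eigenvectors back via D^{-1/2}.
    d = A.sum(axis=1)
    L_sym = np.eye(len(A)) - A / np.sqrt(np.outer(d, d))
    w, V = np.linalg.eigh(L_sym)          # eigenvalues in ascending order
    return V[:, :k] / np.sqrt(d)[:, None]
```

On two triangles joined by a single edge, the second embedding coordinate separates the two communities by sign, which is the degree-corrected clustering behavior the paper describes.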
arXiv Detail & Related papers (2021-05-03T16:36:27Z)
- Robust Matrix Completion with Mixed Data Types [0.0]
We consider the problem of recovering a structured low-rank matrix from partially observed entries with mixed data types.
Most approaches assume a single underlying distribution and enforce the low-rank structure through matrix Schatten-norm regularization.
We propose a computationally feasible statistical approach with strong recovery guarantees, along with an algorithmic framework suited to parallelization, that recovers a low-rank matrix from partially observed entries of mixed data types in one step.
arXiv Detail & Related papers (2020-05-25T21:35:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.