Statistical Inference in Tensor Completion: Optimal Uncertainty Quantification and Statistical-to-Computational Gaps
- URL: http://arxiv.org/abs/2410.11225v2
- Date: Fri, 01 Nov 2024 15:51:56 GMT
- Title: Statistical Inference in Tensor Completion: Optimal Uncertainty Quantification and Statistical-to-Computational Gaps
- Authors: Wanteng Ma, Dong Xia
- Abstract summary: This paper presents a simple yet efficient method for statistical inference of tensor linear forms using incomplete and noisy observations.
It is suitable for various statistical inference tasks, including constructing confidence intervals, inference under heteroskedastic and sub-exponential noise, and simultaneous testing.
- Score: 7.174572371800217
- Abstract: This paper presents a simple yet efficient method for statistical inference of tensor linear forms using incomplete and noisy observations. Under the Tucker low-rank tensor model and the missing-at-random assumption, we utilize an appropriate initial estimate along with a debiasing technique followed by a one-step power iteration to construct an asymptotically normal test statistic. This method is suitable for various statistical inference tasks, including constructing confidence intervals, inference under heteroskedastic and sub-exponential noise, and simultaneous testing. We demonstrate that the estimator achieves the Cramér-Rao lower bound on Riemannian manifolds, indicating its optimality in uncertainty quantification. We comprehensively examine the statistical-to-computational gaps and investigate the impact of initialization on the minimal conditions regarding sample size and signal-to-noise ratio required for accurate inference. Our findings show that with independent initialization, statistically optimal sample sizes and signal-to-noise ratios are sufficient for accurate inference. Conversely, if only dependent initialization is available, computationally optimal sample sizes and signal-to-noise ratio conditions still guarantee asymptotic normality without the need for data-splitting. We present the phase transition between computational and statistical limits. Numerical simulation results align with the theoretical findings.
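As a hedged illustration of the debias-then-infer recipe, here is a minimal NumPy sketch for the matrix special case under uniform missing-at-random sampling. It is not the authors' code: the initial estimate is a truncated SVD of inverse-propensity-weighted data, the one-step power iteration is omitted, and all dimensions, names, and tuning values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a low-rank signal observed entrywise with probability p, plus noise.
# The matrix case stands in for the Tucker low-rank tensor.
d, r, p, sigma = 60, 2, 0.3, 0.1
T = rng.normal(size=(d, r)) @ rng.normal(size=(r, d))   # low-rank ground truth
mask = rng.random((d, d)) < p                           # missing-at-random pattern
Y = np.where(mask, T + sigma * rng.normal(size=(d, d)), 0.0)

# Initial estimate: truncated SVD of the inverse-propensity-weighted observations.
U, s, Vt = np.linalg.svd(Y / p, full_matrices=False)
T_init = (U[:, :r] * s[:r]) @ Vt[:r, :]

# Debiasing: add back the inverse-propensity-weighted residual on observed
# entries. The plug-in linear form <T_init, A> is biased; this correction
# restores asymptotic normality (the paper's one-step power iteration is
# omitted here).
T_deb = T_init + mask * (Y - T_init) / p

# Inference on the linear form <T, A> for a fixed direction A.
A = rng.normal(size=(d, d))
estimate = np.sum(T_deb * A)

# Plug-in variance of the correction term and a 95% normal confidence interval.
var_hat = np.sum(A**2 * mask * (Y - T_init) ** 2 / p**2)
hw = 1.96 * np.sqrt(var_hat)
print(f"estimate {estimate:.2f} +/- {hw:.2f}, truth {np.sum(T * A):.2f}")
```

Up to the initialization error, the inverse-propensity correction turns the linear form into a sum of independent terms, which is the source of the asymptotic normality the abstract describes; the plug-in variance used here is slightly conservative.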
Related papers
- Robust Estimation for Kernel Exponential Families with Smoothed Total Variation Distances [2.317910166616341]
In statistical inference, we commonly assume that samples are independent and identically distributed from a probability distribution.
In this paper, we explore the application of GAN-like estimators to a general class of statistical models.
arXiv Detail & Related papers (2024-10-28T05:50:47Z)
- Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization [29.24821214671497]
Training machine learning and statistical models often involves optimizing a data-driven risk criterion.
We propose a novel robust criterion by combining insights from Bayesian nonparametric (i.e., Dirichlet process) theory and a recent decision-theoretic model of smooth ambiguity-averse preferences.
For practical implementation, we propose and study tractable approximations of the criterion based on well-known Dirichlet process representations.
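As a hedged sketch of what one such tractable approximation can look like (my illustration, not the paper's construction): draw Dirichlet-weighted reweightings of the sample, the Bayesian-bootstrap representation of a Dirichlet process posterior with vanishing base measure, and aggregate the reweighted risks through a smooth ambiguity-averse certainty equivalent.

```python
import numpy as np

rng = np.random.default_rng(1)

def robust_criterion(losses, n_draws=500, ambiguity=5.0):
    """Illustrative ambiguity-averse risk: a soft worst case over
    Bayesian-bootstrap (Dirichlet-weighted) reweightings of the losses.
    `n_draws` and `ambiguity` are made-up tuning values."""
    n = len(losses)
    weights = rng.dirichlet(np.ones(n), size=n_draws)  # posterior reweightings
    risks = weights @ losses                           # one risk per draw
    # log-sum-exp certainty equivalent: interpolates between the mean risk
    # (ambiguity -> 0) and the worst draw (ambiguity -> infinity)
    return np.log(np.mean(np.exp(ambiguity * risks))) / ambiguity

losses = rng.exponential(size=200)     # per-sample losses of some fitted model
print("robust risk:", robust_criterion(losses), "empirical risk:", losses.mean())
```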
arXiv Detail & Related papers (2024-01-28T21:19:15Z)
- Communication-Efficient Distributed Estimation and Inference for Cox's Model [4.731404257629232]
We develop communication-efficient iterative distributed algorithms for estimation and inference in the high-dimensional sparse Cox proportional hazards model.
To construct confidence intervals for linear combinations of high-dimensional hazard regression coefficients, we introduce a novel debiased method.
We provide valid and powerful distributed hypothesis tests for any coordinate element based on a decorrelated score test.
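The decorrelated score construction is generic enough to sketch. Below is a hedged stand-alone version with a logistic likelihood in place of the Cox partial likelihood; the paper's high-dimensional and distributed machinery does not appear here, and all names and sizes are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Stand-in likelihood: logistic regression (the decorrelated-score recipe
# itself does not depend on this choice). Test H0: beta_0 = 0.
n, d = 800, 5
X = rng.normal(size=(n, d))
y = (rng.random(n) < sigmoid(X @ np.array([0.0, 1.0, -1.0, 0.5, 0.0]))).astype(float)

# Fit the nuisance coordinates with the tested coordinate held at its null value.
nll = lambda b: np.sum(np.logaddexp(0.0, X @ np.r_[0.0, b]) - y * (X @ np.r_[0.0, b]))
beta0 = np.r_[0.0, minimize(nll, np.zeros(d - 1)).x]

# Per-sample score and observed information at the null fit.
mu = sigmoid(X @ beta0)
scores = (y - mu)[:, None] * X                    # n x d score contributions
H = (X * (mu * (1.0 - mu))[:, None]).T @ X / n    # observed information

# Decorrelate coordinate 0 against the nuisance block, then studentize.
w = np.linalg.solve(H[1:, 1:], H[1:, 0])
dec = scores[:, 0] - scores[:, 1:] @ w
stat = np.sqrt(n) * dec.mean() / dec.std()        # ~ N(0,1) under the null
print("decorrelated score statistic:", stat)
```

Projecting out the nuisance directions is what makes the statistic insensitive to the estimation error in the high-dimensional nuisance fit.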
arXiv Detail & Related papers (2023-02-23T15:50:17Z)
- Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the finite-sample regime and in the asymptotic regime.
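In symbols, one way to state the connection (a paraphrase under a log-Sobolev assumption on the fitted distribution, not the paper's exact theorem): the score matching loss is the relative Fisher information, and a log-Sobolev inequality with constant C_LS turns it into a KL bound.

```latex
J(\hat p) = \mathbb{E}_{x \sim p}\!\left[\, \bigl\| \nabla \log \hat p(x) - \nabla \log p(x) \bigr\|^2 \,\right],
\qquad
\mathrm{KL}\bigl(p \,\|\, \hat p\bigr) \;\le\; \tfrac{1}{2}\, C_{\mathrm{LS}}(\hat p)\, J(\hat p).
```

Small isoperimetric (log-Sobolev) constants thus make the score matching objective statistically efficient; large constants are exactly where score matching loses efficiency relative to maximum likelihood.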
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous-time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
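The clipping idea fits in a few lines. A hedged sketch for the simplest case, streaming mean estimation under squared loss; the step size, clip level, and heavy-tailed test distribution are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)

def clipped_sgd_mean(stream, clip=10.0, step0=1.0):
    """Streaming mean estimation with clipped gradient descent.

    For the squared loss 0.5*(theta - x)^2 the stochastic gradient is
    theta - x; clipping it tames heavy-tailed samples."""
    theta = 0.0
    for t, x in enumerate(stream, start=1):
        g = theta - x                        # stochastic gradient
        g = np.clip(g, -clip, clip)          # clip to control heavy tails
        theta -= (step0 / t) * g             # 1/t steps: strongly convex loss
    return theta

# Heavy-tailed stream: Student-t with 2.5 degrees of freedom, true mean 1.0.
samples = 1.0 + rng.standard_t(df=2.5, size=20000)
print("clipped estimate:", clipped_sgd_mean(samples))
print("plain sample mean:", samples.mean())
```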
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
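A hedged toy run of the phenomenon in the first sentence (this demonstrates the anomaly, not the paper's online debiasing estimator): under greedy arm selection, the per-arm sample mean, which is the least-squares estimate in this design, is biased for arms the policy abandons after unlucky draws.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two arms with identical mean 0: greedy (adaptive) sampling biases the
# per-arm sample mean downward, so normal-theory intervals miscover.
def greedy_bandit_bias(T=50, reps=5000):
    errors = []
    for _ in range(reps):
        sums = rng.normal(size=2)               # one forced pull per arm
        counts = np.ones(2)
        for _ in range(T - 2):
            a = int(np.argmax(sums / counts))   # adaptive: pull the best-looking arm
            sums[a] += rng.normal()
            counts[a] += 1
        errors.append(sums[0] / counts[0])      # true mean of arm 0 is 0
    return np.mean(errors)

print("mean of the arm-0 estimate (truth 0):", greedy_bandit_bias())
```

Online debiasing corrects such estimates with predictably weighted residuals so that a martingale central limit theorem applies; the construction of those weights is the technical heart of the paper and is not reproduced here.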
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- Inference for Low-rank Tensors -- No Need to Debias [22.163281794187544]
In this paper, we consider statistical inference for several low-rank tensor models.
For the rank-one PCA model, we establish the theory for inference on each individual singular vector.
Finally, simulations are presented to corroborate our theoretical discoveries.
arXiv Detail & Related papers (2020-12-29T16:48:02Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
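The CNML scheme that ACNML amortizes can be written down directly for a small model. The sketch below is the brute-force, non-amortized version for binary logistic regression with made-up data; ACNML's contribution is replacing the per-label refit with an amortized approximation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y):
    nll = lambda b: np.sum(np.logaddexp(0.0, X @ b) - y * (X @ b))
    return minimize(nll, np.zeros(X.shape[1])).x

def cnml_probs(X, y, x_new):
    """Conditional NML for binary logistic regression: refit the model with
    each candidate label appended for x_new, take each refit's probability
    of its own label, then normalize across labels."""
    scores = []
    for label in (0.0, 1.0):
        b = fit_logreg(np.vstack([X, x_new]), np.append(y, label))
        p1 = sigmoid(x_new @ b)
        scores.append(p1 if label == 1.0 else 1.0 - p1)
    scores = np.array(scores)
    return scores / scores.sum()

X = rng.normal(size=(40, 2))
y = (rng.random(40) < sigmoid(X @ np.array([1.5, -1.0]))).astype(float)
x_ood = np.array([6.0, 6.0])                  # far from the training data
print("CNML probabilities for OOD input:", cnml_probs(X, y, x_ood))
```

Far from the training data, either label can be fit almost perfectly, so CNML returns close-to-uniform probabilities, which is the calibrated out-of-distribution behavior the entry refers to.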
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical objective based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)