Error Bound of Empirical $\ell_2$ Risk Minimization for Noisy Standard
and Generalized Phase Retrieval Problems
- URL: http://arxiv.org/abs/2205.13827v1
- Date: Fri, 27 May 2022 08:38:33 GMT
- Title: Error Bound of Empirical $\ell_2$ Risk Minimization for Noisy Standard
and Generalized Phase Retrieval Problems
- Authors: Junren Chen, Michael K. Ng
- Abstract summary: A noisy generalized phase retrieval (NGPR) problem refers to the problem of estimating $x_0 \in \mathbb{C}^d$ from noisy quadratic samples.
Under different kinds of noise patterns, we establish error bounds that can imply approximate reconstruction.
We show the bounds are $O\big(\frac{\|\eta\|}{\sqrt{n}}\big)$ and $O\big(\|\eta\|_\infty\sqrt{\frac{d}{n}}\big)$ for general noise.
- Score: 18.81683506516341
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A noisy generalized phase retrieval (NGPR) problem refers to a problem of
estimating $x_0 \in \mathbb{C}^d$ by noisy quadratic samples
$\big\{x_0^*A_kx_0+\eta_k\big\}_{k=1}^n$ where $A_k$ is a Hermitian matrix and
$\eta_k$ is a noise scalar. When $A_k=\alpha_k\alpha_k^*$ for some
$\alpha_k\in\mathbb{C}^d$, it reduces to a standard noisy phase retrieval (NPR)
problem. The main aim of this paper is to study the estimation performance of
empirical $\ell_2$ risk minimization in both problems when $A_k$ in NGPR, or
$\alpha_k$ in NPR, is drawn from sub-Gaussian distribution. Under different
kinds of noise patterns, we establish error bounds that can imply approximate
reconstruction and these results are new in the literature. In NGPR, we show
the bounds are of $O\big(\frac{||\eta||}{\sqrt{n}}\big)$ and
$O\big(||\eta||_\infty \sqrt{\frac{d}{n}}\big)$ for general noise, and of
$O\big(\sqrt{\frac{d\log n}{n}}\big)$ and $O\big(\sqrt{\frac{d(\log
n)^2}{n}}\big)$ for random noise with sub-Gaussian and sub-exponential tail
respectively, where $\| \eta \|$ and $\| \eta \|_{\infty}$ are the 2-norm and
sup-norm of the noise vector $\eta=(\eta_k)_{k=1}^n$. Under heavy-tailed noise, by
truncating response outliers we propose a robust estimator that possesses an
error bound with a slower convergence rate. On the other hand, we show that in NPR
the bounds are of $O\big(\sqrt{\frac{d\log n}{n}}\big)$ and
$O\big(\sqrt{\frac{d(\log n)^2}{n}}\big)$ for sub-Gaussian and sub-exponential
noise respectively, which is essentially tighter than the existing bound
$O\big(\frac{||\eta||_2}{\sqrt{n}}\big)$. Although NGPR involving measurement
matrix $A_k$ is more computationally demanding than NPR involving measurement
vector $\alpha_k$, our results reveal that NGPR exhibits stronger robustness
than NPR under biased and deterministic noise. Experimental results are
presented to confirm and demonstrate our theoretical findings.
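For readers who want a concrete picture of the estimator being analyzed, the following NumPy sketch sets up an NGPR instance and minimizes the empirical $\ell_2$ risk $\frac{1}{n}\sum_{k=1}^n (x^*A_kx - y_k)^2$, together with the response-outlier truncation mentioned above for heavy-tailed noise. This is a minimal illustration, not the authors' code: the solver (gradient descent with a crude backtracking step), the truncation threshold, and all function names are our own assumptions, and the paper's bounds concern the global minimizer of the risk rather than any particular algorithm.

```python
# Minimal sketch (not the authors' implementation) of empirical l2 risk
# minimization for NGPR: estimate x0 in C^d from y_k = x0^* A_k x0 + eta_k
# with Hermitian Gaussian A_k. All names and parameter choices are illustrative.
import numpy as np

def random_hermitian(d, rng):
    """Hermitian measurement matrix with (sub-)Gaussian entries."""
    G = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return (G + G.conj().T) / 2

def quad_samples(x, A):
    """Noiseless responses x^* A_k x (real because each A_k is Hermitian)."""
    return np.real(np.einsum('i,kij,j->k', x.conj(), A, x))

def empirical_l2_risk(x, A, y):
    """(1/n) sum_k (x^* A_k x - y_k)^2, the objective studied in the paper."""
    return np.mean((quad_samples(x, A) - y) ** 2)

def grad_l2_risk(x, A, y):
    """Wirtinger gradient of the risk: (2/n) sum_k (x^* A_k x - y_k) A_k x."""
    resid = quad_samples(x, A) - y
    return 2.0 * np.einsum('k,kij,j->i', resid, A, x) / len(y)

def estimate_ngpr(A, y, n_iter=500, seed=0):
    """Gradient descent with crude backtracking from a random start.
    Illustrative only; the error bounds concern the global risk minimizer."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[1]) + 1j * rng.standard_normal(A.shape[1])
    step, risk = 1e-2, empirical_l2_risk(x, A, y)
    for _ in range(n_iter):
        g = grad_l2_risk(x, A, y)
        while step > 1e-12:                       # shrink step until risk decreases
            x_new = x - step * g
            risk_new = empirical_l2_risk(x_new, A, y)
            if risk_new < risk:
                break
            step /= 2
        if step <= 1e-12:
            break
        x, risk = x_new, risk_new
        step *= 1.5                               # let the step grow back
    return x

def truncate_responses(y, quantile=0.95):
    """Robust variant for heavy-tailed noise: keep samples whose response
    magnitude is below a data-driven threshold (response-outlier truncation)."""
    return np.abs(y) <= np.quantile(np.abs(y), quantile)

# Usage: Gaussian-noise NGPR instance, error measured up to a global phase.
rng = np.random.default_rng(1)
d, n = 8, 400
x0 = rng.standard_normal(d) + 1j * rng.standard_normal(d)
A = np.stack([random_hermitian(d, rng) for _ in range(n)])
y = quad_samples(x0, A) + 0.1 * rng.standard_normal(n)
keep = truncate_responses(y)                      # optional robustification
x_hat = estimate_ngpr(A[keep], y[keep])
phase = np.exp(1j * np.angle(np.vdot(x_hat, x0)))  # align the global phase
print("relative error:", np.linalg.norm(phase * x_hat - x0) / np.linalg.norm(x0))
```

The NPR special case is obtained by setting each $A_k = \alpha_k\alpha_k^*$ for a random vector $\alpha_k$; the same risk and gradient expressions apply.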
Related papers
- On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm [54.28350823319057]
This paper establishes the convergence rate $\frac{1}{K}\sum_{k=1}^K E\left[\|\nabla f(x^k)\|_1\right] \leq O\big(\frac{\sqrt{d}C}{K^{1/4}}\big)$ for AdamW measured by the $\ell_1$ norm, where $K$ represents the iteration number, $d$ denotes the model dimension, and $C$ matches the constant in the optimal convergence rate of SGD.
arXiv Detail & Related papers (2025-05-17T05:02:52Z) - Heavy-Tailed Linear Bandits: Huber Regression with One-Pass Update [62.96781471194877]
Two principled strategies for handling heavy-tailed noise, truncation and median-of-means, have been introduced to heavy-tailed bandits (a generic median-of-means sketch is given after this list).
We propose a one-pass algorithm based on the online mirror descent framework.
arXiv Detail & Related papers (2025-03-01T09:41:45Z) - Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness [74.18546828528298]
We show that SignSGD with Majority Voting can robustly work on the whole range of problem difficulty, with high-probability complexity bounds involving factors of $\frac{\kappa}{\kappa-1}$.
arXiv Detail & Related papers (2025-02-11T19:54:11Z) - On robust recovery of signals from indirect observations [0.24578723416255752]
We show how "presumably good" estimates of the sort may be constructed in the situation where the signal set is a convex set.
We consider two "uncertainty setups" in which $cal N$ is either a convex bounded set or is the set of sparse vectors.
arXiv Detail & Related papers (2025-01-03T18:17:47Z) - Revisiting Step-Size Assumptions in Stochastic Approximation [1.3654846342364308]
It is shown for the first time that this assumption is not required for convergence, and finer results are obtained. Rates of convergence are obtained for the standard algorithm and for estimates obtained via the averaging technique of Polyak and Ruppert. Results from numerical experiments illustrate that $\beta_\theta$ may be large due to a combination of multiplicative noise and Markovian memory.
arXiv Detail & Related papers (2024-05-28T05:11:05Z) - On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm [54.28350823319057]
This paper considers RMSProp and its momentum extension and establishes the convergence rate $\frac{1}{T}\sum_{k=1}^T E\left[\|\nabla f(x^k)\|_1\right] \leq O\big(\frac{\sqrt{d}}{T^{1/4}}\big)$. Our convergence rate matches the lower bound with respect to all the coefficients except the dimension $d$. Our convergence rate can be considered to be analogous to the corresponding rate of SGD.
arXiv Detail & Related papers (2024-02-01T07:21:32Z) - Spectral Statistics of the Sample Covariance Matrix for High Dimensional
Linear Gaussians [12.524855369455421]
We study the performance of the ordinary least squares (OLS) method for the estimation of a high-dimensional stable state transition matrix.
The OLS estimator incurs a phase transition and becomes transient: increasing the number of samples only worsens the estimation error.
arXiv Detail & Related papers (2023-12-10T06:55:37Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Distribution-Independent Regression for Generalized Linear Models with
Oblivious Corruptions [49.69852011882769]
We give the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise.
We present an algorithm that tackles this problem in its most general distribution-independent setting.
This is the first algorithmic result for GLM regression with oblivious noise which can handle more than half the samples being arbitrarily corrupted.
arXiv Detail & Related papers (2023-09-20T21:41:59Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega\big(d^{1/2}/(\max\{p, \epsilon\})^2\big)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Asymptotically Optimal Pure Exploration for Infinite-Armed Bandits [4.811176167998627]
We study pure exploration with infinitely many bandit arms generated i.i.d. from an unknown distribution.
Our goal is to efficiently select a single high quality arm whose average reward is, with probability $1-\delta$, within $\varepsilon$ of being among the top $\eta$-fraction of arms.
arXiv Detail & Related papers (2023-06-03T04:00:47Z) - Sparse Recovery with Shuffled Labels: Statistical Limits and Practical
Estimators [23.313461266708877]
We reconstruct the permutation matrix $\boldsymbol{\Pi}^{\mathrm{true}}$ and the sparse signal $\boldsymbol{\beta}^{\mathrm{true}}$ from shuffled labels.
We show that our proposed estimator can obtain the ground-truth $(\boldsymbol{\Pi}^{\mathrm{true}}, \mathrm{supp}(\boldsymbol{\beta}^{\mathrm{true}}))$ under mild conditions.
arXiv Detail & Related papers (2023-03-20T16:14:58Z) - Faster Rates of Convergence to Stationary Points in Differentially
Private Optimization [31.46358820048179]
We study the problem of approximating stationary points of Lipschitz and smooth functions under $(\varepsilon,\delta)$-differential privacy (DP).
A point $\widehat{w}$ is called an $\alpha$-stationary point of a function $F:\mathbb{R}^d\rightarrow\mathbb{R}$ if $\|\nabla F(\widehat{w})\|\leq \alpha$.
We provide a new efficient algorithm that finds an approximate stationary point at a faster rate than prior work.
arXiv Detail & Related papers (2022-06-02T02:43:44Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound $O\big(d^{\max\{1,p/2\}}\log^2 n\big)$ for loss functions with at most degree $p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - On the Power of Localized Perceptron for Label-Optimal Learning of
Halfspaces with Adversarial Noise [4.8728183994912415]
We study online active learning of homogeneous halfspaces in $\mathbb{R}^d$ with adversarial noise.
Our main contribution is a Perceptron-like active learning algorithm that runs in polynomial time.
Our algorithm learns the underlying halfspace with near-optimal label complexity.
arXiv Detail & Related papers (2020-12-19T22:04:10Z) - On the robustness of the minimum $\ell_2$ interpolator [2.918940961856197]
We analyse the interpolator with minimal $\ell_2$-norm $\hat{\beta}$ in a general high dimensional linear regression framework.
We prove that, with high probability, the prediction loss of this estimator is bounded from above by $\big(\|\beta^*\|^2 r_{cn}(\Sigma) \vee \|\xi\|^2\big)/n$, where $r_k(\Sigma) = \sum_{i\geq k}\lambda_i(\Sigma)$ are the rests of the sum of the eigenvalues of $\Sigma$.
arXiv Detail & Related papers (2020-03-12T15:12:28Z) - Tight Regret Bounds for Noisy Optimization of a Brownian Motion [118.65407541895851]
We consider the problem of Bayesian optimization of a one-dimensional Brownian motion in which the $T$ adaptively chosen observations are corrupted by Gaussian noise.
We show that the smallest possible expected cumulative regret and the smallest possible expected simple regret scale as $\Omega(\sigma\sqrt{T \cdot \log T})$.
arXiv Detail & Related papers (2020-01-25T14:44:53Z)
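As referenced from the heavy-tailed bandits entry above, which names truncation and median-of-means as the two principled strategies for heavy-tailed noise, here is a generic median-of-means mean estimator in NumPy. It is only an illustration of that classical strategy under our own naming; it is not the one-pass mirror-descent algorithm proposed in that paper.

```python
# Generic median-of-means estimator of a mean from heavy-tailed samples.
# Illustrates the classical strategy named above; not the cited paper's algorithm.
import numpy as np

def median_of_means(samples, n_blocks=8, seed=0):
    """Shuffle the samples, split them into blocks, average each block,
    and return the median of the block means (robust to outliers)."""
    rng = np.random.default_rng(seed)
    samples = rng.permutation(np.asarray(samples, dtype=float))
    blocks = np.array_split(samples, n_blocks)
    return float(np.median([b.mean() for b in blocks]))

# Example: heavy-tailed (Pareto) samples where the plain mean is unstable.
rng = np.random.default_rng(0)
x = rng.pareto(a=1.5, size=2000) + 1.0        # true mean = a/(a-1) = 3
print("plain mean:", x.mean(), "median-of-means:", median_of_means(x))
```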