Error Bound of Empirical $\ell_2$ Risk Minimization for Noisy Standard
and Generalized Phase Retrieval Problems
- URL: http://arxiv.org/abs/2205.13827v1
- Date: Fri, 27 May 2022 08:38:33 GMT
- Title: Error Bound of Empirical $\ell_2$ Risk Minimization for Noisy Standard
and Generalized Phase Retrieval Problems
- Authors: Junren Chen, Michael K. Ng
- Abstract summary: A noisy generalized phase retrieval (NGPR) problem refers to the problem of estimating $x_0 \in \mathbb{C}^d$ from noisy quadratic samples.
Under different kinds of noise patterns, we establish error bounds that can imply approximate reconstruction.
We show the bounds are $O\big(\frac{\|\eta\|}{\sqrt{n}}\big)$ and $O\big(\|\eta\|_\infty\sqrt{\frac{d}{n}}\big)$ for general noise.
- Score: 18.81683506516341
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A noisy generalized phase retrieval (NGPR) problem refers to a problem of
estimating $x_0 \in \mathbb{C}^d$ by noisy quadratic samples
$\big\{x_0^*A_kx_0+\eta_k\big\}_{k=1}^n$ where $A_k$ is a Hermitian matrix and
$\eta_k$ is a noise scalar. When $A_k=\alpha_k\alpha_k^*$ for some
$\alpha_k\in\mathbb{C}^d$, it reduces to a standard noisy phase retrieval (NPR)
problem. The main aim of this paper is to study the estimation performance of
empirical $\ell_2$ risk minimization in both problems when $A_k$ in NGPR, or
$\alpha_k$ in NPR, is drawn from sub-Gaussian distribution. Under different
kinds of noise patterns, we establish error bounds that can imply approximate
reconstruction and these results are new in the literature. In NGPR, we show
the bounds are of $O\big(\frac{||\eta||}{\sqrt{n}}\big)$ and
$O\big(||\eta||_\infty \sqrt{\frac{d}{n}}\big)$ for general noise, and of
$O\big(\sqrt{\frac{d\log n}{n}}\big)$ and $O\big(\sqrt{\frac{d(\log
n)^2}{n}}\big)$ for random noise with sub-Gaussian and sub-exponential tail
respectively, where $\| \eta \|$ and $\| \eta \|_{\infty}$ are the 2-norm and
sup-norm of the noise vector $\eta=(\eta_k)_{k=1}^n$. Under heavy-tailed noise, by
truncating response outliers we propose a robust estimator that possesses an
error bound with a slower convergence rate. On the other hand, we show that in NPR
the bounds are of $O\big(\sqrt{\frac{d\log n}{n}}\big)$ and
$O\big(\sqrt{\frac{d(\log n)^2}{n}}\big)$ for sub-Gaussian and sub-exponential
noise respectively, which is essentially tighter than the existing bound
$O\big(\frac{||\eta||_2}{\sqrt{n}}\big)$. Although NGPR involving measurement
matrix $A_k$ is more computationally demanding than NPR involving measurement
vector $\alpha_k$, our results reveal that NGPR exhibits stronger robustness
than NPR under biased and deterministic noise. Experimental results are
presented to confirm and demonstrate our theoretical findings.
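For readers who want a concrete picture of the estimator being analyzed, the following NumPy sketch sets up an NGPR instance and minimizes the empirical $\ell_2$ risk $\frac{1}{n}\sum_{k=1}^n (x^*A_kx - y_k)^2$, together with the response-outlier truncation mentioned above for heavy-tailed noise. This is a minimal illustration, not the authors' code: the solver (gradient descent with a crude backtracking step), the truncation threshold, and all function names are our own assumptions, and the paper's bounds concern the global minimizer of the risk rather than any particular algorithm.

```python
# Minimal sketch (not the authors' implementation) of empirical l2 risk
# minimization for NGPR: estimate x0 in C^d from y_k = x0^* A_k x0 + eta_k
# with Hermitian Gaussian A_k. All names and parameter choices are illustrative.
import numpy as np

def random_hermitian(d, rng):
    """Hermitian measurement matrix with (sub-)Gaussian entries."""
    G = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return (G + G.conj().T) / 2

def quad_samples(x, A):
    """Noiseless responses x^* A_k x (real because each A_k is Hermitian)."""
    return np.real(np.einsum('i,kij,j->k', x.conj(), A, x))

def empirical_l2_risk(x, A, y):
    """(1/n) sum_k (x^* A_k x - y_k)^2, the objective studied in the paper."""
    return np.mean((quad_samples(x, A) - y) ** 2)

def grad_l2_risk(x, A, y):
    """Wirtinger gradient of the risk: (2/n) sum_k (x^* A_k x - y_k) A_k x."""
    resid = quad_samples(x, A) - y
    return 2.0 * np.einsum('k,kij,j->i', resid, A, x) / len(y)

def estimate_ngpr(A, y, n_iter=500, seed=0):
    """Gradient descent with crude backtracking from a random start.
    Illustrative only; the error bounds concern the global risk minimizer."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[1]) + 1j * rng.standard_normal(A.shape[1])
    step, risk = 1e-2, empirical_l2_risk(x, A, y)
    for _ in range(n_iter):
        g = grad_l2_risk(x, A, y)
        while step > 1e-12:                       # shrink step until risk decreases
            x_new = x - step * g
            risk_new = empirical_l2_risk(x_new, A, y)
            if risk_new < risk:
                break
            step /= 2
        if step <= 1e-12:
            break
        x, risk = x_new, risk_new
        step *= 1.5                               # let the step grow back
    return x

def truncate_responses(y, quantile=0.95):
    """Robust variant for heavy-tailed noise: keep samples whose response
    magnitude is below a data-driven threshold (response-outlier truncation)."""
    return np.abs(y) <= np.quantile(np.abs(y), quantile)

# Usage: Gaussian-noise NGPR instance, error measured up to a global phase.
rng = np.random.default_rng(1)
d, n = 8, 400
x0 = rng.standard_normal(d) + 1j * rng.standard_normal(d)
A = np.stack([random_hermitian(d, rng) for _ in range(n)])
y = quad_samples(x0, A) + 0.1 * rng.standard_normal(n)
keep = truncate_responses(y)                      # optional robustification
x_hat = estimate_ngpr(A[keep], y[keep])
phase = np.exp(1j * np.angle(np.vdot(x_hat, x0)))  # align the global phase
print("relative error:", np.linalg.norm(phase * x_hat - x0) / np.linalg.norm(x0))
```

The NPR special case is obtained by setting each $A_k = \alpha_k\alpha_k^*$ for a random vector $\alpha_k$; the same risk and gradient expressions apply.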
Related papers
- On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm [54.28350823319057]
This paper establishes the convergence rate $\frac{1}{K}\sum_{k=1}^K E\left[\|\nabla f(x^k)\|_1\right] \leq O\big(\frac{\sqrt{d}C}{K^{1/4}}\big)$ for AdamW measured by the $\ell_1$ norm, where $K$ represents the iteration number, $d$ denotes the model dimension, and $C$ matches the constant in the optimal convergence rate of SGD.
arXiv Detail & Related papers (2025-05-17T05:02:52Z) - Heavy-Tailed Linear Bandits: Huber Regression with One-Pass Update [62.96781471194877]
Two principled strategies for handling heavy-tailed noise, truncation and median-of-means, have been introduced to heavy-tailed bandits (a generic median-of-means sketch is given after this list).
We propose a one-pass algorithm based on the online mirror descent framework.
arXiv Detail & Related papers (2025-03-01T09:41:45Z) - Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness [74.18546828528298]
We show that SignSGD with Majority Voting can robustly work on the whole range of problem difficulty, with high-probability complexity bounds involving factors of $\frac{\kappa}{\kappa-1}$.
arXiv Detail & Related papers (2025-02-11T19:54:11Z) - On robust recovery of signals from indirect observations [0.24578723416255752]
We show how "presumably good" estimates of the sort may be constructed in the situation where the signal set is a convex set.
We consider two "uncertainty setups" in which $cal N$ is either a convex bounded set or is the set of sparse vectors.
arXiv Detail & Related papers (2025-01-03T18:17:47Z) - Revisiting Step-Size Assumptions in Stochastic Approximation [1.3654846342364308]
It is shown for the first time that this assumption is not required for convergence, and finer results are obtained. Rates of convergence are obtained for the standard algorithm and for estimates obtained via the averaging technique of Polyak and Ruppert. Results from numerical experiments illustrate that $\beta_\theta$ may be large due to a combination of multiplicative noise and Markovian memory.
arXiv Detail & Related papers (2024-05-28T05:11:05Z) - On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm [54.28350823319057]
This paper considers RMSProp and its momentum extension and establishes the convergence rate $\frac{1}{T}\sum_{k=1}^T E\left[\|\nabla f(x^k)\|_1\right] \leq O\big(\frac{\sqrt{d}}{T^{1/4}}\big)$. Our convergence rate matches the lower bound with respect to all the coefficients except the dimension $d$. Our convergence rate can be considered to be analogous to the corresponding rate of SGD.
arXiv Detail & Related papers (2024-02-01T07:21:32Z) - Spectral Statistics of the Sample Covariance Matrix for High Dimensional
Linear Gaussians [12.524855369455421]
We study the performance of the ordinary least squares (OLS) method for the estimation of a high-dimensional stable state transition matrix.
The OLS estimator incurs a phase transition and becomes transient: increasing the number of samples only worsens the estimation error.
arXiv Detail & Related papers (2023-12-10T06:55:37Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Distribution-Independent Regression for Generalized Linear Models with
Oblivious Corruptions [49.69852011882769]
We give the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise.
We present an algorithm that tackles this problem in its most general distribution-independent setting.
This is the first algorithmic result for GLM regression with oblivious noise which can handle more than half the samples being arbitrarily corrupted.
arXiv Detail & Related papers (2023-09-20T21:41:59Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega\big(d^{1/2}/(\max\{p, \epsilon\})^2\big)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Asymptotically Optimal Pure Exploration for Infinite-Armed Bandits [4.811176167998627]
We study pure exploration with infinitely many bandit arms generated i.i.d. from an unknown distribution.
Our goal is to efficiently select a single high quality arm whose average reward is, with probability $1-\delta$, within $\varepsilon$ of being among the top $\eta$-fraction of arms.
arXiv Detail & Related papers (2023-06-03T04:00:47Z) - Sparse Recovery with Shuffled Labels: Statistical Limits and Practical
Estimators [23.313461266708877]
We reconstruct the permutation matrix $\boldsymbol{\Pi}^{\mathrm{true}}$ and the sparse signal $\boldsymbol{\beta}^{\mathrm{true}}$ from shuffled labels.
We show that our proposed estimator can obtain the ground-truth $(\boldsymbol{\Pi}^{\mathrm{true}}, \mathrm{supp}(\boldsymbol{\beta}^{\mathrm{true}}))$ under mild conditions.
arXiv Detail & Related papers (2023-03-20T16:14:58Z) - Faster Rates of Convergence to Stationary Points in Differentially
Private Optimization [31.46358820048179]
We study the problem of approximating stationary points of Lipschitz and smooth functions under $(\varepsilon,\delta)$-differential privacy (DP).
A point $\widehat{w}$ is called an $\alpha$-stationary point of a function $F:\mathbb{R}^d\rightarrow\mathbb{R}$ if $\|\nabla F(\widehat{w})\|\leq \alpha$.
We provide a new efficient algorithm that finds an approximate stationary point at a faster rate than prior work.
arXiv Detail & Related papers (2022-06-02T02:43:44Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound $O\big(d^{\max\{1,p/2\}}\log^2 n\big)$ for loss functions with at most degree $p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - On the Power of Localized Perceptron for Label-Optimal Learning of
Halfspaces with Adversarial Noise [4.8728183994912415]
We study online active learning of homogeneous halfspaces in $\mathbb{R}^d$ with adversarial noise.
Our main contribution is a Perceptron-like active learning algorithm that runs in polynomial time.
Our algorithm learns the underlying halfspace with near-optimal label complexity.
arXiv Detail & Related papers (2020-12-19T22:04:10Z) - On the robustness of the minimum $\ell_2$ interpolator [2.918940961856197]
We analyse the interpolator with minimal $\ell_2$-norm $\hat{\beta}$ in a general high dimensional linear regression framework.
We prove that, with high probability, the prediction loss of this estimator is bounded from above by $\big(\|\beta^*\|^2 r_{cn}(\Sigma) \vee \|\xi\|^2\big)/n$, where $r_k(\Sigma) = \sum_{i\geq k}\lambda_i(\Sigma)$ are the rests of the sum of the eigenvalues of $\Sigma$.
arXiv Detail & Related papers (2020-03-12T15:12:28Z) - Tight Regret Bounds for Noisy Optimization of a Brownian Motion [118.65407541895851]
We consider the problem of Bayesian optimization of a one-dimensional Brownian motion in which the $T$ adaptively chosen observations are corrupted by Gaussian noise.
We show that the smallest possible expected cumulative regret and the smallest possible expected simple regret scale as $\Omega(\sigma\sqrt{T \cdot \log T})$.
arXiv Detail & Related papers (2020-01-25T14:44:53Z)
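As referenced from the heavy-tailed bandits entry above, which names truncation and median-of-means as the two principled strategies for heavy-tailed noise, here is a generic median-of-means mean estimator in NumPy. It is only an illustration of that classical strategy under our own naming; it is not the one-pass mirror-descent algorithm proposed in that paper.

```python
# Generic median-of-means estimator of a mean from heavy-tailed samples.
# Illustrates the classical strategy named above; not the cited paper's algorithm.
import numpy as np

def median_of_means(samples, n_blocks=8, seed=0):
    """Shuffle the samples, split them into blocks, average each block,
    and return the median of the block means (robust to outliers)."""
    rng = np.random.default_rng(seed)
    samples = rng.permutation(np.asarray(samples, dtype=float))
    blocks = np.array_split(samples, n_blocks)
    return float(np.median([b.mean() for b in blocks]))

# Example: heavy-tailed (Pareto) samples where the plain mean is unstable.
rng = np.random.default_rng(0)
x = rng.pareto(a=1.5, size=2000) + 1.0        # true mean = a/(a-1) = 3
print("plain mean:", x.mean(), "median-of-means:", median_of_means(x))
```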