Robust Nonparametric Regression under Poisoning Attack
- URL: http://arxiv.org/abs/2305.16771v2
- Date: Mon, 11 Dec 2023 07:09:45 GMT
- Title: Robust Nonparametric Regression under Poisoning Attack
- Authors: Puning Zhao, Zhiguo Wan
- Abstract summary: An adversarial attacker can modify the values of up to $q$ samples from a training dataset of size $N$.
Our initial solution is an M-estimator based on Huber loss minimization.
The final estimate is nearly minimax optimal for arbitrary $q$, up to a $\ln N$ factor.
- Score: 13.470899588917716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies robust nonparametric regression, in which an adversarial
attacker can modify the values of up to $q$ samples from a training dataset of
size $N$. Our initial solution is an M-estimator based on Huber loss
minimization. Compared with simple kernel regression, i.e. the Nadaraya-Watson
estimator, this method can significantly weaken the impact of malicious samples
on the regression performance. We provide the convergence rate as well as the
corresponding minimax lower bound. The result shows that, with proper bandwidth
selection, $\ell_\infty$ error is minimax optimal. The $\ell_2$ error is
optimal with relatively small $q$, but is suboptimal with larger $q$. The
reason is that this estimator is vulnerable if there are many attacked samples
concentrating in a small region. To address this issue, we propose a correction
method by projecting the initial estimate to the space of Lipschitz functions.
The final estimate is nearly minimax optimal for arbitrary $q$, up to a $\ln N$
factor.
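As a reading aid (not the authors' code), below is a minimal Python sketch of the two-step procedure the abstract describes: a local-constant M-estimate obtained by minimizing a kernel-weighted Huber loss at each query point, followed by a Lipschitz correction of the initial estimate. The Gaussian kernel, bandwidth h, Huber scale delta, Lipschitz constant L, and the envelope-midpoint construction used for the correction step are illustrative assumptions; the paper's exact kernel choice, tuning, and projection may differ.

```python
# Illustrative sketch of kernel Huber regression + Lipschitz correction.
# Not the authors' implementation; all tuning constants are placeholders.
import numpy as np
from scipy.optimize import minimize_scalar

def huber_loss(r, delta):
    """Huber loss: quadratic near zero, linear in the tails."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def kernel_huber_estimate(x0, X, Y, h=0.05, delta=1.0):
    """Local-constant M-estimate at x0: minimize the kernel-weighted Huber loss."""
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)          # Gaussian kernel weights
    obj = lambda theta: np.sum(w * huber_loss(Y - theta, delta))
    res = minimize_scalar(obj, bounds=(Y.min(), Y.max()), method="bounded")
    return res.x

def lipschitz_correct(grid, g, L):
    """Midpoint of the lower and upper L-Lipschitz envelopes of g on the grid,
    a standard sup-norm-optimal L-Lipschitz approximation (stand-in for the
    projection step described in the abstract)."""
    D = np.abs(grid[:, None] - grid[None, :])       # pairwise grid distances
    env_lo = np.min(g[None, :] + L * D, axis=1)     # largest L-Lipschitz function <= g
    env_hi = np.max(g[None, :] - L * D, axis=1)     # smallest L-Lipschitz function >= g
    return 0.5 * (env_lo + env_hi)

# Toy usage: smooth signal plus poisoned responses concentrated in one region.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 1.0, 500))
Y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(500)
Y[(X > 0.45) & (X < 0.50)] += 10.0                  # adversarial contamination
grid = np.linspace(0.0, 1.0, 200)
initial = np.array([kernel_huber_estimate(t, X, Y) for t in grid])
corrected = lipschitz_correct(grid, initial, L=2 * np.pi)
```

The correction step is what limits the damage from attacked samples concentrated in a small region: the Huber M-estimate can spike locally, but any such spike is incompatible with a Lipschitz function and is flattened by the envelope midpoint.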
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Bayes beats Cross Validation: Efficient and Accurate Ridge Regression
via Expectation Maximization [3.061662434597098]
We present a method for selecting the regularization hyperparameter, $\lambda$, that is faster to compute than leave-one-out cross-validation (LOOCV).
We show that the proposed method is guaranteed to find a unique optimal solution for large enough $n$, under relatively mild conditions.
arXiv Detail & Related papers (2023-10-29T01:13:55Z) - Robust Sparse Mean Estimation via Incremental Learning [15.536082641659423]
In this paper, we study the problem of robust mean estimation, where the goal is to estimate a $k$-sparse mean from a collection of partially corrupted samples.
We present a simple mean estimator that overcomes both challenges under moderate conditions.
Our method does not need any prior knowledge of the sparsity level $k$.
arXiv Detail & Related papers (2023-05-24T16:02:28Z) - Estimating Optimal Policy Value in General Linear Contextual Bandits [50.008542459050155]
In many bandit problems, the maximal reward achievable by a policy is often unknown in advance.
We consider the problem of estimating the optimal policy value in the sublinear data regime before the optimal policy is even learnable.
We present a more practical, computationally efficient algorithm that estimates a problem-dependent upper bound on $V^*$.
arXiv Detail & Related papers (2023-02-19T01:09:24Z) - Estimating the minimizer and the minimum value of a regression function
under passive design [72.85024381807466]
We propose a new method for estimating the minimizer $\boldsymbol{x}^*$ and the minimum value $f^*$ of a smooth and strongly convex regression function $f$.
We derive non-asymptotic upper bounds for the quadratic risk and optimization error of $\boldsymbol{z}_n$, and for the risk of estimating $f^*$.
arXiv Detail & Related papers (2022-11-29T18:38:40Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Consistent Estimation for PCA and Sparse Regression with Oblivious
Outliers [13.244654316770815]
We develop machinery to design efficiently computable and consistent estimators.
For sparse regression, we achieve consistency for optimal sample size $n \gtrsim (k\log d)/\alpha^2$.
In the context of PCA, we attain optimal error guarantees under broad spikiness assumptions on the parameter matrix.
arXiv Detail & Related papers (2021-11-04T15:59:44Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p\geq 2$, $\beta>\frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta>\frac{d}{2}$ or $p=\infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Outlier-robust sparse/low-rank least-squares regression and robust
matrix completion [1.0878040851637998]
We study high-dimensional least-squares regression within a subgaussian statistical learning framework with heterogeneous noise.
We also present a novel theory of trace-regression with matrix decomposition based on a new application of the product process.
arXiv Detail & Related papers (2020-12-12T07:42:47Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian
Classification [54.22421582955454]
We provide the first optimal minimax guarantees for the excess risk of adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.