Sample Complexity of Adversarially Robust Linear Classification on Separated Data
- URL: http://arxiv.org/abs/2012.10794v1
- Date: Sat, 19 Dec 2020 22:04:59 GMT
- Title: Sample Complexity of Adversarially Robust Linear Classification on Separated Data
- Authors: Robi Bhattacharjee, Somesh Jha, Kamalika Chaudhuri
- Abstract summary: We consider the sample complexity of learning with adversarial robustness.
We show that for very well-separated data, convergence rates of $O(\frac{1}{n})$ are achievable.
- Score: 41.30425613353595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the sample complexity of learning with adversarial robustness.
Most prior theoretical results for this problem have considered a setting where
different classes in the data are close together or overlapping. Motivated by
some real applications, we consider, in contrast, the well-separated case where
there exists a classifier with perfect accuracy and robustness, and show that
the sample complexity narrates an entirely different story. Specifically, for
linear classifiers, we show a large class of well-separated distributions where
the expected robust loss of any algorithm is at least $\Omega(\frac{d}{n})$,
whereas the max margin algorithm has expected standard loss $O(\frac{1}{n})$.
This shows a gap between the standard and robust losses that cannot be obtained via
prior techniques. Additionally, we present an algorithm that, given an instance
where the robustness radius is much smaller than the gap between the classes,
gives a solution with expected robust loss $O(\frac{1}{n})$. This shows that
for very well-separated data, convergence rates of $O(\frac{1}{n})$ are
achievable, which is not the case otherwise. Our results apply to robustness
measured in any $\ell_p$ norm with $p > 1$ (including $p = \infty$).
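To make the quantity in question concrete: for a linear classifier $\mathrm{sign}(w \cdot x + b)$, an $\ell_p$ adversary with budget $r$ can shift the score by at most $r\|w\|_q$, where $q$ is the dual exponent ($1/p + 1/q = 1$), so a point is robustly correct exactly when its signed margin exceeds $r\|w\|_q$. The sketch below evaluates this robust 0-1 loss; it is an illustration of the loss being studied, not the paper's algorithm, and all function and variable names are ours.

```python
import numpy as np

def robust_zero_one_loss(w, b, X, y, r, p=2.0):
    """Fraction of points an l_p adversary with budget r can misclassify.

    A perturbation delta with ||delta||_p <= r shifts the score w.x + b
    by at most r * ||w||_q, where q is the dual exponent (1/p + 1/q = 1),
    so a point is robustly correct iff its signed margin exceeds r*||w||_q.
    """
    q = np.inf if p == 1.0 else (1.0 if np.isinf(p) else p / (p - 1.0))
    worst_shift = r * np.linalg.norm(w, ord=q)
    margins = y * (X @ w + b)                    # labels y in {-1, +1}
    return float(np.mean(margins <= worst_shift))

# Toy well-separated instance: classes centered at +-3 along coordinate 0.
rng = np.random.default_rng(0)
d, n = 20, 500
X = rng.normal(size=(n, d))
X[:, 0] += np.where(rng.random(n) < 0.5, 3.0, -3.0)
y = np.sign(X[:, 0])
w = np.zeros(d); w[0] = 1.0                      # a perfect separator
print(robust_zero_one_loss(w, 0.0, X, y, r=0.5))  # small; exactly 0.0 at r=0
```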
Related papers
- Testable Learning with Distribution Shift [9.036777309376697]
We define a new model called testable learning with distribution shift.
We obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution.
We give several positive results for learning concept classes such as halfspaces, intersections of halfspaces, and decision trees.
arXiv Detail & Related papers (2023-11-25T23:57:45Z)
- Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z)
- Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D using the $k$-means++ implementation from sklearn as a black-box that validate our theoretical results.
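As a rough illustration of what replicability asks for (two executions with shared internal randomness should return identical outputs even on different samples from the same distribution), the sketch below wraps sklearn's k-means++ implementation, as in the paper's experiments, and snaps the centers to a randomly offset grid. The grid-rounding step is a generic device from the replicability literature and a simplified stand-in for the authors' routines; all parameters and names here are ours.

```python
import numpy as np
from sklearn.cluster import KMeans

def replicable_kmeans(X, k, shared_seed, grid=0.5):
    """Illustrative sketch: runs sharing `shared_seed` snap their
    (approximately stable) k-means++ centers to the same randomly
    offset grid, so they tend to return identical centers even on
    different samples. Simplified stand-in, not the paper's routine."""
    rng = np.random.default_rng(shared_seed)
    offset = rng.uniform(0.0, grid, size=X.shape[1])   # shared grid offset
    centers = KMeans(n_clusters=k, init="k-means++", n_init=10,
                     random_state=shared_seed).fit(X).cluster_centers_
    snapped = np.round((centers - offset) / grid) * grid + offset
    return snapped[np.lexsort(snapped.T[::-1])]        # canonical row order

# Two independent samples from the same mixture give the same output.
rng = np.random.default_rng(1)
sample = lambda: np.concatenate([rng.normal(m, 0.1, (200, 2)) for m in (-2, 0, 2)])
print(np.allclose(replicable_kmeans(sample(), 3, 42),
                  replicable_kmeans(sample(), 3, 42)))
```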
arXiv Detail & Related papers (2023-02-20T23:29:43Z)
- When are Local Queries Useful for Robust Learning? [25.832511407411637]
We study learning models where the learner is given more power through the use of local queries.
We give the first distribution-free algorithms that perform robust empirical risk minimization.
We finish by giving robust learning algorithms for halfspaces on $\{0,1\}^n$ and then obtaining robustness guarantees for halfspaces in $\mathbb{R}^n$ against precision-bounded adversaries.
arXiv Detail & Related papers (2022-10-12T11:04:22Z)
- Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z)
- Robust Sparse Mean Estimation via Sum of Squares [42.526664955704746]
We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers.
Our algorithms follow the Sum-of-Squares based proofs-to-algorithms approach.
arXiv Detail & Related papers (2022-06-07T16:49:54Z)
- Sample Complexity Bounds for Robustly Learning Decision Lists against Evasion Attacks [25.832511407411637]
A fundamental problem in adversarial machine learning is to quantify how much training data is needed in the presence of evasion attacks.
We work with probability distributions on the input data that satisfy a Lipschitz condition: nearby points have similar probability.
For every fixed $k$, the class of $k$-decision lists has polynomial sample complexity against a $\log(n)$-bounded adversary.
arXiv Detail & Related papers (2022-05-12T14:40:18Z)
- Sampling a Near Neighbor in High Dimensions -- Who is the Fairest of Them All? [17.514226416388475]
We study the $r$-NN problem through the lens of individual fairness and equal opportunity.
All points within distance $r$ of the query should have the same probability of being returned.
We propose several efficient data structures for the exact and approximate variants of the fair NN problem.
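The fairness requirement itself has a one-line brute-force specification: collect every point within distance $r$ of the query and sample uniformly among them. The paper's contribution is data structures that meet this specification sublinearly; the linear scan below (names ours) is only the reference behavior such structures must match.

```python
import numpy as np

def fair_r_nn_sample(points, query, r, rng):
    """Exact fair r-near-neighbor sampling by linear scan: every point
    within distance r of the query is returned with equal probability.
    (The paper's data structures achieve this sublinearly; this scan is
    just the specification they must match.)"""
    idx = np.flatnonzero(np.linalg.norm(points - query, axis=1) <= r)
    return None if idx.size == 0 else int(rng.choice(idx))

rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 8))
print(fair_r_nn_sample(pts, pts[0], r=1.5, rng=rng))
```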
arXiv Detail & Related papers (2021-01-26T16:13:07Z)
- List-Decodable Mean Estimation in Nearly-PCA Time [50.79691056481693]
We study the fundamental task of list-decodable mean estimation in high dimensions.
Our algorithm runs in time $\widetilde{O}(ndk)$ for all $k = O(\sqrt{d}) \cup \Omega(d)$, where $n$ is the size of the dataset.
A variant of our algorithm has runtime $\widetilde{O}(ndk)$ for all $k$, at the expense of an $O(\sqrt{\log k})$ factor in the recovery guarantee.
arXiv Detail & Related papers (2020-11-19T17:21:37Z)
- FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
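For orientation, the sketch below is the classical dense Gaussian model-X knockoff sampler with the equi-correlated choice of $s$ (assuming $\Sigma$ is a correlation matrix), i.e. the $O(p^3)$ baseline that FANOK's factored, linear-time algorithms are designed to replace. It is the textbook construction, not FANOK itself, and the names are ours.

```python
import numpy as np

def gaussian_knockoffs(X, Sigma, rng):
    """Dense model-X knockoffs for rows X_i ~ N(0, Sigma), using the
    equi-correlated choice s_j = min(1, 2*lambda_min(Sigma)) (assumes
    Sigma is a correlation matrix). Conditional law, with D = diag(s):
      Xt | X ~ N(X (I - Sigma^{-1} D), 2D - D Sigma^{-1} D).
    FANOK avoids this O(p^3) algebra; this is only the baseline."""
    p = Sigma.shape[0]
    s = min(1.0, 2.0 * float(np.linalg.eigvalsh(Sigma).min()))
    D = s * np.eye(p)
    Sinv_D = np.linalg.solve(Sigma, D)               # Sigma^{-1} D
    mean = X - X @ Sinv_D                            # conditional mean
    cov = 2.0 * D - D @ Sinv_D                       # conditional covariance
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(p))  # jitter for PSD safety
    return mean + rng.normal(size=X.shape) @ L.T

rng = np.random.default_rng(0)
p = 5
Sigma = 0.3 * np.ones((p, p)) + 0.7 * np.eye(p)      # a correlation matrix
X = rng.multivariate_normal(np.zeros(p), Sigma, size=100)
Xt = gaussian_knockoffs(X, Sigma, rng)
```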
arXiv Detail & Related papers (2020-06-15T21:55:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.