Average-case Complexity of Teaching Convex Polytopes via Halfspace Queries
- URL: http://arxiv.org/abs/2006.14677v2
- Date: Sun, 25 Oct 2020 23:58:40 GMT
- Title: Average-case Complexity of Teaching Convex Polytopes via Halfspace Queries
- Authors: Akash Kumar, Adish Singla, Yisong Yue, Yuxin Chen
- Abstract summary: We show that the average-case teaching complexity is $\Theta(d)$, which is in sharp contrast to the worst-case teaching complexity of $\Theta(n)$.
Our insights allow us to establish a tight bound on the average-case complexity for $\phi$-separable dichotomies.
- Score: 55.28642461328172
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We examine the task of locating a target region among those induced by
intersections of $n$ halfspaces in $\mathbb{R}^d$. This generic task connects
to fundamental machine learning problems, such as training a perceptron and
learning a $\phi$-separable dichotomy. We investigate the average teaching
complexity of the task, i.e., the minimal number of samples (halfspace queries)
required by a teacher to help a version-space learner in locating a randomly
selected target. As our main result, we show that the average-case teaching
complexity is $\Theta(d)$, which is in sharp contrast to the worst-case
teaching complexity of $\Theta(n)$. If, instead, we consider the average-case
learning complexity, the bounds have a dependency on $n$ as $\Theta(n)$ for
i.i.d. queries and $\Theta(d \log(n))$ for actively chosen queries by the
learner. Our proof techniques are based on novel insights from computational
geometry, which allow us to count the number of convex polytopes and faces in a
Euclidean space depending on the arrangement of halfspaces. Our insights allow
us to establish a tight bound on the average-case complexity for
$\phi$-separable dichotomies, which generalizes the known $\mathcal{O}(d)$
bound on the average number of "extreme patterns" in the classical
computational geometry literature (Cover, 1965).
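To make the query model concrete, the following is a minimal sketch (not taken from the paper): it approximates the cells of a random halfspace arrangement in $\mathbb{R}^d$ by the sign patterns realized on a point cloud, then simulates a version-space learner that receives halfspace queries (the side of the target with respect to one halfspace) in uniformly random order. All parameter names (`n`, `d`, `num_points`) and the sampling scheme are illustrative assumptions, not the paper's construction.

```python
# Illustrative sketch only: cells of a halfspace arrangement are approximated
# by sign patterns of sampled points; a version-space learner then narrows
# down the target cell from halfspace queries answered in random order.
import numpy as np

rng = np.random.default_rng(0)
n, d, num_points = 40, 3, 20000           # halfspaces, dimension, sample cloud (assumed values)

W = rng.normal(size=(n, d))                # halfspace normals
b = rng.normal(size=n)                     # offsets: halfspace i is {x : W[i] @ x >= b[i]}
X = rng.normal(size=(num_points, d))       # points used to approximate the cells

signs = (X @ W.T >= b).astype(np.int8)     # sign pattern of every sampled point
cells = {tuple(s) for s in signs}          # distinct (approximate) cells of the arrangement
print(f"approximate number of cells induced by {n} halfspaces in R^{d}: {len(cells)}")

target = rng.normal(size=d)                # a randomly chosen target location
target_pattern = tuple((W @ target >= b).astype(np.int8))

# Version-space learner: after each answered halfspace query, keep only the
# cells consistent with all answers so far. Queries arrive in a uniformly
# random order here, a rough stand-in for the paper's i.i.d. query model.
version_space = set(cells) | {target_pattern}
for t, i in enumerate(rng.permutation(n), start=1):
    answer = target_pattern[i]
    version_space = {c for c in version_space if c[i] == answer}
    if len(version_space) == 1:
        print(f"target cell identified after {t} randomly ordered halfspace queries")
        break
```

Per the abstract, a teacher can instead select $\Theta(d)$ halfspace queries on average, whereas a learner receiving i.i.d. queries needs $\Theta(n)$; the sketch above only illustrates the query interface and version-space updates, not the paper's counting arguments over polytopes and faces.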
Related papers
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Information-Computation Tradeoffs for Learning Margin Halfspaces with
Random Classification Noise [50.64137465792738]
We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise.
We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms.
arXiv Detail & Related papers (2023-06-28T16:33:39Z) - An Approximation Theory for Metric Space-Valued Functions With A View
Towards Deep Learning [25.25903127886586]
We build universal function approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$.
In particular, we show that the required number of Dirac measures is determined by the structure of $\mathcal{X}$ and $\mathcal{Y}$.
arXiv Detail & Related papers (2023-04-24T16:18:22Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Statistically Near-Optimal Hypothesis Selection [33.83129262033921]
We derive an optimal $2$-approximation learning strategy for the Hypothesis Selection problem.
This is the first algorithm that simultaneously achieves the best approximation factor and sample complexity.
arXiv Detail & Related papers (2021-08-17T21:11:20Z) - Classification Under Misspecification: Halfspaces, Generalized Linear
Models, and Connections to Evolvability [39.01599245403753]
In particular, we study the problem of learning halfspaces under Massart noise with rate $\eta$.
We show any SQ algorithm requires super-polynomially many queries to achieve $\mathsf{OPT} + \epsilon$.
We also study our algorithm for learning halfspaces under Massart noise empirically and find that it exhibits some appealing fairness properties.
arXiv Detail & Related papers (2020-06-08T17:59:11Z) - Learning Polynomials of Few Relevant Dimensions [12.122488725065388]
Polynomial regression is a basic primitive in learning and statistics.
We give an algorithm that learns the polynomial within accuracy $\epsilon$ with sample complexity roughly $N = O_{r,d}(n \log^2(1/\epsilon) (\log n)^d)$ and runtime $O_{r,d}(N n^2)$.
arXiv Detail & Related papers (2020-04-28T18:00:18Z) - Private Learning of Halfspaces: Simplifying the Construction and
Reducing the Sample Complexity [63.29100726064574]
We present a differentially private learner for halfspaces over a finite grid $G$ in $\mathbb{R}^d$ with sample complexity $\approx d^{2.5} \cdot 2^{\log^*|G|}$.
The building block for our learner is a new differentially private algorithm for approximately solving the linear feasibility problem.
arXiv Detail & Related papers (2020-04-16T16:12:10Z)