Average-case Complexity of Teaching Convex Polytopes via Halfspace Queries
- URL: http://arxiv.org/abs/2006.14677v2
- Date: Sun, 25 Oct 2020 23:58:40 GMT
- Title: Average-case Complexity of Teaching Convex Polytopes via Halfspace Queries
- Authors: Akash Kumar, Adish Singla, Yisong Yue, Yuxin Chen
- Abstract summary: We show that the average-case teaching complexity is $\Theta(d)$, which is in sharp contrast to the worst-case teaching complexity of $\Theta(n)$.
Our insights allow us to establish a tight bound on the average-case complexity for $\phi$-separable dichotomies.
- Score: 55.28642461328172
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We examine the task of locating a target region among those induced by
intersections of $n$ halfspaces in $\mathbb{R}^d$. This generic task connects
to fundamental machine learning problems, such as training a perceptron and
learning a $\phi$-separable dichotomy. We investigate the average teaching
complexity of the task, i.e., the minimal number of samples (halfspace queries)
required by a teacher to help a version-space learner in locating a randomly
selected target. As our main result, we show that the average-case teaching
complexity is $\Theta(d)$, which is in sharp contrast to the worst-case
teaching complexity of $\Theta(n)$. If, instead, we consider the average-case
learning complexity, the bounds have a dependency on $n$ as $\Theta(n)$ for
i.i.d. queries and $\Theta(d \log(n))$ for actively chosen queries by the
learner. Our proof techniques are based on novel insights from computational
geometry, which allow us to count the number of convex polytopes and faces in a
Euclidean space depending on the arrangement of halfspaces. Our insights allow
us to establish a tight bound on the average-case complexity for
$\phi$-separable dichotomies, which generalizes the known $\mathcal{O}(d)$
bound on the average number of "extreme patterns" in the classical
computational geometry literature (Cover, 1965).
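To make the query model concrete, the following is a minimal sketch (not taken from the paper): it approximates the cells of a random halfspace arrangement in $\mathbb{R}^d$ by the sign patterns realized on a point cloud, then simulates a version-space learner that receives halfspace queries (the side of the target with respect to one halfspace) in uniformly random order. All parameter names (`n`, `d`, `num_points`) and the sampling scheme are illustrative assumptions, not the paper's construction.

```python
# Illustrative sketch only: cells of a halfspace arrangement are approximated
# by sign patterns of sampled points; a version-space learner then narrows
# down the target cell from halfspace queries answered in random order.
import numpy as np

rng = np.random.default_rng(0)
n, d, num_points = 40, 3, 20000           # halfspaces, dimension, sample cloud (assumed values)

W = rng.normal(size=(n, d))                # halfspace normals
b = rng.normal(size=n)                     # offsets: halfspace i is {x : W[i] @ x >= b[i]}
X = rng.normal(size=(num_points, d))       # points used to approximate the cells

signs = (X @ W.T >= b).astype(np.int8)     # sign pattern of every sampled point
cells = {tuple(s) for s in signs}          # distinct (approximate) cells of the arrangement
print(f"approximate number of cells induced by {n} halfspaces in R^{d}: {len(cells)}")

target = rng.normal(size=d)                # a randomly chosen target location
target_pattern = tuple((W @ target >= b).astype(np.int8))

# Version-space learner: after each answered halfspace query, keep only the
# cells consistent with all answers so far. Queries arrive in a uniformly
# random order here, a rough stand-in for the paper's i.i.d. query model.
version_space = set(cells) | {target_pattern}
for t, i in enumerate(rng.permutation(n), start=1):
    answer = target_pattern[i]
    version_space = {c for c in version_space if c[i] == answer}
    if len(version_space) == 1:
        print(f"target cell identified after {t} randomly ordered halfspace queries")
        break
```

Per the abstract, a teacher can instead select $\Theta(d)$ halfspace queries on average, whereas a learner receiving i.i.d. queries needs $\Theta(n)$; the sketch above only illustrates the query interface and version-space updates, not the paper's counting arguments over polytopes and faces.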
Related papers
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Information-Computation Tradeoffs for Learning Margin Halfspaces with
Random Classification Noise [50.64137465792738]
We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise.
We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms.
arXiv Detail & Related papers (2023-06-28T16:33:39Z) - An Approximation Theory for Metric Space-Valued Functions With A View
Towards Deep Learning [25.25903127886586]
We build universal function approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$.
In particular, we show that the required number of Dirac measures is determined by the structure of $\mathcal{X}$ and $\mathcal{Y}$.
arXiv Detail & Related papers (2023-04-24T16:18:22Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Statistically Near-Optimal Hypothesis Selection [33.83129262033921]
We derive an optimal $2$-approximation learning strategy for the Hypothesis Selection problem.
This is the first algorithm that simultaneously achieves the best approximation factor and sample complexity.
arXiv Detail & Related papers (2021-08-17T21:11:20Z) - Classification Under Misspecification: Halfspaces, Generalized Linear
Models, and Connections to Evolvability [39.01599245403753]
In particular, we study the problem of learning halfspaces under Massart noise with rate $\eta$.
We show any SQ algorithm requires super-polynomially many queries to achieve $\mathsf{OPT} + \epsilon$.
We also study our algorithm for learning halfspaces under Massart noise empirically and find that it exhibits some appealing fairness properties.
arXiv Detail & Related papers (2020-06-08T17:59:11Z) - Learning Polynomials of Few Relevant Dimensions [12.122488725065388]
Polynomial regression is a basic primitive in learning and statistics.
We give an algorithm that learns the polynomial within accuracy $\epsilon$ with sample complexity roughly $N = O_{r,d}(n \log^2(1/\epsilon) (\log n)^d)$ and runtime $O_{r,d}(N n^2)$.
arXiv Detail & Related papers (2020-04-28T18:00:18Z) - Private Learning of Halfspaces: Simplifying the Construction and
Reducing the Sample Complexity [63.29100726064574]
We present a differentially private learner for halfspaces over a finite grid $G$ in $\mathbb{R}^d$ with sample complexity $\approx d^{2.5} \cdot 2^{\log^*|G|}$.
The building block for our learner is a new differentially private algorithm for approximately solving the linear feasibility problem.
arXiv Detail & Related papers (2020-04-16T16:12:10Z)