The Sample Complexity of Multi-Distribution Learning for VC Classes
- URL: http://arxiv.org/abs/2307.12135v1
- Date: Sat, 22 Jul 2023 18:02:53 GMT
- Title: The Sample Complexity of Multi-Distribution Learning for VC Classes
- Authors: Pranjal Awasthi, Nika Haghtalab, Eric Zhao
- Abstract summary: Multi-distribution learning is a generalization of PAC learning to settings with multiple data distributions.
There remains a significant gap between the known upper and lower bounds for PAC-learnable classes.
We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-distribution learning is a natural generalization of PAC learning to
settings with multiple data distributions. There remains a significant gap
between the known upper and lower bounds for PAC-learnable classes. In
particular, though we understand the sample complexity of learning a VC
dimension $d$ class on $k$ distributions to be $O(\epsilon^{-2} \ln(k)(d + k) +
\min\{\epsilon^{-1} dk, \epsilon^{-4} \ln(k) d\})$, the best lower bound is
$\Omega(\epsilon^{-2}(d + k \ln(k)))$. We discuss recent progress on this
problem and some hurdles that are fundamental to the use of game dynamics in
statistical learning.
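The abstract points to game dynamics as the algorithmic tool behind these upper bounds. Below is a minimal, hypothetical sketch of that general scheme: a Hedge (multiplicative-weights) player over the $k$ distributions repeatedly reweights them against an ERM oracle, so that the learner is forced to do well on its worst distribution. The interfaces `sample`, `erm_oracle`, and `error_est` are assumptions made for illustration; this is not the paper's actual algorithm.
```python
import numpy as np

def multi_distribution_learn(sample, erm_oracle, error_est, k, rounds, eta=0.1):
    """Sketch of multi-distribution learning via game dynamics
    (Hedge over distributions vs. an ERM oracle). Hypothetical interfaces:
      sample(i, n)      -- draw n labeled examples from distribution i
      erm_oracle(data)  -- return a hypothesis minimizing empirical error
      error_est(h, i)   -- estimate err_{D_i}(h) from fresh samples
    Illustrative only; not the algorithm analyzed in the paper."""
    weights = np.ones(k) / k
    hypotheses = []
    for _ in range(rounds):
        # Draw a batch from the mixture sum_i weights[i] * D_i.
        counts = np.random.multinomial(100, weights)
        batch = [ex for i, n in enumerate(counts) for ex in sample(i, n)]
        h = erm_oracle(batch)
        hypotheses.append(h)
        # Hedge update: upweight distributions on which h does poorly.
        errs = np.array([error_est(h, i) for i in range(k)])
        weights *= np.exp(eta * errs)
        weights /= weights.sum()
    return hypotheses  # predict with a uniformly random hypothesis from this list
```
In such game-dynamics analyses, it is the randomization (or averaging) over the returned hypotheses that converts the no-regret guarantee into a worst-distribution error bound.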
Related papers
- Fast Rates for Bandit PAC Multiclass Classification [73.17969992976501]
We study multiclass PAC learning with bandit feedback, where inputs are classified into one of $K$ possible labels and feedback is limited to whether or not the predicted label is correct.
Our main contribution is a novel learning algorithm for the agnostic $(\varepsilon,\delta)$-PAC version of the problem; the interaction protocol is sketched below.
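To make the bandit feedback model concrete, here is a minimal sketch of the interaction loop; `learner` and `env` are hypothetical interfaces, and nothing here reflects the paper's actual algorithm.
```python
def bandit_multiclass_protocol(learner, env, K, T):
    """Sketch of the bandit multiclass interaction: the learner predicts one
    of K labels and observes only whether its prediction was correct.
    Hypothetical interfaces:
      env.next_input()        -- draw the next input x
      env.is_correct(x, y)    -- True iff y is the true label of x
      learner.predict(x)      -- learner's guessed label in {0, ..., K-1}
      learner.update(x, y, c) -- learner sees only the bandit bit c"""
    mistakes = 0
    for _ in range(T):
        x = env.next_input()
        y = learner.predict(x)
        correct = env.is_correct(x, y)   # the true label itself is never revealed
        learner.update(x, y, correct)
        mistakes += not correct
    return mistakes
```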
arXiv Detail & Related papers (2024-06-18T08:54:04Z)
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z)
- The sample complexity of multi-distribution learning [17.45683822446751]
We show that an algorithm with sample complexity $\widetilde{O}((d+k)\epsilon^{-2}) \cdot (k/\epsilon)^{o(1)}$ resolves the COLT 2023 open problem of Awasthi, Haghtalab and Zhao.
arXiv Detail & Related papers (2023-12-07T03:53:17Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy; the input model is sketched below.
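To make the setup concrete, the following sketch generates data from the stated model and evaluates the square loss; the weights and directions are arbitrary illustrative choices, and this is not the Schur-polynomial algorithm itself.
```python
import numpy as np

def relu_combination_data(d, k, n, rng=np.random.default_rng(0)):
    """Sketch of the learning setup: labels come from
    f(x) = sum_i w_i * ReLU(<a_i, x>) with x ~ N(0, I_d).
    The choices of w and a_i below are illustrative assumptions."""
    A = rng.standard_normal((k, d))
    A /= np.linalg.norm(A, axis=1, keepdims=True)  # unit-norm directions
    w = rng.standard_normal(k)                     # combination coefficients
    X = rng.standard_normal((n, d))                # standard Gaussian inputs
    y = np.maximum(X @ A.T, 0.0) @ w               # sum_i w_i ReLU(<a_i, x>)
    return X, y

def square_loss(h, X, y):
    """Empirical square loss of a hypothesis h against labels y."""
    return np.mean((h(X) - y) ** 2)
```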
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z)
- Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures [9.053430799456587]
We study the problem of learning nonparametric distributions in a finite mixture.
We establish tight bounds on the sample complexity for learning the component distributions in such models.
arXiv Detail & Related papers (2022-03-28T23:53:48Z)
- Private and polynomial time algorithms for learning Gaussians and beyond [13.947461378608525]
We present a framework for reducing $(\varepsilon, \delta)$ differentially private (DP) statistical estimation to its non-private counterpart.
We provide the first polynomial-time $(\varepsilon, \delta)$-DP algorithm for robust learning of (unrestricted) Gaussians.
arXiv Detail & Related papers (2021-11-22T16:25:51Z)
- Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks [48.32532049640782]
We study the problem of learning one-hidden-layer ReLU networks with $k$ hidden units on $\mathbb{R}^d$ under Gaussian marginals.
For the case of positive coefficients, we give the first polynomial-time algorithm for this learning problem for $k$ up to $\tilde{O}(\sqrt{\log d})$.
arXiv Detail & Related papers (2020-06-22T17:53:54Z)
- Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, a discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity; a generic model-free sketch of the setting follows below.
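For concreteness, the sketch below shows textbook tabular Q-learning, the standard model-free approach to this setting; the `env` interface is a hypothetical assumption, and this is an illustration of the problem, not the paper's improved algorithm.
```python
import numpy as np

def q_learning(env, S, A, gamma, episodes, alpha=0.1, eps_greedy=0.1,
               rng=np.random.default_rng(0)):
    """Generic model-free Q-learning for a discounted tabular MDP with
    S states and A actions. Hypothetical env interface (an assumption):
      env.reset() -> s,  env.step(s, a) -> (s_next, reward, done)."""
    Q = np.zeros((S, A))
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy exploration over the current Q estimates.
            a = rng.integers(A) if rng.random() < eps_greedy else int(Q[s].argmax())
            s_next, r, done = env.step(s, a)
            # Temporal-difference update toward the Bellman target.
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next
    return Q.argmax(axis=1)  # greedy policy; near-optimal if Q is accurate
```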
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
- Proper Learning, Helly Number, and an Optimal SVM Bound [29.35254938542589]
We characterize classes for which the optimal sample complexity can be achieved by a proper learning algorithm.
We show that the dual Helly number is bounded if and only if a proper learner can achieve the optimal joint dependence on $\epsilon$ and $\delta$.
arXiv Detail & Related papers (2020-05-24T18:11:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.