The Polynomial Method is Universal for Distribution-Free Correlational
SQ Learning
- URL: http://arxiv.org/abs/2010.11925v3
- Date: Thu, 24 Aug 2023 12:58:14 GMT
- Title: The Polynomial Method is Universal for Distribution-Free Correlational
SQ Learning
- Authors: Aravind Gollakota, Sushrut Karmalkar, Adam Klivans
- Abstract summary: Generalizing Malach and Shalev-Shwartz (2022), who gave tight correlational SQ (CSQ) lower bounds for learning DNF formulas.
Lower bounds on the threshold or approximate degree of any function class directly imply CSQ lower bounds for PAC or agnostic learning, respectively.
- Score: 9.036124518750732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of distribution-free learning for Boolean function
classes in the PAC and agnostic models. Generalizing a beautiful work of Malach
and Shalev-Shwartz (2022) that gave tight correlational SQ (CSQ) lower bounds
for learning DNF formulas, we give new proofs that lower bounds on the
threshold or approximate degree of any function class directly imply CSQ lower
bounds for PAC or agnostic learning respectively. While such bounds implicitly
follow by combining prior results by Feldman (2008, 2012) and Sherstov (2008,
2011), to our knowledge the precise statements we give had not appeared in this
form before. Moreover, our proofs are simple and largely self-contained.
These lower bounds match corresponding positive results using upper bounds on
the threshold or approximate degree in the SQ model for PAC or agnostic
learning, and in this sense these results show that the polynomial method is a
universal, best-possible approach for distribution-free CSQ learning.
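To make the statement concrete, here is a sketch in standard notation; the definitions below are standard, while the quantitative form of the implication is rendered in the usual shape of such results rather than quoted from the paper. A correlational statistical query specifies a function $h : \{-1,1\}^n \to [-1,1]$ and receives any value $v$ with
\[
  \bigl| v - \mathbb{E}_{x \sim D}[\, h(x)\, f(x) \,] \bigr| \le \tau,
\]
where $f$ is the unknown target and $\tau$ is the query tolerance. The two degree notions are
\[
  \deg_{\pm}(f) = \min \bigl\{ \deg p : \operatorname{sign}(p(x)) = f(x) \text{ for all } x \in \{-1,1\}^n \bigr\},
\]
\[
  \widetilde{\deg}_{\epsilon}(f) = \min \bigl\{ \deg p : |p(x) - f(x)| \le \epsilon \text{ for all } x \in \{-1,1\}^n \bigr\}.
\]
Schematically, the results say that if $\deg_{\pm}(\mathcal{C}) \ge d$ (resp. $\widetilde{\deg}_{\epsilon}(\mathcal{C}) \ge d$), then any CSQ algorithm that PAC learns (resp. agnostically learns) $\mathcal{C}$ must use either $2^{\Omega(d)}$ queries or tolerance at most $2^{-\Omega(d)}$.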
Related papers
- Low degree conjecture implies sharp computational thresholds in stochastic block model [3.7873597471903935]
We investigate implications of the (extended) low-degree conjecture, recently formalized in [MW23], in the context of the symmetric stochastic block model.
We establish that no polynomial-time algorithm can weakly recover community labels below the Kesten-Stigum (KS) threshold.
Our results provide evidence of a computational-to-statistical gap in learning the parameters of block models.
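For reference, in the two-community case with within- and between-community edge probabilities $a/n$ and $b/n$, the KS threshold takes its textbook form (stated here as background, not taken from this paper's abstract):
\[
  \frac{(a-b)^2}{2(a+b)} = 1,
\]
with weak recovery efficiently achievable above the threshold and, assuming the low-degree conjecture, unachievable by polynomial-time algorithms below it.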
arXiv Detail & Related papers (2025-02-20T20:21:03Z)
- Stochastic Approximation with Unbounded Markovian Noise: A General-Purpose Theorem [7.443139252028032]
We consider average-reward Reinforcement Learning with unbounded state space and reward function.
Recent works studied this problem in the actor-critic framework.
We study Temporal Difference (TD) learning with linear function approximation.
arXiv Detail & Related papers (2024-10-29T03:40:53Z)
- Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability [71.82666334363174]
We develop a unified framework for lower bound methods in statistical estimation and interactive decision making.
We introduce a novel complexity measure, the decision dimension, which facilitates the derivation of new lower bounds for interactive decision making.
arXiv Detail & Related papers (2024-10-07T15:14:58Z)
- Fine-grained analysis of non-parametric estimation for pairwise learning [9.676007573960383]
We are concerned with the generalization performance of non-parametric estimation for pairwise learning.
Our results can be used to handle a wide range of pairwise learning problems, including ranking, AUC maximization, pairwise regression, and metric and similarity learning.
arXiv Detail & Related papers (2023-05-31T08:13:14Z)
- High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the use of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z)
- Fine-Grained Distribution-Dependent Learning Curves [27.09513298165498]
Learning curves plot the expected error of a learning algorithm as a function of the number of labeled input samples.
In this paper we introduce a new dimension-based characterization that improves and refines the recent results of Bousquet et al.
Our characterization sheds new light on the structure of learning curves by providing fine-grained bounds.
arXiv Detail & Related papers (2022-08-31T03:29:21Z)
- Hardness of Noise-Free Learning for Two-Hidden-Layer Neural Networks [21.687992445577226]
We give exponential statistical query (SQ) lower bounds for learning two-hidden-layer ReLU networks with respect to Gaussian inputs in the standard (noise-free) model.
Previous SQ lower bounds held only for adversarial noise models (agnostic learning) or restricted models such as correlational SQ.
We show how to extend the underlying technique to other learning models and, in many well-studied cases, obtain a more efficient reduction.
arXiv Detail & Related papers (2022-02-10T18:59:14Z)
- A Boosting Approach to Reinforcement Learning [59.46285581748018]
We study efficient algorithms for reinforcement learning in decision processes, with complexity independent of the number of states.
We give an efficient algorithm that is capable of improving the accuracy of such weak learning methods.
arXiv Detail & Related papers (2021-08-22T16:00:45Z)
- Agnostic Proper Learning of Halfspaces under Gaussian Marginals [56.01192577666607]
We study the problem of agnostically learning halfspaces under the Gaussian distribution.
Our main result is the first proper learning algorithm for this problem.
arXiv Detail & Related papers (2021-02-10T18:40:44Z)
- The Optimality of Polynomial Regression for Agnostic Learning under Gaussian Marginals [47.81107898315438]
We develop a method for finding hard families of examples for a wide class of problems by using LP duality.
We show that $L_1$-regression is essentially best possible, and therefore that the computational difficulty of learning a concept class is closely related to the degree required to approximate any function from the class in the $L_1$-norm.
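The matching upper bound referenced here is the standard $L_1$-polynomial regression guarantee of Kalai et al. (2008); a sketch of the statement, with constants suppressed:
\[
  \sup_{f \in \mathcal{C}} \; \min_{\deg p \le d} \; \mathbb{E}_{x \sim D} \bigl| p(x) - f(x) \bigr| \le \epsilon
  \;\Longrightarrow\;
  \mathcal{C} \text{ is agnostically learnable to error } \mathrm{opt} + O(\epsilon) \text{ in time } n^{O(d)}.
\]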
arXiv Detail & Related papers (2021-02-08T18:06:32Z)
- Lower bounds in multiple testing: A framework based on derandomized proxies [107.69746750639584]
This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models.
We provide numerical simulations of some of these lower bounds, and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
arXiv Detail & Related papers (2020-05-07T19:59:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.