When Hardness of Approximation Meets Hardness of Learning
- URL: http://arxiv.org/abs/2008.08059v2
- Date: Sun, 23 Aug 2020 13:46:10 GMT
- Title: When Hardness of Approximation Meets Hardness of Learning
- Authors: Eran Malach, Shai Shalev-Shwartz
- Abstract summary: We show a single hardness property that implies both hardness of approximation using linear classes and shallow networks, and hardness of learning using correlation queries and gradient-descent.
This allows us to obtain new results on hardness of approximation and learnability of parity functions, DNF formulas and $AC^0$ circuits.
- Score: 35.39956227364153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A supervised learning algorithm has access to a distribution of labeled
examples, and needs to return a function (hypothesis) that correctly labels the
examples. The hypothesis of the learner is taken from some fixed class of
functions (e.g., linear classifiers, neural networks etc.). A failure of the
learning algorithm can occur due to two possible reasons: wrong choice of
hypothesis class (hardness of approximation), or failure to find the best
function within the hypothesis class (hardness of learning). Although both
approximation and learnability are important for the success of the algorithm,
they are typically studied separately. In this work, we show a single hardness
property that implies both hardness of approximation using linear classes and
shallow networks, and hardness of learning using correlation queries and
gradient-descent. This allows us to obtain new results on hardness of
approximation and learnability of parity functions, DNF formulas and $AC^0$
circuits.
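To make the shared hardness property concrete, here is a small illustrative sketch (ours, not the paper's construction; sizes and variable names are hypothetical): under the uniform distribution on $\{\pm 1\}^n$, a parity over a subset $S$ of the coordinates is uncorrelated with every single input coordinate, so no linear combination of coordinates can approximate it, and a learner restricted to correlation queries, or to gradients that behave like correlation queries, receives essentially no usable signal.
```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 200_000                      # input bits, Monte Carlo samples
S = np.arange(8)                        # parity over the first 8 coordinates

X = rng.choice([-1.0, 1.0], size=(m, n))    # uniform inputs on {-1, +1}^n
target = np.prod(X[:, S], axis=1)           # parity chi_S(x) = prod_{i in S} x_i

# Correlation of the parity with each single coordinate is zero in expectation,
# so any linear combination of the coordinates is uncorrelated with the target.
coord_corr = np.abs(X.T @ target) / m
print("max |corr(target, x_j)| over j:", coord_corr.max())   # ~O(1/sqrt(m))

# The correlation with the "right" parity is 1: the signal exists,
# but no correlation query on individual coordinates reveals it.
print("corr(target, chi_S):", np.mean(target * target))
```
The per-coordinate correlations shrink at the Monte Carlo rate $1/\sqrt{m}$, while the correlation with the correct parity stays at 1.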
Related papers
- Hardness of Learning Fixed Parities with Neural Networks [33.84710024813985]
Learning parity functions is a canonical problem in learning theory.
For any fixed parity of some minimal size, training one-hidden-layer ReLU networks with it as the target function fails to produce anything meaningful.
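A minimal sketch of this phenomenon (an illustration under our own hypothetical sizes and hyperparameters, not the paper's construction or experiments): plain gradient descent on a one-hidden-layer ReLU network trained against a fixed parity target is expected to stay at chance-level test accuracy.
```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m, h = 20, 12, 4000, 128          # input bits, parity size, samples, hidden units
X = rng.choice([-1.0, 1.0], size=(m, n))
y = np.prod(X[:, :k], axis=1)           # fixed parity over the first k coordinates

# One-hidden-layer ReLU network trained with full-batch gradient descent on square loss.
W = rng.normal(0.0, 1.0 / np.sqrt(n), (n, h))
b = np.zeros(h)
v = rng.normal(0.0, 1.0 / np.sqrt(h), h)
lr = 0.05

for _ in range(2000):
    Z = X @ W + b                       # pre-activations, shape (m, h)
    A = np.maximum(Z, 0.0)              # ReLU
    err = A @ v - y                     # square-loss residuals
    gA = np.outer(err, v) * (Z > 0)     # backprop through output layer and ReLU
    v -= lr * (A.T @ err) / m
    W -= lr * (X.T @ gA) / m
    b -= lr * gA.mean(axis=0)

X_test = rng.choice([-1.0, 1.0], size=(2000, n))
y_test = np.prod(X_test[:, :k], axis=1)
pred = np.sign(np.maximum(X_test @ W + b, 0.0) @ v)
# For a parity of this size, test accuracy is expected to remain close to 0.5 (chance).
print("test accuracy:", np.mean(pred == y_test))
```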
arXiv Detail & Related papers (2025-01-01T12:04:06Z)
- Robustly Learning Single-Index Models via Alignment Sharpness [40.886706402941435]
We study the problem of learning Single-Index Models under the $L_2^2$ loss in the agnostic model.
We give an efficient learning algorithm, achieving a constant factor approximation to the optimal loss.
arXiv Detail & Related papers (2024-02-27T18:48:07Z)
- Learnability, Sample Complexity, and Hypothesis Class Complexity for Regression Models [10.66048003460524]
This work is inspired by the foundations of PAC learning and is motivated by existing issues in regression learning.
The proposed approach, denoted by epsilon-Confidence Approximately Correct (epsilon CoAC), utilizes the Kullback-Leibler divergence (relative entropy).
It enables the learner to compare hypothesis classes of different complexity orders and to choose among them the optimum with the minimum epsilon.
arXiv Detail & Related papers (2023-03-28T15:59:12Z)
- Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications.
We take a step by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach for handling streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
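For intuition about the setting (a toy sketch under our own assumptions, not the algorithms analyzed in the paper): in pairwise learning the loss at each step couples two examples, e.g. an AUC-style logistic loss that rewards ranking a positive instance above a negative one, and OGD simply updates the parameters on one such pair at a time.
```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 10, 5000
w_true = rng.normal(size=d)             # hypothetical ground-truth direction

def sample_example():
    x = rng.normal(size=d)
    y = 1.0 if x @ w_true > 0 else -1.0
    return x, y

# Online gradient descent for pairwise learning: each step receives a pair of
# examples and updates w on a pairwise logistic loss encouraging
# w^T x > w^T x' whenever y = +1 and y' = -1 (ranking / AUC-style objective).
w = np.zeros(d)
for t in range(1, T + 1):
    (x, y), (xp, yp) = sample_example(), sample_example()
    if y == yp:
        continue                                 # the pairwise loss only uses mixed-label pairs
    diff = (x - xp) if y > yp else (xp - x)      # points from the negative to the positive example
    margin = w @ diff
    grad = -diff / (1.0 + np.exp(margin))        # gradient of log(1 + exp(-margin))
    w -= grad / np.sqrt(t)                       # standard 1/sqrt(t) OGD step size

print("cosine similarity to the true direction:",
      w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true) + 1e-12))
```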
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
- A Boosting Approach to Reinforcement Learning [59.46285581748018]
We study efficient algorithms for reinforcement learning in decision processes whose complexity is independent of the number of states.
We give an efficient algorithm that is capable of improving the accuracy of such weak learning methods.
arXiv Detail & Related papers (2021-08-22T16:00:45Z)
- A Theory of Universal Learning [26.51949485387526]
We show that there are only three possible rates of universal learning.
We show that the learning curves of any given concept class decay at either an exponential, a linear, or an arbitrarily slow rate.
arXiv Detail & Related papers (2020-11-09T15:10:32Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrastive examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.