On the Rademacher Complexity of Linear Hypothesis Sets
- URL: http://arxiv.org/abs/2007.11045v1
- Date: Tue, 21 Jul 2020 19:08:21 GMT
- Title: On the Rademacher Complexity of Linear Hypothesis Sets
- Authors: Pranjal Awasthi, Natalie Frank, Mehryar Mohri
- Abstract summary: We present a tight analysis of the empirical Rademacher complexity of the family of linear hypothesis classes with weight vectors bounded in $\ell_p$-norm for any $p \geq 1$.
This provides a tight analysis of generalization using these hypothesis sets and helps derive sharp data-dependent learning guarantees.
- Score: 45.06091849856641
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linear predictors form a rich class of hypotheses used in a variety of
learning algorithms. We present a tight analysis of the empirical Rademacher
complexity of the family of linear hypothesis classes with weight vectors
bounded in $\ell_p$-norm for any $p \geq 1$. This provides a tight analysis of
generalization using these hypothesis sets and helps derive sharp
data-dependent learning guarantees. We give both upper and lower bounds on the
Rademacher complexity of these families and show that our bounds improve upon
or match existing bounds, which are known only for $1 \leq p \leq 2$.
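The quantity analyzed in the abstract has a standard dual-norm characterization: for the class $\{x \mapsto w \cdot x : \|w\|_p \leq \Lambda\}$, the empirical Rademacher complexity on a sample $x_1, \dots, x_m$ equals $\frac{\Lambda}{m} \mathbb{E}_\sigma\big[\big\|\sum_i \sigma_i x_i\big\|_q\big]$, where $q$ is the dual exponent ($1/p + 1/q = 1$). A minimal Monte Carlo sketch of this identity (the function name and defaults are illustrative, not from the paper):

```python
import numpy as np

def empirical_rademacher_linear(X, Lam=1.0, p=2.0, n_draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity of
    {x -> w.x : ||w||_p <= Lam} on the sample X (shape m x d).

    By norm duality, sup_{||w||_p <= Lam} w.(sum_i s_i x_i)
    = Lam * ||sum_i s_i x_i||_q with 1/p + 1/q = 1, so the complexity
    is (Lam/m) * E_s[ ||sum_i s_i x_i||_q ].
    """
    m, _ = X.shape
    q = np.inf if p == 1.0 else p / (p - 1.0)   # dual exponent
    rng = np.random.default_rng(seed)
    sigma = rng.choice([-1.0, 1.0], size=(n_draws, m))  # Rademacher signs
    sums = sigma @ X                                    # each row: sum_i s_i x_i
    norms = np.linalg.norm(sums, ord=q, axis=1)         # dual norm per draw
    return Lam / m * norms.mean()
```

For orthonormal data such as `X = np.eye(m)` with `p=2`, every sign draw has $\ell_2$-norm $\sqrt{m}$, so the estimate is exactly $1/\sqrt{m}$, matching the familiar $O(1/\sqrt{m})$ rate.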
Related papers
- Equivalence of the Empirical Risk Minimization to Regularization on the Family of f-Divergences [49.853843995972085]
The solution to empirical risk minimization with $f$-divergence regularization (ERM-$f$DR) is presented.
Examples of the solution for particular choices of the function $f$ are presented.
arXiv Detail & Related papers (2024-02-01T11:12:00Z)
- Adversarial Contextual Bandits Go Kernelized [21.007410990554522]
We study a generalization of the problem of online learning in adversarial linear contextual bandits by incorporating loss functions that belong to a Hilbert kernel space.
We propose a new optimistically biased estimator for the loss functions and prove near-optimal regret guarantees.
arXiv Detail & Related papers (2023-10-02T19:59:39Z)
- Prediction, Learning, Uniform Convergence, and Scale-sensitive Dimensions [39.97534972432276]
We present a new general-purpose algorithm for learning classes of $[0,1]$-valued functions.
We prove a general upper bound on the expected absolute error of this algorithm.
We show how to apply both packing bounds to obtain improved general bounds on the sample complexity of learning.
arXiv Detail & Related papers (2023-04-21T15:51:35Z)
- Generalization Analysis for Contrastive Representation Learning [80.89690821916653]
Existing generalization error bounds depend linearly on the number $k$ of negative examples.
We establish novel generalization bounds for contrastive learning which do not depend on $k$, up to logarithmic terms.
arXiv Detail & Related papers (2023-02-24T01:03:56Z)
- Robust Linear Predictions: Analyses of Uniform Concentration, Fast Rates and Model Misspecification [16.0817847880416]
We offer a unified framework that includes a broad variety of linear prediction problems on a Hilbert space.
We show that for misspecification level $\epsilon$, these estimators achieve an error rate of $O\left(\max\left\{|\mathcal{O}|^{1/2} n^{-1/2},\, |\mathcal{I}|^{1/2} n^{-1}\right\} + \epsilon\right)$, matching the best-known rates in the literature.
arXiv Detail & Related papers (2022-01-06T08:51:08Z)
- Localization, Convexity, and Star Aggregation [0.0]
Offset Rademacher complexities have been shown to imply sharp, linear-dependent upper bounds for the square loss.
We show that in the statistical setting, the offset bound can be generalized to any loss satisfying certain uniform convexity.
arXiv Detail & Related papers (2021-05-19T00:47:59Z)
- Bilinear Classes: A Structural Framework for Provable Generalization in RL [119.42509700822484]
Bilinear Classes is a new structural framework which permits generalization in reinforcement learning.
The framework incorporates nearly all existing models in which sample-efficient learning is achievable.
Our main result provides an RL algorithm with sample-complexity guarantees for Bilinear Classes.
arXiv Detail & Related papers (2021-03-19T16:34:20Z)
- Relative Deviation Margin Bounds [55.22251993239944]
We give two types of learning bounds, both distribution-dependent and valid for general families, in terms of the Rademacher complexity.
We derive distribution-dependent generalization bounds for unbounded loss functions under the assumption of a finite moment.
arXiv Detail & Related papers (2020-06-26T12:37:17Z)
- Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks [45.06091849856641]
We give upper and lower bounds for the adversarial empirical Rademacher complexity of linear hypotheses.
We extend our analysis to provide Rademacher complexity lower and upper bounds for a single ReLU unit.
Finally, we give adversarial Rademacher complexity bounds for feed-forward neural networks with one hidden layer.
arXiv Detail & Related papers (2020-04-28T15:55:16Z)
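The adversarial analysis for linear hypotheses above rests on another dual-norm identity: the worst-case margin of $x \mapsto w \cdot x$ under an $\ell_p$-bounded perturbation of radius $r$ is $\min_{\|\delta\|_p \leq r} y\, w \cdot (x + \delta) = y\, w \cdot x - r \|w\|_q$, with $q$ dual to $p$. A small illustrative sketch (the helper is hypothetical, not code from the paper):

```python
import numpy as np

def worst_case_margin(w, x, y, r, p=np.inf):
    """Margin y * w.x' of a linear predictor at the worst l_p perturbation
    x' of x with ||x' - x||_p <= r.  By duality this equals
    y * w.x - r * ||w||_q, where 1/p + 1/q = 1.
    """
    if p == np.inf:
        q = 1.0            # dual of l_inf is l_1
    elif p == 1.0:
        q = np.inf         # dual of l_1 is l_inf
    else:
        q = p / (p - 1.0)
    return y * (w @ x) - r * np.linalg.norm(w, ord=q)
```

For example, with `w = [3, 4]`, `x = [1, 0]`, `y = 1`, an $\ell_2$ perturbation of radius $0.5$ reduces the clean margin $3$ by $0.5 \cdot \|w\|_2 = 2.5$, leaving $0.5$.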
This list is automatically generated from the titles and abstracts of the papers in this site.