Lean Formalization of Generalization Error Bound by Rademacher Complexity
- URL: http://arxiv.org/abs/2503.19605v2
- Date: Tue, 01 Apr 2025 02:26:31 GMT
- Title: Lean Formalization of Generalization Error Bound by Rademacher Complexity
- Authors: Sho Sonoda, Kazumi Kasaura, Yuma Mizuno, Kei Tsukamoto, Naoto Onda,
- Abstract summary: Generalization error quantifies the gap between a learning machine's performance on given training data and its performance on unseen test data. We formalize the generalization error bound using Rademacher complexity in the Lean 4 theorem prover.
- Score: 5.071436325424837
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We formalize the generalization error bound using Rademacher complexity in the Lean 4 theorem prover. Generalization error quantifies the gap between a learning machine's performance on given training data and its performance on unseen test data, and Rademacher complexity serves as an estimate of this error based on the complexity of the learning machines, i.e., the hypothesis class. Unlike traditional methods such as PAC learning and VC dimension, Rademacher complexity is applicable across diverse machine learning scenarios, including deep learning and kernel methods. We formalize key concepts and theorems, including the empirical and population Rademacher complexities, and establish generalization error bounds through formal proofs of McDiarmid's inequality, Hoeffding's lemma, and symmetrization arguments.
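As a reference point for what is being formalized, the standard textbook definitions and the resulting bound can be stated as follows. This is a sketch of the usual informal statement (for a loss class with values in $[0,1]$), not necessarily the exact form proved in the paper:

```latex
% Empirical Rademacher complexity of a class F on a sample S = (z_1, ..., z_n),
% where the sigma_i are i.i.d. uniform on {-1, +1}:
\hat{\mathfrak{R}}_S(\mathcal{F})
  = \mathbb{E}_{\sigma}\!\left[\sup_{f \in \mathcal{F}}
      \frac{1}{n}\sum_{i=1}^{n} \sigma_i f(z_i)\right]

% Population Rademacher complexity: expectation over i.i.d. samples S of size n:
\mathfrak{R}_n(\mathcal{F}) = \mathbb{E}_{S}\!\left[\hat{\mathfrak{R}}_S(\mathcal{F})\right]

% Generalization error bound: for all f in F with values in [0,1],
% with probability at least 1 - delta over the draw of S,
\mathbb{E}[f(z)] \;\le\; \frac{1}{n}\sum_{i=1}^{n} f(z_i)
  \;+\; 2\,\mathfrak{R}_n(\mathcal{F})
  \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}}
```

The McDiarmid, Hoeffding, and symmetrization results mentioned in the abstract are the standard ingredients of this bound: symmetrization introduces the Rademacher complexity term, while McDiarmid's inequality (proved via Hoeffding's lemma) supplies the $\sqrt{\ln(1/\delta)/(2n)}$ deviation term.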
Related papers
- Do we really need the Rademacher complexities? [3.683202928838613]
We study the fundamental problem of learning with respect to the squared loss in a convex class. We prove that, contrary to prevailing belief and under minimal assumptions, the sample complexity is not governed by the Rademacher complexities.
arXiv Detail & Related papers (2025-02-21T00:40:22Z)
- MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs [80.96119560172224]
MathGAP generates problem statements and chain-of-thought reasoning traces according to specifications about their arithmetic proof structure. Using MathGAP, we find that LLMs show a significant decrease in performance as proofs get deeper and wider.
arXiv Detail & Related papers (2024-10-17T12:48:14Z)
- Class-wise Generalization Error: an Information-Theoretic Analysis [22.877440350595222]
We study the class-generalization error, which quantifies the generalization performance of each individual class.
We empirically validate our proposed bounds in different neural networks and show that they accurately capture the complex class-generalization error behavior.
arXiv Detail & Related papers (2024-01-05T17:05:14Z)
- Generalization Guarantees via Algorithm-dependent Rademacher Complexity [33.408679482592426]
We propose a measure to control generalization error, which is the empirical Rademacher complexity of an algorithm- and data-dependent hypothesis class.
We obtain novel bounds based on the finite fractal dimension, which (a) extend previous fractal-type bounds from continuous to finite hypothesis classes, and (b) avoid a mutual information term.
arXiv Detail & Related papers (2023-07-04T18:37:38Z)
- The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning [80.1018596899899]
We argue that neural network models share a preference for simple, low-complexity solutions, formalized using Kolmogorov complexity.
Our experiments show that pre-trained and even randomly initialized language models prefer to generate low-complexity sequences.
These observations justify the trend in deep learning of unifying seemingly disparate problems with an increasingly small set of machine learning models.
arXiv Detail & Related papers (2023-04-11T17:22:22Z)
- Generalization Analysis for Contrastive Representation Learning [80.89690821916653]
Existing generalization error bounds depend linearly on the number $k$ of negative examples.
We establish novel generalization bounds for contrastive learning which do not depend on $k$, up to logarithmic terms.
arXiv Detail & Related papers (2023-02-24T01:03:56Z)
- Partial Counterfactual Identification from Observational and Experimental Data [83.798237968683]
We develop effective Monte Carlo algorithms to approximate the optimal bounds from an arbitrary combination of observational and experimental data.
Our algorithms are validated extensively on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-10-12T02:21:30Z)
- High-dimensional separability for one- and few-shot learning [58.8599521537]
This work is driven by a practical question: the correction of Artificial Intelligence (AI) errors.
Special external devices, called correctors, are developed. They should provide a quick, non-iterative fix without modifying the legacy AI system.
New multi-correctors of AI systems are presented and illustrated with examples of predicting errors and learning new classes of objects by a deep convolutional neural network.
arXiv Detail & Related papers (2021-06-28T14:58:14Z)
- Parsimonious Inference [0.0]
Parsimonious inference is an information-theoretic formulation of inference over arbitrary architectures.
Our approaches combine efficient encodings with prudent sampling strategies to construct predictive ensembles without cross-validation.
arXiv Detail & Related papers (2021-03-03T04:13:14Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.