Which Algorithms Have Tight Generalization Bounds?
- URL: http://arxiv.org/abs/2410.01969v1
- Date: Wed, 2 Oct 2024 19:24:57 GMT
- Title: Which Algorithms Have Tight Generalization Bounds?
- Authors: Michael Gastpar, Ido Nachum, Jonathan Shafer, Thomas Weinberger
- Abstract summary: We study which machine learning algorithms have tight generalization bounds.
We show that algorithms that have certain inductive biases that cause them to be unstable do not admit tight generalization bounds.
We conclude with a simple characterization that relates the existence of tight generalization bounds to the conditional variance of the algorithm's loss.
- Score: 13.364505088105508
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study which machine learning algorithms have tight generalization bounds. First, we present conditions that preclude the existence of tight generalization bounds. Specifically, we show that algorithms that have certain inductive biases that cause them to be unstable do not admit tight generalization bounds. Next, we show that algorithms that are sufficiently stable do have tight generalization bounds. We conclude with a simple characterization that relates the existence of tight generalization bounds to the conditional variance of the algorithm's loss.
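For reference, here is a minimal LaTeX sketch of the standard objects the abstract refers to (population and empirical loss, tightness of a bound, conditional variance of the loss); the notation is generic, and the precise definitions and conditioning are the paper's own.

```latex
% Generic notation for the objects named in the abstract; not the paper's
% own definitions, which should be taken from the paper itself.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Let $A$ be a learning algorithm, $S=(Z_1,\dots,Z_n)\sim\mathcal{D}^n$ a sample,
and $\ell$ a bounded loss. The population and empirical losses of $A(S)$ are
\[
  L_{\mathcal{D}}(A(S)) = \mathbb{E}_{Z\sim\mathcal{D}}\big[\ell(A(S),Z)\big],
  \qquad
  \widehat{L}_S(A(S)) = \frac{1}{n}\sum_{i=1}^{n}\ell(A(S),Z_i).
\]
A generalization bound is a computable quantity $B(S)$ satisfying
$L_{\mathcal{D}}(A(S))\le B(S)$ with high probability, and it is \emph{tight}
when the slack $B(S)-L_{\mathcal{D}}(A(S))$ is small. The characterization in
the abstract is stated via a conditional variance of the algorithm's loss,
i.e.\ a quantity of the form
$\operatorname{Var}\big(\ell(A(S),Z)\mid\cdot\big)$, whose exact conditioning
is specified in the paper.
\end{document}
```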
Related papers
- Bregman-divergence-based Arimoto-Blahut algorithm [53.64687146666141]
We generalize the Arimoto-Blahut algorithm to a general function defined over a Bregman-divergence system.
We propose a convex-optimization-free algorithm that can be applied to classical and quantum rate-distortion theory.
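For context, the Bregman divergence mentioned above has the standard definition sketched below; the paper's specific generator and objective are not reproduced here.

```latex
% Standard definition of a Bregman divergence; the generator used by the
% paper is not reproduced here.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
For a differentiable, strictly convex generator $\phi$ on a convex domain,
the Bregman divergence between $x$ and $y$ is
\[
  D_\phi(x,y) = \phi(x) - \phi(y) - \langle \nabla\phi(y),\, x-y\rangle .
\]
Choosing $\phi$ as the negative entropy recovers (on the probability simplex)
the Kullback--Leibler divergence, the case underlying the classical
Arimoto--Blahut algorithm.
\end{document}
```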
arXiv Detail & Related papers (2024-08-10T06:16:24Z)
- Evaluating Genetic Algorithms through the Approximability Hierarchy [55.938644481736446]
In this paper, we analyze the usefulness of using genetic algorithms depending on the approximation class the problem belongs to.
In particular, we use the standard approximability hierarchy, showing that genetic algorithms are especially useful for the most pessimistic classes of the hierarchy.
arXiv Detail & Related papers (2024-02-01T09:18:34Z)
- When can you trust feature selection? -- I: A condition-based analysis of LASSO and generalised hardness of approximation [49.1574468325115]
We show how no (randomised) algorithm can determine the correct support sets (with probability $> 1/2$) of minimisers of LASSO when reading approximate input.
For ill-posed inputs, the algorithm runs forever and hence never produces a wrong answer.
For any algorithm defined on an open set containing a point with infinite condition number, there is an input for which the algorithm will either run forever or produce a wrong answer.
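For reference, the LASSO problem and the support set referred to above are the standard objects below; $\lambda$ is a generic regularization parameter, not a value from the paper.

```latex
% Standard form of the LASSO problem and of the support set whose recovery
% the summary refers to; lambda is a generic regularization parameter.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Given $A\in\mathbb{R}^{m\times N}$, $b\in\mathbb{R}^{m}$ and $\lambda>0$, LASSO is
\[
  \min_{x\in\mathbb{R}^{N}}\ \tfrac{1}{2}\lVert Ax-b\rVert_2^2 + \lambda\lVert x\rVert_1 .
\]
The support set of a minimiser $x^\star$ is
$\operatorname{supp}(x^\star)=\{\,i : x^\star_i\neq 0\,\}$; the hardness result
above concerns determining this set when $A$ and $b$ are only available
through approximations.
\end{document}
```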
arXiv Detail & Related papers (2023-12-18T18:29:01Z)
- Fantastic Generalization Measures are Nowhere to be Found [14.599761709255917]
We study the notion of a generalization bound being uniformly tight, meaning that the difference between the bound and the population loss is small.
Numerous generalization bounds have been proposed in the literature as potential explanations for the ability of neural networks to generalize.
arXiv Detail & Related papers (2023-09-24T14:53:51Z)
- Invex Programs: First Order Algorithms and Their Convergence [66.40124280146863]
Invex programs are a special kind of non-convex problem that attains global minima at every stationary point.
We propose new first-order algorithms with provable convergence rates for general invex programs.
Our proposed algorithm is the first algorithm to solve constrained invex programs.
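For reference, a common textbook definition of invexity is sketched below; the exact problem class and assumptions in the paper may differ.

```latex
% A common definition of an invex function; the paper's precise assumptions
% may differ.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
A differentiable function $f:\mathbb{R}^n\to\mathbb{R}$ is \emph{invex} if there
exists a map $\eta:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}^n$ such that
\[
  f(x) - f(y) \ge \big\langle \nabla f(y),\, \eta(x,y)\big\rangle
  \quad \text{for all } x, y .
\]
Equivalently, $f$ is invex if and only if every stationary point
($\nabla f(y)=0$) is a global minimiser, which is the property highlighted in
the summary above.
\end{document}
```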
arXiv Detail & Related papers (2023-07-10T10:11:01Z)
- Robust and Space-Efficient Dual Adversary Quantum Query Algorithms [0.0]
The general adversary dual is a powerful tool in quantum computing.
It gives a query-optimal bounded-error quantum algorithm for deciding any Boolean function.
We develop a robust dual adversary algorithm that can handle approximately satisfied constraints.
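The general adversary bound referred to above is commonly written as the optimization below; this is the textbook form, not the particular dual formulation the paper works with.

```latex
% Common statement of the (negative-weight) general adversary bound for a
% Boolean function f; the paper works with a dual of this quantity.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
For $f:\{0,1\}^n\to\{0,1\}$ (possibly partial), the general adversary bound is
\[
  \mathrm{ADV}^{\pm}(f) = \max_{\Gamma}\,
    \frac{\lVert \Gamma\rVert}{\max_{i\in[n]} \lVert \Gamma\circ\Delta_i\rVert},
\]
where $\Gamma$ ranges over nonzero real symmetric matrices indexed by inputs
with $\Gamma[x,y]=0$ whenever $f(x)=f(y)$, $\Delta_i[x,y]=1$ iff $x_i\neq y_i$,
and $\circ$ is the entrywise product. Up to constant factors it equals the
bounded-error quantum query complexity of $f$, which is the optimality the
summary refers to.
\end{document}
```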
arXiv Detail & Related papers (2023-06-26T19:59:30Z)
- The Adversary Bound Revisited: From Optimal Query Algorithms to Optimal Control [0.0]
This note complements the paper "One-Way Ticket to Las Vegas and the Quantum Adversary".
I develop the duality between the adversary bound and universal algorithms from that paper in a different form, using the same perspective as Barnum-Saks-Szegedy.
arXiv Detail & Related papers (2022-11-29T15:25:45Z)
- Tighter Generalization Bounds for Iterative Differentially Private Learning Algorithms [95.73230376153872]
This paper studies the relationship between generalization and privacy preservation in iterative learning algorithms in two sequential steps.
We prove that $(\varepsilon, \delta)$-differential privacy implies an on-average generalization bound for multi-database learning algorithms.
We then investigate how the iterative nature shared by most learning algorithms influences privacy preservation and, in turn, generalization.
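For reference, the standard definition of $(\varepsilon,\delta)$-differential privacy is given below; the on-average generalization bound derived from it is the paper's result and is not restated here.

```latex
% Standard definition of (epsilon, delta)-differential privacy; the paper's
% on-average generalization bound is not restated here.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
A randomized algorithm $M$ is $(\varepsilon,\delta)$-differentially private if,
for all datasets $S, S'$ differing in a single record and all measurable
events $T$,
\[
  \Pr\big[M(S)\in T\big] \le e^{\varepsilon}\,\Pr\big[M(S')\in T\big] + \delta .
\]
Intuitively, the output distribution is insensitive to any single training
example, which is the stability property that privacy-to-generalization
arguments exploit.
\end{document}
```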
arXiv Detail & Related papers (2020-07-18T09:12:03Z)
- The limits of min-max optimization algorithms: convergence to spurious non-critical sets [82.74514886461257]
Compared to ordinary minimization, min-max optimization algorithms encounter far greater difficulties because of the existence of periodic cycles and similar phenomena.
We show that these algorithms can converge to spurious attracting sets that contain no critical points of the problem.
We illustrate such challenges in simple almost bilinear games.
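The setting above is the standard min-max problem solved with simultaneous gradient steps; a generic statement follows, with step size $\eta$ as a placeholder rather than a value from the paper.

```latex
% Generic min-max problem and the simultaneous gradient descent-ascent
% update; eta is a placeholder step size.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
Min-max optimization seeks
\[
  \min_{x}\,\max_{y}\, f(x,y),
\]
and a basic first-order method is simultaneous gradient descent-ascent:
\[
  x_{t+1} = x_t - \eta\,\nabla_x f(x_t,y_t), \qquad
  y_{t+1} = y_t + \eta\,\nabla_y f(x_t,y_t).
\]
Unlike plain minimization, the associated vector field need not be a gradient
field, which is what allows cycles and the spurious attracting sets discussed
above.
\end{document}
```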
arXiv Detail & Related papers (2020-06-16T10:49:27Z)
- Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study [43.586269524826854]
Implicit regularization refers to the tendency of the optimization algorithm to converge towards a certain structured solution that often generalizes well.
We provide a simple construction that rules out the existence of a distribution-independent implicit regularizer that governs the generalization ability of Stochastic Gradient Descent (SGD).
We then demonstrate a learning problem that rules out a very general class of distribution-dependent implicit regularizers from explaining generalization, including strongly convex regularizers as well as non-degenerate norm-based regularizers.
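As a standard illustration of what an implicit regularizer looks like (this is textbook material, not the paper's construction), gradient descent on an underdetermined least-squares problem implicitly selects the minimum-norm interpolant:

```latex
% Classical illustration of implicit regularization (not the paper's
% construction): gradient descent on underdetermined least squares.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Consider $\min_{w}\ \tfrac{1}{2}\lVert Xw-y\rVert_2^2$ with more parameters
than equations, so that many interpolating solutions exist. Gradient descent
started at $w_0=0$ keeps every iterate in the row span of $X$ and therefore
converges to the minimum $\ell_2$-norm interpolant
\[
  \hat{w} = \arg\min\big\{\lVert w\rVert_2 : Xw=y\big\}.
\]
The results summarized above show that, for stochastic convex optimization in
general, no comparably simple regularizer can account for the generalization
of SGD.
\end{document}
```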
arXiv Detail & Related papers (2020-03-13T08:40:33Z)