Characterizing Online and Private Learnability under Distributional Constraints via Generalized Smoothness
- URL: http://arxiv.org/abs/2602.20585v1
- Date: Tue, 24 Feb 2026 06:15:59 GMT
- Title: Characterizing Online and Private Learnability under Distributional Constraints via Generalized Smoothness
- Authors: Moïse Blanchard, Abhishek Shetty, Alexander Rakhlin,
- Abstract summary: We study sequential decision making under distributional adversaries that can adaptively choose data-generating distributions from a fixed family $U$.<n>We provide a near complete characterization of families $U$ that admit learnability in terms of a notion known as generalized smoothness.<n>We show that the generalized smoothness also characterizes private learnability under distributional constraints.
- Score: 63.833913892018536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding minimal assumptions that enable learning and generalization is perhaps the central question of learning theory. Several celebrated results in statistical learning theory, such as the VC theorem and Littlestone's characterization of online learnability, establish conditions on the hypothesis class that allow for learning under independent data and adversarial data, respectively. Building upon recent work bridging these extremes, we study sequential decision making under distributional adversaries that can adaptively choose data-generating distributions from a fixed family $U$ and ask when such problems are learnable with sample complexity that behaves like the favorable independent case. We provide a near complete characterization of families $U$ that admit learnability in terms of a notion known as generalized smoothness i.e. a distribution family admits VC-dimension-dependent regret bounds for every finite-VC hypothesis class if and only if it is generalized smooth. Further, we give universal algorithms that achieve low regret under any generalized smooth adversary without explicit knowledge of $U$. Finally, when $U$ is known, we provide refined bounds in terms of a combinatorial parameter, the fragmentation number, that captures how many disjoint regions can carry nontrivial mass under $U$. These results provide a nearly complete understanding of learnability under distributional adversaries. In addition, building upon the surprising connection between online learning and differential privacy, we show that the generalized smoothness also characterizes private learnability under distributional constraints.
Related papers
- Distribution-Free Sequential Prediction with Abstentions [7.110627876507878]
We study a sequential prediction problem in which an adversary is allowed to inject arbitrarily many adversarial instances in a stream of i.i.d. instances.<n>At each round, the learner may also emphabstain from making a prediction without incurring any penalty if the instance was indeed corrupted.<n>We propose an textscAbstainBoost based on a boosting procedure of weak learners, which guarantees sublinear error for general VC classes.
arXiv Detail & Related papers (2026-02-20T00:28:27Z) - Distributionally-Constrained Adversaries in Online Learning [7.903539618132857]
We consider the more general and flexible framework of distributionally constrained adversaries in which instances are drawn from distributions chosen by an adversary.<n>We give a characterization for which distribution classes are learnable in this context against both oblivious and adaptive adversaries.
arXiv Detail & Related papers (2025-06-12T02:11:10Z) - Proper Learnability and the Role of Unlabeled Data [10.168670899305232]
We show that there are problems whose proper learnability is logically undecidable, i.e., independent of the ZFC axioms.<n>We then show all impossibility results which obstruct any characterization of proper learnability in the realizable PAC model.
arXiv Detail & Related papers (2025-02-14T18:41:53Z) - Symmetric Equilibrium Learning of VAEs [56.56929742714685]
We view variational autoencoders (VAEs) as decoder-encoder pairs, which map distributions in the data space to distributions in the latent space and vice versa.
We propose a Nash equilibrium learning approach, which is symmetric with respect to the encoder and decoder and allows learning VAEs in situations where both the data and the latent distributions are accessible only by sampling.
arXiv Detail & Related papers (2023-07-19T10:27:34Z) - The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning [80.1018596899899]
We argue that neural network models share this same preference, formalized using Kolmogorov complexity.
Our experiments show that pre-trained and even randomly language models prefer to generate low-complexity sequences.
These observations justify the trend in deep learning of unifying seemingly disparate problems with an increasingly small set of machine learning models.
arXiv Detail & Related papers (2023-04-11T17:22:22Z) - Generalization Analysis for Contrastive Representation Learning [80.89690821916653]
Existing generalization error bounds depend linearly on the number $k$ of negative examples.
We establish novel generalization bounds for contrastive learning which do not depend on $k$, up to logarithmic terms.
arXiv Detail & Related papers (2023-02-24T01:03:56Z) - Differentially-Private Bayes Consistency [70.92545332158217]
We construct a Bayes consistent learning rule that satisfies differential privacy (DP)
We prove that any VC class can be privately learned in a semi-supervised setting with a near-optimal sample complexity.
arXiv Detail & Related papers (2022-12-08T11:57:30Z) - Learning versus Refutation in Noninteractive Local Differential Privacy [133.80204506727526]
We study two basic statistical tasks in non-interactive local differential privacy (LDP): learning and refutation.
Our main result is a complete characterization of the sample complexity of PAC learning for non-interactive LDP protocols.
arXiv Detail & Related papers (2022-10-26T03:19:24Z) - Relative Deviation Margin Bounds [55.22251993239944]
We give two types of learning bounds, both distribution-dependent and valid for general families, in terms of the Rademacher complexity.
We derive distribution-dependent generalization bounds for unbounded loss functions under the assumption of a finite moment.
arXiv Detail & Related papers (2020-06-26T12:37:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.