The unstable formula theorem revisited via algorithms
- URL: http://arxiv.org/abs/2212.05050v3
- Date: Wed, 02 Jul 2025 22:11:09 GMT
- Title: The unstable formula theorem revisited via algorithms
- Authors: Maryanthe Malliaris, Shay Moran
- Abstract summary: In response to gaps in existing learning models, we introduce a new statistical learning model, called ``Probably Eventually Correct'' or PEC. We characterize Littlestone (stable) classes in terms of this model. In order to obtain a characterization of Littlestone classes in terms of frequent definitions, we build an equivalence theorem highlighting what is common to many existing approximation algorithms.
- Score: 18.557423328068122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper is about the surprising interaction of a foundational result from model theory, about stability of theories, with algorithmic stability in learning. First, in response to gaps in existing learning models, we introduce a new statistical learning model, called ``Probably Eventually Correct'' or PEC. We characterize Littlestone (stable) classes in terms of this model. As a corollary, Littlestone classes have frequent short definitions in a natural statistical sense. In order to obtain a characterization of Littlestone classes in terms of frequent definitions, we build an equivalence theorem highlighting what is common to many existing approximation algorithms, and to the new PEC. This is guided by an analogy to definability of types in model theory, but has its own character. Drawing on these theorems and on other recent work, we present a complete algorithmic analogue of Shelah's celebrated Unstable Formula Theorem, with algorithmic properties taking the place of the infinite.
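A quick illustration of the central combinatorial quantity (not from the paper): for a small finite class, Littlestone dimension can be computed by the standard recursion over mistake trees. The sketch below, including the `ldim` helper and the threshold example, is an assumption of this note, not the paper's code.

```python
def ldim(H):
    """Littlestone dimension of a finite class H (illustrative sketch).

    Each hypothesis is a tuple of 0/1 labels indexed by the points of a
    finite domain.  Ldim(H) is the depth of the deepest complete binary
    mistake tree shattered by H, i.e. the optimal mistake bound for
    online learning of H.
    """
    if len(H) <= 1:
        return len(H) - 1              # empty class: -1; one hypothesis: 0
    n = len(next(iter(H)))
    best = 0
    for x in range(n):                 # try each domain point as the root query
        H0 = frozenset(h for h in H if h[x] == 0)
        H1 = frozenset(h for h in H if h[x] == 1)
        if H0 and H1:                  # x genuinely splits the class
            best = max(best, 1 + min(ldim(H0), ldim(H1)))
    return best

# Thresholds on a 4-point domain: VC dimension 1, Littlestone dimension 2,
# illustrating that Ldim grows with the log of the domain size.
thresholds = frozenset(tuple(1 if i >= t else 0 for i in range(4)) for t in range(5))
print(ldim(thresholds))  # -> 2
```

The recursion mirrors the online game: a point that splits the class lets an adversary force a mistake and continue on whichever half survives, which is why Ldim equals the optimal mistake bound.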
Related papers
- A Mean-Field Theory of $\Theta$-Expectations [2.1756081703276]
We develop a new class of calculus for such non-linear models.
The $\Theta$-expectation is shown to be consistent with the axiom of subadditivity.
arXiv Detail & Related papers (2025-07-30T11:08:56Z) - A Theory of $\theta$-Expectations [2.1756081703276]
We develop a framework for a class of differential equations where the driver is a pointwise geometry.
The system's tractability is predicated on the global existence of a unique and globally Lipschitz maximizer map for the driver function.
arXiv Detail & Related papers (2025-07-27T16:56:01Z) - Notes on solvable models of many-body quantum chaos [15.617052284991203]
We study a class of many-body chaotic models related to the Brownian Sachdev-Ye-Kitaev model.
An emergent symmetry maps the quantum dynamics into a classical process.
arXiv Detail & Related papers (2024-08-20T18:24:52Z) - A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variations of the stochastic proximal point method (SPPM).
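As a hedged illustration of the object under study (not the paper's code): for a least-squares loss the proximal step has a closed form, so one SPPM iteration fits in a few lines. The function name, data model, and step size below are assumptions of this note.

```python
import numpy as np

def sppm_step(x, a_i, b_i, gamma):
    """One stochastic proximal point step for f_i(x) = 0.5*(a_i @ x - b_i)**2.

    Solves x+ = argmin_z f_i(z) + ||z - x||^2 / (2*gamma), which for this
    quadratic loss has the closed form below (an illustrative sketch).
    """
    residual = a_i @ x - b_i
    return x - gamma * residual / (1.0 + gamma * a_i @ a_i) * a_i

# Toy run on a random least-squares problem (illustrative only).
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
x = np.zeros(5)
for _ in range(2000):
    i = rng.integers(50)
    x = sppm_step(x, A[i], b[i], gamma=0.5)
print(np.linalg.norm(A @ x - b))  # residual shrinks toward the least-squares fit
```

Unlike a plain gradient step, this update remains stable for any positive step size, which reflects the robustness to imperfect tuning mentioned above.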
arXiv Detail & Related papers (2024-05-24T21:09:19Z) - Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks [50.29356570858905]
We introduce the Dynamical Systems Framework (DSF), which allows a principled investigation of all these architectures in a common representation.
We provide principled comparisons between softmax attention and other model classes, discussing the theoretical conditions under which softmax attention can be approximated.
This shows the DSF's potential to guide the systematic development of future, more efficient and scalable foundation models.
arXiv Detail & Related papers (2024-05-24T17:19:57Z) - Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimensional modelling.
We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z) - Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z) - Distributed Bayesian Learning of Dynamic States [65.7870637855531]
The proposed algorithm performs distributed Bayesian filtering for finite-state hidden Markov models.
It can be used for sequential state estimation, as well as for modeling opinion formation over social networks under dynamic environments.
arXiv Detail & Related papers (2022-12-05T19:40:17Z) - Oracle Inequalities for Model Selection in Offline Reinforcement Learning [105.74139523696284]
We study the problem of model selection in offline RL with value function approximation.
We propose the first model selection algorithm for offline RL that achieves minimax rate-optimal oracle inequalities up to logarithmic factors.
We conclude with several numerical simulations showing it is capable of reliably selecting a good model class.
arXiv Detail & Related papers (2022-11-03T17:32:34Z) - Realizable Learning is All You Need [21.34668631009594]
The equivalence of realizable and agnostic learnability is a fundamental phenomenon in learning theory.
We give the first model-independent framework explaining the equivalence of realizable and agnostic learnability.
arXiv Detail & Related papers (2021-11-08T19:00:00Z) - Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z) - Agnostic Online Learning and Excellent Sets [18.557423328068122]
We show that $\epsilon$-excellent sets exist for any $\epsilon < \frac{1}{2}$ in $k$-edge stable graphs in the sense of model theory.
We also give a version of the dynamic Sauer-Shelah-Perles lemma appropriate to this setting, related to definability of types.
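The paper's $\epsilon$-excellent sets are defined recursively; a simpler related condition from the stable regularity literature, often called $\epsilon$-good (every vertex's neighborhood covers almost none or almost all of the set), is easy to check directly. The sketch below checks only this weaker condition; the function name, graph encoding, and example are assumptions of this note, not the paper's definitions.

```python
def is_eps_good(A, adj, vertices, eps):
    """Check an 'eps-good'-style condition on a vertex set A (illustrative).

    For every vertex b, the fraction of A lying in b's neighborhood must be
    either below eps or above 1 - eps.  The eps-excellent sets of the paper
    satisfy a stronger, recursively defined variant of this condition.
    """
    A = set(A)
    if not A:
        return True
    for b in vertices:
        frac = len(A & adj[b]) / len(A)
        if eps <= frac <= 1 - eps:
            return False                # b sees a 'mixed' proportion of A
    return True

# Toy usage on a 5-cycle (names and epsilon are illustrative).
adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(is_eps_good({0}, adj, range(5), eps=1/3))        # True: a singleton is trivially good
print(is_eps_good({0, 1, 2}, adj, range(5), eps=1/3))  # False: vertex 0 sees 1/3 of the set
```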
arXiv Detail & Related papers (2021-08-12T07:33:33Z) - Proof of the Contiguity Conjecture and Lognormal Limit for the Symmetric Perceptron [21.356438315715888]
We consider the symmetric binary perceptron model, a simple model of neural networks.
We establish several conjectures for this model.
Our proof technique relies on a dense counterpart of the small graph conditioning method.
arXiv Detail & Related papers (2021-02-25T18:39:08Z) - Why Adversarial Interaction Creates Non-Homogeneous Patterns: A Pseudo-Reaction-Diffusion Model for Turing Instability [10.933825676518195]
We observe Turing-like patterns in a system of neurons with adversarial interaction.
We present a pseudo-reaction-diffusion model to explain the mechanism that may underlie these phenomena.
arXiv Detail & Related papers (2020-10-01T16:09:22Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
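The paper computes this distribution precisely; a naive Monte Carlo version of the same question (sample random classifiers, keep the interpolating ones, look at their test errors) is easy to sketch. Everything below, from the Gaussian teacher data model to the rejection sampling, is an assumption of this note rather than the paper's methodology.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup (assumed, not the paper's): a fixed teacher labels Gaussian
# inputs, and the dimension is large enough relative to the training set
# that many random linear classifiers interpolate (zero training error).
d, n_train, n_test, n_samples = 30, 10, 2000, 200_000
teacher = rng.normal(size=d)
X_tr, X_te = rng.normal(size=(n_train, d)), rng.normal(size=(n_test, d))
y_tr, y_te = np.sign(X_tr @ teacher), np.sign(X_te @ teacher)

# Rejection sampling: draw random directions, keep those with zero training error.
W = rng.normal(size=(n_samples, d))
interpolating = np.all(np.sign(X_tr @ W.T) == y_tr[:, None], axis=0)
W_int = W[interpolating]

# Distribution of test errors among the interpolating classifiers.
test_errors = np.mean(np.sign(X_te @ W_int.T) != y_te[:, None], axis=0)
print(len(W_int), test_errors.mean(), test_errors.max())
```

Even this crude sampler exhibits the concentration effect: typical interpolators generalize far better than the worst one.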
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Learning CHARME models with neural networks [1.5362025549031046]
We consider a model called CHARME (Conditional Heteroscedastic Autoregressive Mixture of Experts).
As an application, we develop a learning theory for the NN-based autoregressive functions of the model.
arXiv Detail & Related papers (2020-02-08T21:51:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.