From learnable objects to learnable random objects
- URL: http://arxiv.org/abs/2504.00847v2
- Date: Mon, 26 May 2025 19:20:48 GMT
- Title: From learnable objects to learnable random objects
- Authors: Aaron Anderson, Michael Benedikt
- Abstract summary: We consider the relationship between learnability of a "base class" of functions on a set $X$, and learnability of a class of statistical functions derived from the base class. For agnostic learning, we establish improved bounds on the sample complexity of learning for statistical classes, stated in terms of combinatorial dimensions of the base class.
- Score: 0.6906005491572398
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the relationship between learnability of a "base class" of functions on a set $X$, and learnability of a class of statistical functions derived from the base class. For example, we refine results showing that learnability of a family $\{h_p : p \in Y\}$ of functions implies learnability of the family of functions $h_\mu = \lambda p: Y.\ E_\mu(h_p)$, where $E_\mu$ is the expectation with respect to $\mu$, and $\mu$ ranges over probability distributions on $X$. We will look at both Probably Approximately Correct (PAC) learning, where example inputs and outputs are chosen at random, and online learning, where the examples are chosen adversarially. For agnostic learning, we establish improved bounds on the sample complexity of learning for statistical classes, stated in terms of combinatorial dimensions of the base class. We connect these problems to techniques introduced in model theory for "randomizing a structure". We also provide counterexamples for realizable learning, in both the PAC and online settings.
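To make the base-to-statistical construction concrete, here is a minimal sketch (not from the paper; the threshold base class, the Beta distribution standing in for $\mu$, and all function names are illustrative assumptions). It starts from a base class of threshold indicators $h_p(x) = \mathbf{1}[x \le p]$ on $X = [0,1]$ and derives the statistical function $h_\mu(p) = E_\mu(h_p)$, the target that a PAC learner for the statistical class would have to fit from labeled pairs $(p, h_\mu(p))$.

```python
# Illustrative sketch: deriving a "statistical class" from a base class.
# Base class: thresholds h_p(x) = 1[x <= p] on X = [0, 1].
# Derived statistical function: h_mu(p) = E_{x ~ mu}[h_p(x)], i.e. the CDF of mu at p.
# All names and the choice of mu are assumptions, not the paper's construction.
import numpy as np

def h(p, x):
    """Base-class function h_p evaluated at x: a threshold indicator."""
    return float(x <= p)

def h_mu(p, sample):
    """Monte Carlo estimate of h_mu(p) = E_mu[h_p] from a sample drawn from mu."""
    return np.mean([h(p, x) for x in sample])

rng = np.random.default_rng(0)
mu_sample = rng.beta(2.0, 5.0, size=10_000)  # stand-in for an unknown distribution mu on [0, 1]

# A learner for the statistical class only sees pairs (p_i, h_mu(p_i));
# here we simply tabulate the target values it would need to fit.
for p in (0.1, 0.3, 0.5, 0.9):
    print(f"p = {p:.1f}   h_mu(p) ~ {h_mu(p, mu_sample):.3f}")
```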
Related papers
- Robust Learning of Multi-index Models via Iterative Subspace Approximation [36.138661719725626]
We study the task of learning Multi-Index Models (MIMs) with label noise under the Gaussian distribution.
We focus on well-behaved MIMs with finite ranges that satisfy certain regularity properties.
We show that in the presence of random classification noise, the complexity of our algorithm scales agnostically with $1/\epsilon$.
arXiv Detail & Related papers (2025-02-13T17:37:42Z) - On the Power of Interactive Proofs for Learning [3.785008536475385]
We continue the study of doubly-efficient proof systems for verifying PAC learning.
We construct an interactive protocol for learning the largest Fourier characters of a given function $f \colon \{0,1\}^n \to \{0,1\}$ up to an arbitrarily small error.
We show that if we do not insist on doubly-efficient proof systems, then the model becomes trivial.
arXiv Detail & Related papers (2024-04-11T23:16:21Z) - Agnostically Learning Multi-index Models with Queries [54.290489524576756]
We study the power of query access for the task of agnostic learning under the Gaussian distribution.
We show that query access gives significant runtime improvements over random examples for agnostically learning MIMs.
arXiv Detail & Related papers (2023-12-27T15:50:47Z) - Sample Efficient Reinforcement Learning with Partial Dynamics Knowledge [0.704590071265998]
We study the sample complexity of online Q-learning methods when some prior knowledge about the dynamics is available or can be learned efficiently.
We present an optimistic Q-learning algorithm that achieves $\tilde{\mathcal{O}}(\mathrm{Poly}(H)\sqrt{SAT})$ regret under perfect knowledge of $f$.
arXiv Detail & Related papers (2023-12-19T19:53:58Z) - When is Agnostic Reinforcement Learning Statistically Tractable? [76.1408672715773]
A new complexity measure, called the spanning capacity, depends solely on the set $\Pi$ and is independent of the MDP dynamics.
We show there exists a policy class $\Pi$ with a bounded spanning capacity that requires a superpolynomial number of samples to learn.
This reveals a surprising separation for learnability between generative access and online access models.
arXiv Detail & Related papers (2023-10-09T19:40:54Z) - Neural Feature Learning in Function Space [5.807950618412389]
We present a novel framework for learning system design with neural feature extractors.
We introduce the feature geometry, which unifies statistical dependence and feature representations in a function space equipped with inner products.
We propose a nesting technique, which provides systematic algorithm designs for learning the optimal features from data samples.
arXiv Detail & Related papers (2023-09-18T20:39:12Z) - Non-Asymptotic Performance of Social Machine Learning Under Limited Data [45.48644055449902]
This paper studies the probability of error associated with the social machine learning framework.
It addresses the problem of classifying a stream of unlabeled data in a distributed manner.
arXiv Detail & Related papers (2023-06-15T17:42:14Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - An Entropy-Based Model for Hierarchical Learning [3.1473798197405944]
A common feature among real-world datasets is that data domains are multiscale.
We propose a learning model that exploits this multiscale data structure.
The hierarchical learning model is inspired by the logical and progressive easy-to-hard learning mechanism of human beings.
arXiv Detail & Related papers (2022-12-30T13:14:46Z) - Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined ones.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z) - Learning versus Refutation in Noninteractive Local Differential Privacy [133.80204506727526]
We study two basic statistical tasks in non-interactive local differential privacy (LDP): learning and refutation.
Our main result is a complete characterization of the sample complexity of PAC learning for non-interactive LDP protocols.
arXiv Detail & Related papers (2022-10-26T03:19:24Z) - Offline Reinforcement Learning with Differentiable Function
Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications.
We take a step by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z) - What Can Transformers Learn In-Context? A Case Study of Simple Function
Classes [67.06980111346245]
In-context learning refers to the ability of a model to condition on a prompt sequence consisting of in-context examples.
We show that standard Transformers can be trained from scratch to perform in-context learning of linear functions.
We also show that we can train Transformers to in-context learn more complex function classes with performance that matches or exceeds task-specific learning algorithms.
arXiv Detail & Related papers (2022-08-01T18:01:40Z) - Realizable Learning is All You Need [21.34668631009594]
The equivalence of realizable and agnostic learnability is a fundamental phenomenon in learning theory.
We give the first model-independent framework explaining the equivalence of realizable and agnostic learnability.
arXiv Detail & Related papers (2021-11-08T19:00:00Z) - Agnostic learning with unknown utilities [70.14742836006042]
In many real-world problems, the utility of a decision depends on the underlying context $x$ and decision $y$.
We study this as agnostic learning with unknown utilities.
We show that estimating the utilities of only the sampled points $S$ suffices to learn a decision function which generalizes well.
arXiv Detail & Related papers (2021-04-17T08:22:04Z) - When Hardness of Approximation Meets Hardness of Learning [35.39956227364153]
We show a single hardness property that implies both hardness of approximation using linear classes and shallow networks, and hardness of learning using correlation queries and gradient-descent.
This allows us to obtain new results on hardness of approximation and learnability of parity functions, DNF formulas and $AC^0$ circuits.
arXiv Detail & Related papers (2020-08-18T17:41:28Z) - Estimating Structural Target Functions using Machine Learning and
Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z) - On the Theory of Transfer Learning: The Importance of Task Diversity [114.656572506859]
We consider $t+1$ tasks parameterized by functions of the form $f_j \circ h$ in a general function class $\mathcal{F} \circ \mathcal{H}$.
We show that for diverse training tasks the sample complexity needed to learn the shared representation across the first $t$ training tasks scales as $C(\mathcal{H}) + t\, C(\mathcal{F})$.
arXiv Detail & Related papers (2020-06-20T20:33:59Z) - An Epistemic Approach to the Formal Specification of Statistical Machine
Learning [1.599072005190786]
We introduce a formal model for supervised learning based on a Kripke model.
We then formalize various notions of the classification performance, robustness, and fairness of statistical classifiers.
arXiv Detail & Related papers (2020-04-27T12:16:45Z) - On the Modularity of Hypernetworks [103.1147622394852]
We show that for a structured target function, the overall number of trainable parameters in a hypernetwork is smaller by orders of magnitude than the number of trainable parameters of a standard neural network and an embedding method.
arXiv Detail & Related papers (2020-02-23T22:51:52Z) - Neural Bayes: A Generic Parameterization Method for Unsupervised
Representation Learning [175.34232468746245]
We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
arXiv Detail & Related papers (2020-02-20T22:28:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.