Generation through the lens of learning theory
- URL: http://arxiv.org/abs/2410.13714v5
- Date: Fri, 27 Dec 2024 13:42:07 GMT
- Title: Generation through the lens of learning theory
- Authors: Jiaxun Li, Vinod Raman, Ambuj Tewari
- Abstract summary: We study generation through the lens of statistical learning theory. We extend the notion of "generation" to two new settings we call "uniform" and "non-uniform" generation, and provide a characterization of which hypothesis classes are uniformly and non-uniformly generatable.
- Score: 18.355039522639565
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study generation through the lens of statistical learning theory. First, we abstract and formalize the results of Gold [1967], Angluin [1979], Angluin [1980] and Kleinberg and Mullainathan [2024] in terms of a binary hypothesis class defined over an abstract example space. Then, we extend the notion of "generation" from Kleinberg and Mullainathan [2024] to two new settings, which we call "uniform" and "non-uniform" generation, and provide a characterization of which hypothesis classes are uniformly and non-uniformly generatable. As is standard in learning theory, our characterizations are in terms of the finiteness of a new combinatorial dimension termed the Closure dimension. By doing so, we are able to compare generatability with predictability (captured via PAC and online learnability) and show that these two properties of hypothesis classes are incompatible -- there are classes that are generatable but not predictable and vice versa. Finally, we extend our results to capture prompted generation and give a complete characterization of which classes are prompt generatable, generalizing some of the work by Kleinberg and Mullainathan [2024].
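To make the generation-in-the-limit setting behind this abstract concrete, below is a minimal Python sketch of the protocol of Kleinberg and Mullainathan [2024] restated over an abstract hypothesis class: an adversary enumerates examples of an unknown target hypothesis, and after each revelation the generator must output a new, previously unseen example of the target. The names (`Hypothesis`, `generate_in_the_limit`, the multiples-of-k toy class) are illustrative assumptions rather than code from the paper, and the closure-style strategy shown (output only examples accepted by every hypothesis still consistent with the observations) is one natural approach, not necessarily the paper's algorithm.

```python
# Minimal, hypothetical sketch of the generation-in-the-limit protocol
# (illustrative interfaces only; not the paper's code).

from typing import Callable, Iterable, Iterator, List, Set

# A hypothesis is modelled as a membership predicate over an abstract example space.
Hypothesis = Callable[[str], bool]


def generate_in_the_limit(
    hypothesis_class: List[Hypothesis],
    example_space: Iterable[str],
    enumeration: Iterator[str],
    rounds: int,
) -> List[str]:
    """Each round, the adversary reveals one positive example of the unknown target.
    The generator then outputs an unseen example accepted by *every* hypothesis that
    is still consistent with the revealed examples (a closure-style strategy), so the
    output is correct no matter which consistent hypothesis is the true target."""
    seen: Set[str] = set()
    outputs: List[str] = []
    for _ in range(rounds):
        seen.add(next(enumeration))  # adversary reveals the next example of the target
        # Version space: hypotheses consistent with everything revealed so far.
        version_space = [h for h in hypothesis_class if all(h(x) for x in seen)]
        # Emit the first unseen example that every consistent hypothesis accepts.
        for candidate in example_space:
            if candidate not in seen and all(h(candidate) for h in version_space):
                outputs.append(candidate)
                seen.add(candidate)
                break
    return outputs


if __name__ == "__main__":
    # Toy class: "multiples of k" languages over a finite slice of the integers.
    space = [str(i) for i in range(1, 200)]
    klass = [lambda x, k=k: int(x) % k == 0 for k in (2, 3, 5)]
    # Adversary enumerates the target language "multiples of 3".
    stream = iter(str(3 * i) for i in range(1, 50))
    print(generate_in_the_limit(klass, space, stream, rounds=5))
    # Prints unseen multiples of 3: ['6', '9', '12', '15', '18'].
```

Whether a strategy of this kind can succeed uniformly over the whole class is the sort of question the abstract's Closure dimension is introduced to answer.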
Related papers
- On Union-Closedness of Language Generation [48.36356615217017]
We investigate language generation in the limit - a model proposed by Kleinberg and Mullainathan and extended by Li, Raman, and Tewari.
Our results resolve two open questions of Li et al. by proving that finite unions of generatable or non-uniformly generatable classes need not be generatable.
Our approach utilizes carefully constructed classes along with a novel diagonalization argument that could be of independent interest in the growing area of language generation.
arXiv Detail & Related papers (2025-06-23T13:42:25Z) - Representative Language Generation [4.601683217376771]
"representative generation" is extended to address diversity and bias concerns in generative models.<n>We demonstrate feasibility for countably infinite hypothesis classes and collections of groups under certain conditions.<n>Our findings provide a rigorous foundation for developing more diverse and representative generative models.
arXiv Detail & Related papers (2025-05-27T23:02:54Z) - I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data? [79.01538178959726]
Large language models (LLMs) have led many to conclude that they exhibit a form of intelligence.
We introduce a novel generative model that generates tokens on the basis of human interpretable concepts represented as latent discrete variables.
arXiv Detail & Related papers (2025-03-12T01:21:17Z) - Generation from Noisy Examples [2.3020018305241337]
For finite and countable classes we show that generatability is largely unaffected by the presence of a finite number of noisy examples.
arXiv Detail & Related papers (2025-01-07T23:16:14Z) - No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models [14.535583931446807]
We develop a theoretical framework to analyze the learnability of non-hallucinating generative models.
We show that incorporating inductive biases aligned with the actual facts into the learning process is essential.
arXiv Detail & Related papers (2024-10-24T23:57:11Z) - Ramsey Theorems for Trees and a General 'Private Learning Implies Online Learning' Theorem [26.292576184028924]
This work continues the investigation of the link between differentially private (DP) learning and online learning.
We show that for general classification tasks, DP learnability implies online learnability.
arXiv Detail & Related papers (2024-07-10T15:43:30Z) - Learning Discrete Concepts in Latent Hierarchical Models [73.01229236386148]
Learning concepts from natural high-dimensional data holds promise for building human-aligned and interpretable machine learning models.
We formalize concepts as discrete latent causal variables that are related via a hierarchical causal model.
We substantiate our theoretical claims with synthetic data experiments.
arXiv Detail & Related papers (2024-06-01T18:01:03Z) - Class-wise Activation Unravelling the Engima of Deep Double Descent [0.0]
Double descent is a counter-intuitive phenomenon in machine learning.
In this study, we revisit the phenomenon of double descent and discuss the conditions of its occurrence.
arXiv Detail & Related papers (2024-05-13T12:07:48Z) - A Geometric Notion of Causal Probing [91.14470073637236]
In a language model's representation space, all information about a concept such as verbal number is encoded in a linear subspace.
We give a set of intrinsic criteria which characterize an ideal linear concept subspace.
We find that LEACE returns a one-dimensional subspace containing roughly half of total concept information.
arXiv Detail & Related papers (2023-07-27T17:57:57Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - A Study of Neural Collapse Phenomenon: Grassmannian Frame, Symmetry and
Generalization [91.95109845914267]
We extend the original neural collapse phenomenon by proving a generalized neural collapse hypothesis.
We obtain the Grassmannian frame structure from the optimization and generalization of classification.
We provide a theorem to explain the symmetric generalization of permutations.
arXiv Detail & Related papers (2023-04-18T11:35:14Z) - Comparative Learning: A Sample Complexity Theory for Two Hypothesis
Classes [5.194264506657145]
We introduce comparative learning as a combination of the realizable and agnostic settings in PAC learning.
Even when both $S$ and $B$ have infinite VC dimensions, comparative learning can still have a small sample complexity.
We show that the sample complexity of comparative learning is characterized by the mutual VC dimension $\mathsf{VC}(S,B)$.
arXiv Detail & Related papers (2022-11-16T18:38:24Z) - Realizable Learning is All You Need [21.34668631009594]
The equivalence of realizable and agnostic learnability is a fundamental phenomenon in learning theory.
We give the first model-independent framework explaining the equivalence of realizable and agnostic learnability.
arXiv Detail & Related papers (2021-11-08T19:00:00Z) - Sample-efficient proper PAC learning with approximate differential
privacy [51.09425023771246]
We prove that the sample complexity of properly learning a class of Littlestone dimension $d$ with approximate differential privacy is $\tilde{O}(d^6)$, ignoring privacy and accuracy parameters.
This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of $2^{O(d)}$ on the sample complexity.
arXiv Detail & Related papers (2020-12-07T18:17:46Z) - Rethinking Class Relations: Absolute-relative Supervised and
Unsupervised Few-shot Learning [157.62595449130973]
We study the fundamental problem of simplistic class modeling in current few-shot learning methods.
We propose a novel Absolute-relative Learning paradigm to fully take advantage of label information to refine the image representations.
arXiv Detail & Related papers (2020-01-12T12:25:46Z)