A Simplistic Model of Neural Scaling Laws: Multiperiodic Santa Fe
Processes
- URL: http://arxiv.org/abs/2302.09049v1
- Date: Fri, 17 Feb 2023 18:27:27 GMT
- Authors: Łukasz Dębowski
- Abstract summary: It was observed that large language models exhibit a power-law decay of cross entropy with respect to the number of parameters and training tokens.
When extrapolated literally, this decay implies that the entropy rate of natural language is zero.
We construct a simple stationary process and its memory-based predictor that exhibit a power-law decay of cross entropy with a vanishing entropy rate.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It was observed that large language models exhibit a power-law decay of cross
entropy with respect to the number of parameters and training tokens. When
extrapolated literally, this decay implies that the entropy rate of natural
language is zero. To better understand this phenomenon (or artifact), we
construct a simple stationary stochastic process and its memory-based predictor
that exhibit a power-law decay of cross entropy with a vanishing entropy
rate. Our example is based on previously discussed Santa Fe processes, which
decompose a random text into a process of narration and time-independent
knowledge. Previous discussions assumed that narration is a memoryless source
with Zipf's distribution. In this paper, we propose a model of narration that
has a vanishing entropy rate and applies a randomly chosen deterministic
sequence called a multiperiodic sequence. Under a suitable parameterization,
multiperiodic sequences exhibit asymptotic relative frequencies given by Zipf's
law. Remaining agnostic about the value of the entropy rate of natural
language, we discuss the relevance of similar constructions for language modeling.
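
The decomposition into narration and time-independent knowledge can be illustrated with a minimal sketch. It assumes the earlier form of the Santa Fe process mentioned in the abstract, where each token is a pair (k, z_k): the narration k is drawn independently from a Zipf-like distribution and the knowledge z_k is a coin flip fixed once per k. It does not implement the multiperiodic narration proposed in the paper. The function names, the exponent `alpha`, the truncation `support`, and the 1-bit cost model are all illustrative assumptions.

```python
import random
from itertools import accumulate


def santa_fe(n, alpha=2.0, support=100_000, seed=0):
    """Generate n tokens (k, z_k) of a Santa Fe-style process.

    Assumption: narration k is memoryless with P(k) proportional to
    k**(-alpha) on 1..support; knowledge z_k is a fair bit that is
    drawn once per k and never changes afterwards.
    """
    rng = random.Random(seed)
    cum = list(accumulate(k ** (-alpha) for k in range(1, support + 1)))
    narration = rng.choices(range(1, support + 1), cum_weights=cum, k=n)
    knowledge = {}
    text = []
    for k in narration:
        if k not in knowledge:
            knowledge[k] = rng.getrandbits(1)  # time-independent fact
        text.append((k, knowledge[k]))
    return text


def miss_rate(text, checkpoints):
    """Average cost in bits per token paid on the knowledge bit by a
    memory-based predictor: 1 bit the first time k appears, 0 afterwards.
    Checkpoints are assumed to be increasing positions in the text.
    """
    seen, misses, rates = set(), 0, []
    marks = set(checkpoints)
    for i, (k, _z) in enumerate(text, start=1):
        if k not in seen:
            seen.add(k)
            misses += 1
        if i in marks:
            rates.append(misses / i)
    return rates


if __name__ == "__main__":
    checkpoints = [1_000, 10_000, 100_000]
    sample = santa_fe(checkpoints[-1])
    for n, r in zip(checkpoints, miss_rate(sample, checkpoints)):
        print(f"n={n:>7}  knowledge cost per token = {r:.4f} bits")
```

In this sketch only the cost of the knowledge bit shrinks as more text is memorized. The memoryless narration still carries positive entropy, which is the part the abstract proposes to replace with a narration of vanishing entropy rate.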
Related papers
- Causal Layering via Conditional Entropy [85.01590667411956]
Causal discovery aims to recover information about an unobserved causal graph from the observable data it generates.
We provide ways to recover layerings of a graph by accessing the data via a conditional entropy oracle.
arXiv Detail & Related papers (2024-01-19T05:18:28Z)
- Observational entropic study of Anderson localization [0.0]
We study the behaviour of the observational entropy in the context of the localization-delocalization transition for the one-dimensional Aubry-André model.
For a given coarse-graining, it increases logarithmically with system size in the delocalized phase, and obeys an area law in the localized phase.
We also find that the increase of the observational entropy following a quantum quench is logarithmic in time in the delocalized phase as well as at the transition point, while in the localized phase it oscillates.
arXiv Detail & Related papers (2022-09-21T11:26:43Z)
- On the Convergence of the ELBO to Entropy Sums [3.345575993695074]
We show that the variational lower bound is at all stationary points of learning equal to a sum of entropies.
For a very large class of generative models, the variational lower bound is at all stationary points of learning.
arXiv Detail & Related papers (2022-09-07T11:33:32Z)
- Rényi entanglement entropy after a quantum quench starting from insulating states in a free boson system [0.0]
We investigate the time-dependent Rényi entanglement entropy after a quantum quench.
We calculate the time evolution of the Rényi entanglement entropy in unprecedentedly large systems.
We discuss possible applications of our findings to the real-time dynamics of noninteracting bosonic systems.
arXiv Detail & Related papers (2022-07-18T02:36:14Z)
- Entropy Production and the Role of Correlations in Quantum Brownian Motion [77.34726150561087]
We perform a study on quantum entropy production, different kinds of correlations, and their interplay in the driven Caldeira-Leggett model of quantum Brownian motion.
arXiv Detail & Related papers (2021-08-05T13:11:05Z)
- Aspects of Pseudo Entropy in Field Theories [0.0]
We numerically analyze a class of free scalar field theories and the XY spin model.
This reveals the basic properties of pseudo entropy in many-body systems.
We find that the non-positivity of the difference can be violated only if the initial and final states belong to different quantum phases.
arXiv Detail & Related papers (2021-06-06T13:25:35Z)
- Action Redundancy in Reinforcement Learning [54.291331971813364]
We show that transition entropy can be described by two terms; namely, model-dependent transition entropy and action redundancy.
Our results suggest that action redundancy is a fundamental problem in reinforcement learning.
arXiv Detail & Related papers (2021-02-22T19:47:26Z)
- Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
- Shannon Entropy Rate of Hidden Markov Processes [77.34726150561087]
We show how to calculate entropy rates for hidden Markov chains.
We also show how this method gives the minimal set of infinite predictive features.
A sequel addresses the challenge's second part on structure.
arXiv Detail & Related papers (2020-08-29T00:48:17Z)
- Relevant OTOC operators: footprints of the classical dynamics [68.8204255655161]
The OTOC-RE theorem relates the OTOCs summed over a complete basis of operators to the second Rényi entropy.
We show that the sum over a small set of relevant operators is enough to obtain a very good approximation for the entropy.
In turn, this provides an alternative natural indicator of complexity, i.e. the scaling of the number of relevant operators with time.
arXiv Detail & Related papers (2020-07-31T19:23:26Z)