Ultimate limit on learning non-Markovian behavior: Fisher information
rate and excess information
- URL: http://arxiv.org/abs/2310.03968v1
- Date: Fri, 6 Oct 2023 01:53:42 GMT
- Title: Ultimate limit on learning non-Markovian behavior: Fisher information
rate and excess information
- Authors: Paul M. Riechers
- Abstract summary: We address the fundamental limits of learning unknown parameters of any process from time-series data.
We discover exact closed-form expressions for how optimal inference scales with observation length.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the fundamental limits of learning unknown parameters of any
stochastic process from time-series data, and discover exact closed-form
expressions for how optimal inference scales with observation length. Given a
parametrized class of candidate models, the Fisher information of observed
sequence probabilities lower-bounds the variance in model estimation from
finite data. As sequence-length increases, the minimal variance scales as the
square inverse of the length -- with constant coefficient given by the
information rate. We discover a simple closed-form expression for this
information rate, even in the case of infinite Markov order. We furthermore
obtain the exact analytic lower bound on model variance from the
observation-induced metadynamic among belief states. We discover ephemeral,
exponential, and more general modes of convergence to the asymptotic
information rate. Surprisingly, this myopic information rate converges to the
asymptotic Fisher information rate with exactly the same relaxation timescales
that appear in the myopic entropy rate as it converges to the Shannon entropy
rate for the process. We illustrate these results with a sequence of examples
that highlight qualitatively distinct features of stochastic processes that
shape optimal learning.
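The scaling claim is easy to probe numerically. Below is a minimal sketch (the two-state chain and its parametrization are invented for illustration, not taken from the paper): it enumerates all length-L sequences of a parametrized Markov chain, estimates the Fisher information of the sequence distribution by finite differences, and prints the myopic rate I(L)/L alongside the Cramér-Rao bound 1/I(L) on estimator variance. The paper's closed-form expressions are not reproduced here.

```python
import numpy as np
from itertools import product

def transition_matrix(theta):
    # Toy two-state chain; theta parametrizes the transition probabilities.
    return np.array([[1 - theta, theta],
                     [theta / 2, 1 - theta / 2]])

def stationary(T):
    # Left eigenvector of T with eigenvalue 1, normalized to a distribution.
    vals, vecs = np.linalg.eig(T.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    return pi / pi.sum()

def seq_prob(x, T, pi):
    # Probability of an observed symbol sequence x under the chain.
    p = pi[x[0]]
    for a, b in zip(x, x[1:]):
        p *= T[a, b]
    return p

def fisher_information(theta, L, eps=1e-5):
    # I(L) = sum_x (d p_theta(x) / d theta)^2 / p_theta(x),
    # with the derivative taken by central finite differences.
    Tm, Tp, T0 = (transition_matrix(t) for t in (theta - eps, theta + eps, theta))
    pim, pip, pi0 = stationary(Tm), stationary(Tp), stationary(T0)
    I = 0.0
    for x in product((0, 1), repeat=L):
        p0 = seq_prob(x, T0, pi0)
        dp = (seq_prob(x, Tp, pip) - seq_prob(x, Tm, pim)) / (2 * eps)
        I += dp * dp / p0
    return I

theta = 0.3
for L in (2, 4, 8, 12):
    I = fisher_information(theta, L)
    # I(L)/L approaches the Fisher information rate; 1/I(L) is the
    # Cramer-Rao lower bound on the variance of any unbiased estimator.
    print(f"L={L:2d}  I(L)/L={I / L:.4f}  CRLB={1 / I:.5f}")
```

For this chain, I(L)/L settles quickly toward a constant: the Fisher information rate that sets the leading coefficient of the variance bound.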
Related papers
- Kinetic Interacting Particle Langevin Monte Carlo [0.0]
This paper introduces and analyses interacting underdamped Langevin algorithms for statistical inference in latent variable models.
We propose a diffusion process that evolves jointly in the space of parameters and latent variables.
We provide two explicit discretisations of this diffusion as practical algorithms to estimate parameters of statistical models.
arXiv Detail & Related papers (2024-07-08T09:52:46Z)
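The paper's interacting-particle scheme is not reproduced here; as a minimal sketch of the kinetic (underdamped) Langevin dynamics it builds on, the following discretizes the dynamics with Euler-Maruyama steps for a one-dimensional Gaussian target. The target, step size, and friction are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_log_target(q, mu=1.0, sigma=0.5):
    # Toy target: a 1-D Gaussian N(mu, sigma^2), standing in for the
    # (parameter, latent) potential of a latent variable model.
    return -(q - mu) / sigma**2

def kinetic_langevin(n_steps=20000, dt=0.01, gamma=2.0):
    # Euler-Maruyama discretization of underdamped Langevin dynamics:
    #   dq = p dt
    #   dp = (grad log pi(q) - gamma p) dt + sqrt(2 gamma) dW
    q, p = 0.0, 0.0
    samples = np.empty(n_steps)
    for i in range(n_steps):
        q += dt * p
        p += dt * (grad_log_target(q) - gamma * p) \
             + np.sqrt(2 * gamma * dt) * rng.standard_normal()
        samples[i] = q
    return samples

s = kinetic_langevin()[5000:]  # discard burn-in
print(f"mean={s.mean():.3f} (target 1.0), std={s.std():.3f} (target 0.5)")
```

The position marginal of the continuous-time dynamics is the target density; the Euler discretization introduces an O(dt) bias that the paper's analysis quantifies in its own setting.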
- Efficiently Parameterized Neural Metriplectic Systems [21.181859944826595]
Metriplectic systems are learned from data in a way that scales quadratically in both the size of the state and the rank of the metriplectic data.
arXiv Detail & Related papers (2024-05-25T17:14:23Z)
- On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates [5.13323375365494]
We provide theoretical guarantees for the convergence behaviour of diffusion-based generative models under the assumption of strongly log-concave data distributions.
The class of functions used for score estimation consists of Lipschitz continuous functions, avoiding any Lipschitzness assumption on the score function itself.
This approach yields the best known convergence rate for our sampling algorithm.
arXiv Detail & Related papers (2023-11-22T18:40:45Z)
- Non-Parametric Learning of Stochastic Differential Equations with Non-asymptotic Fast Rates of Convergence [65.63201894457404]
We propose a novel non-parametric learning paradigm for the identification of drift and diffusion coefficients of non-linear differential equations.
The key idea essentially consists of fitting an RKHS-based approximation of the corresponding Fokker-Planck equation to such observations.
arXiv Detail & Related papers (2023-05-24T20:43:47Z)
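A hedged sketch of the simplest version of this idea, substituting a Nadaraya-Watson kernel smoother on Euler increments for the paper's RKHS-based Fokker-Planck fit: since E[dX | X = x] = b(x) dt, local averages of increments recover the drift. The Ornstein-Uhlenbeck ground truth and bandwidth below are invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate an Ornstein-Uhlenbeck path dX = -2 X dt + 0.5 dW (ground truth).
dt, n = 0.01, 50000
x = np.empty(n)
x[0] = 0.0
for i in range(n - 1):
    x[i + 1] = x[i] - 2.0 * x[i] * dt + 0.5 * np.sqrt(dt) * rng.standard_normal()

# Euler increments divided by dt are (up to O(dt) bias) noisy drift samples,
# since E[dX | X = x] = b(x) dt.
X, y = x[:-1], np.diff(x) / dt

def drift_hat(g, bandwidth=0.1):
    # Nadaraya-Watson kernel smoother: a local average of increment-based
    # drift samples around the query point g.
    w = np.exp(-0.5 * ((X - g) / bandwidth) ** 2)
    return (w @ y) / w.sum()

for g in np.linspace(-0.4, 0.4, 5):
    print(f"x={g:+.2f}  drift_hat={drift_hat(g):+.3f}  true={-2 * g:+.3f}")
```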
- Information Theory Inspired Pattern Analysis for Time-series Data [60.86880787242563]
We propose a highly generalizable method that uses information theory-based features to identify and learn from patterns in time-series data.
For applications with state transitions, features are developed based on the Shannon entropy, entropy rate, and von Neumann entropy of Markov chains.
The results show the proposed information theory-based features improve the recall rate, F1 score, and accuracy on average by up to 23.01% compared with the baseline models.
arXiv Detail & Related papers (2023-02-22T21:09:35Z)
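The Markov-chain features named in this entry can be computed in a few lines from a transition matrix. In the sketch below, the density-matrix embedding used for the von Neumann entropy is one common construction, assumed here rather than taken from the paper.

```python
import numpy as np

T = np.array([[0.9, 0.1],
              [0.4, 0.6]])  # row-stochastic transition matrix

# Stationary distribution: left eigenvector of T for eigenvalue 1.
vals, vecs = np.linalg.eig(T.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()

# Shannon entropy of the stationary distribution (bits).
H = -np.sum(pi * np.log2(pi))

# Entropy rate of the chain: h = -sum_i pi_i sum_j T_ij log2 T_ij.
h = -np.sum(pi[:, None] * T * np.log2(T))

# Von Neumann entropy of a density-matrix embedding of the chain.
# One common embedding (an assumption here, not necessarily the paper's):
# rho = sum_i pi_i |psi_i><psi_i| with |psi_i> = sum_j sqrt(T_ij) |j>.
psi = np.sqrt(T)
rho = sum(p * np.outer(v, v) for p, v in zip(pi, psi))
eigs = np.linalg.eigvalsh(rho)
S_vn = -np.sum(eigs * np.log2(eigs, where=eigs > 0, out=np.zeros_like(eigs)))

print(f"Shannon H(pi) = {H:.4f} bits")
print(f"Entropy rate  = {h:.4f} bits/step")
print(f"von Neumann S = {S_vn:.4f} bits")
```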
- Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z)
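A minimal sketch of the forward side of such a process: discrete tokens are corrupted by a uniform-rate continuous-time Markov chain whose transition kernel at time t is the matrix exponential of the generator. The learned reverse (denoising) chain, which is the paper's contribution, is not implemented; the vocabulary size and rate are arbitrary.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

K = 8        # vocabulary size of the discrete variable
beta = 1.0   # corruption rate

# Generator of a uniform-jump CTMC: jump to any state at total rate beta.
# Rows sum to zero, as every generator must.
Q = beta * (np.ones((K, K)) / K - np.eye(K))

def corrupt(tokens, t):
    # Forward process: each token evolves independently under the CTMC;
    # P_t = expm(Q t) is the transition kernel at time t.
    P = expm(Q * t)
    return np.array([rng.choice(K, p=P[s]) for s in tokens])

x0 = np.array([0, 1, 2, 3, 4, 5, 6, 7])
for t in (0.1, 1.0, 10.0):
    print(f"t={t:5.1f}  x_t={corrupt(x0, t)}")  # t large -> near-uniform noise
```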
- Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between the statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the asymptotic regime and in the finite-sample regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z)
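For orientation, here is the score matching estimator itself in the easiest possible case, a one-dimensional Gaussian, where Hyvärinen's implicit objective minimizes in closed form. This is a textbook illustration of the estimator the paper analyzes, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.7, size=10000)

# Score model for a 1-D Gaussian: s(x) = -a (x - m), with a = 1/sigma^2.
# Hyvarinen's implicit score matching objective,
#   J(a, m) = E[ s'(x) + 0.5 * s(x)^2 ] = E[ -a + 0.5 a^2 (x - m)^2 ],
# needs no normalizing constant. Minimizing it in closed form gives
#   m = E[x],  a = 1 / Var[x].
m_hat = data.mean()
a_hat = 1.0 / np.mean((data - m_hat) ** 2)

print(f"m_hat={m_hat:.3f} (true 2.0), sigma_hat={a_hat**-0.5:.3f} (true 0.7)")
```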
- Fisher information of correlated stochastic processes [0.0]
We prove two results concerning the estimation of parameters encoded in a memoryful process.
First, we show that for processes with finite Markov order, the Fisher information is always linear in the number of outcomes.
Second, we prove with suitable examples that correlations do not necessarily enhance the metrological precision.
arXiv Detail & Related papers (2022-06-01T12:51:55Z)
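The first result is easy to check in the simplest finite-Markov-order case: an i.i.d. Bernoulli process (Markov order zero), where the Fisher information of n outcomes is exactly n / (theta (1 - theta)). The Monte Carlo check below is illustrative only; the paper proves the general statement.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.3

# For an i.i.d. Bernoulli(theta) process, the score of n outcomes depends
# only on the count k of ones:
#   d/dtheta log p = k/theta - (n - k)/(1 - theta),
# and the Fisher information I_n = Var[score] = n / (theta (1 - theta)):
# exactly linear in the number of outcomes n.
for n in (10, 100, 1000):
    k = rng.binomial(n, theta, size=200000)
    score = k / theta - (n - k) / (1 - theta)
    print(f"n={n:5d}  I_mc/n={score.var() / n:.4f}  "
          f"exact rate={1 / (theta * (1 - theta)):.4f}")
```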
- Modeling High-Dimensional Data with Unknown Cut Points: A Fusion Penalized Logistic Threshold Regression [2.520538806201793]
In traditional logistic regression models, the link function is often assumed to be linear and continuous in predictors.
We consider a threshold model in which all continuous features are discretized into ordinal levels, which in turn determine the binary responses.
We find the lasso model is well suited to the problem of early detection and prediction for chronic diseases like diabetes.
arXiv Detail & Related papers (2022-02-17T04:16:40Z)
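A rough sketch of the modeling pipeline on synthetic data: discretize continuous features into ordinal levels, then fit a penalized logistic regression on the level indicators. Plain lasso stands in for the paper's fusion penalty across adjacent levels, and all data-generating choices below are invented.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data: binary response driven by whether each feature crosses
# an unknown cut point (a threshold effect, not a linear one).
n, p = 2000, 5
X = rng.normal(size=(n, p))
logits = 2.0 * (X[:, 0] > 0.5) + 1.5 * (X[:, 1] > -0.3) - 1.5
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

# Discretize each feature into ordinal levels; one-hot level indicators
# let the model place its own cut points.
enc = KBinsDiscretizer(n_bins=8, encode="onehot-dense", strategy="quantile")
Xd = enc.fit_transform(X)

# L1-penalized logistic regression on the discretized design. (The paper
# uses a fusion penalty across adjacent levels; plain lasso is a simpler
# stand-in that still zeroes out irrelevant levels.)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(Xd, y)
print(f"train accuracy = {clf.score(Xd, y):.3f}")
print(f"nonzero coefficients: {(clf.coef_ != 0).sum()} of {clf.coef_.size}")
```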
- Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
- SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
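In one dimension, quadrature Fourier features are easy to demonstrate: Gauss-Hermite nodes discretize the spectral representation of the RBF kernel deterministically, in contrast to random Fourier features. The sketch below only verifies the kernel approximation; the derivative-aware GP regression and error bounds of the paper are not implemented, and the lengthscale and node count are arbitrary.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def qff_features(x, m=20, ell=1.0):
    # Quadrature Fourier features for the 1-D RBF kernel
    #   k(x, x') = exp(-(x - x')^2 / (2 ell^2))
    #            = E_{w ~ N(0, 1/ell^2)} [cos(w (x - x'))].
    # Gauss-Hermite nodes t_i and weights w_i turn that expectation into a
    # deterministic finite sum, giving the feature map below.
    t, w = hermgauss(m)
    freqs = np.sqrt(2.0) * t / ell
    scale = np.sqrt(w / np.sqrt(np.pi))
    return np.concatenate([scale * np.cos(np.outer(x, freqs)),
                           scale * np.sin(np.outer(x, freqs))], axis=1)

x = np.linspace(-2, 2, 5)
Phi = qff_features(x)
K_exact = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
err = np.abs(Phi @ Phi.T - K_exact).max()
print(f"max |K_approx - K_exact| = {err:.2e}")  # decays rapidly with m
```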
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.