Recent Advances in Algebraic Geometry and Bayesian Statistics
- URL: http://arxiv.org/abs/2211.10049v1
- Date: Fri, 18 Nov 2022 06:19:05 GMT
- Title: Recent Advances in Algebraic Geometry and Bayesian Statistics
- Authors: Sumio Watanabe
- Abstract summary: This article is a review of theoretical advances in the research field of algebraic geometry and Bayesian statistics.
Two mathematical solutions and three applications to statistics based on algebraic geometry reported in this article are now being used in many practical fields in data science and artificial intelligence.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article is a review of theoretical advances in the research field of
algebraic geometry and Bayesian statistics in the last two decades. Many
statistical models and learning machines which contain hierarchical structures
or latent variables are called nonidentifiable, because the map from a
parameter to a statistical model is not one-to-one. In nonidentifiable models,
both the likelihood function and the posterior distribution have singularities
in general, hence it was difficult to analyze their statistical properties.
However, from the end of the 20th century, new theory and methodology based on
algebraic geometry have been established which enable us to investigate such
models and machines in the real world. In this article, the following results
in recent advances are reported. First, we explain the framework of Bayesian
statistics and introduce a new perspective from birational geometry.
Second, two mathematical solutions are derived based on algebraic geometry. An
appropriate parameter space can be found by a resolution map, which makes the
posterior distribution normal crossing and the log likelihood ratio function
well-defined. Third, three applications to statistics are introduced. The
posterior distribution is represented by the renormalized form, the asymptotic
free energy is derived, and the universal formula among the generalization
loss, the cross validation, and the information criterion is established. Two
mathematical solutions and three applications to statistics based on algebraic
geometry reported in this article are now being used in many practical fields
in data science and artificial intelligence.
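The relation among the generalization loss, the cross validation, and the information criterion mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not code from the paper: the function names `log_mean_exp` and `waic_and_is_cv` are hypothetical, and the toy normal model with a flat prior is an assumption chosen only because its posterior can be sampled exactly. It computes the training loss, WAIC (training loss plus the functional variance divided by n, at inverse temperature 1), and the importance-sampling leave-one-out cross validation loss from a matrix of pointwise log likelihoods.

```python
import numpy as np


def log_mean_exp(a, axis=0):
    """Numerically stable log(mean(exp(a))) along `axis`."""
    a_max = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(a_max, axis=axis) + np.log(
        np.mean(np.exp(a - a_max), axis=axis)
    )


def waic_and_is_cv(log_lik):
    """Return (training loss, WAIC, importance-sampling CV loss).

    log_lik[s, i] = log p(x_i | w_s) for posterior draw w_s and data point x_i.
    All three values are losses per data point.
    """
    # Training loss: T_n = -(1/n) sum_i log E_w[p(x_i | w)]
    train_loss = -np.mean(log_mean_exp(log_lik, axis=0))
    # Functional variance per point: (1/n) sum_i Var_w[log p(x_i | w)]
    functional_variance_per_point = np.mean(np.var(log_lik, axis=0))
    waic = train_loss + functional_variance_per_point
    # Importance-sampling CV: (1/n) sum_i log E_w[1 / p(x_i | w)]
    is_cv = np.mean(log_mean_exp(-log_lik, axis=0))
    return train_loss, waic, is_cv


# Toy setting (assumption): x_i ~ N(0, 1), model N(mu, 1), flat prior on mu,
# so the posterior of mu is N(mean(x), 1/n) and can be sampled exactly.
rng = np.random.default_rng(0)
n, num_draws = 200, 4000
x = rng.normal(0.0, 1.0, size=n)
mu_draws = rng.normal(x.mean(), 1.0 / np.sqrt(n), size=num_draws)

# log_lik[s, i] = log N(x_i | mu_s, 1)
log_lik = (
    -0.5 * (x[None, :] - mu_draws[:, None]) ** 2 - 0.5 * np.log(2.0 * np.pi)
)

train_loss, waic, is_cv = waic_and_is_cv(log_lik)
print(f"training loss: {train_loss:.4f}")
print(f"WAIC:          {waic:.4f}")
print(f"IS-CV:         {is_cv:.4f}")
```

In this regular one-parameter example, WAIC and the cross validation loss should nearly coincide, and both should exceed the training loss by roughly 1/n. The toy model is not singular; it only shows how the quantities are computed from posterior draws, which is the same computation one would apply to a nonidentifiable model.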
Related papers
- Tempered Calculus for ML: Application to Hyperbolic Model Embedding [70.61101116794549]
Most mathematical distortions used in ML are fundamentally integral in nature.
In this paper, we unveil a grounded theory and tools which can help improve these distortions to better cope with ML requirements.
We show how to apply it to a problem that has recently gained traction in ML: hyperbolic embeddings with a "cheap" and accurate encoding along the hyperbolic vs. Euclidean scale.
arXiv Detail & Related papers (2024-02-06T17:21:06Z)
- Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariable, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z)
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- Statistical Properties of the Entropy from Ordinal Patterns [55.551675080361335]
Knowing the joint distribution of the pair Entropy-Statistical Complexity for a large class of time series models would allow statistical tests that are unavailable to date.
We characterize the distribution of the empirical Shannon's Entropy for any model under which the true normalized Entropy is neither zero nor one.
We present a bilateral test that verifies if there is enough evidence to reject the hypothesis that two signals produce ordinal patterns with the same Shannon's Entropy.
arXiv Detail & Related papers (2022-09-15T23:55:58Z)
- Statistical exploration of the Manifold Hypothesis [10.389701595098922]
The Manifold Hypothesis asserts that nominally high-dimensional data are in fact concentrated near a low-dimensional manifold, embedded in high-dimensional space.
We show that rich and sometimes intricate manifold structure in data can emerge from a generic and remarkably simple statistical model.
We derive procedures to discover and interpret the geometry of high-dimensional data, and explore hypotheses about the data generating mechanism.
arXiv Detail & Related papers (2022-08-24T17:00:16Z)
- Mathematical Theory of Bayesian Statistics for Unknown Information Source [0.0]
In statistical inference, uncertainty is unknown and all models are wrong.
We show general properties of cross validation, information criteria, and marginal likelihood.
The derived theory holds even if an unknown uncertainty is unrealizable by a statistical model or even if the posterior distribution cannot be approximated by any normal distribution.
arXiv Detail & Related papers (2022-06-11T23:35:06Z)
- A Unifying Framework for Some Directed Distances in Statistics [0.0]
Density-based directed distances -- particularly known as divergences -- are widely used in statistics.
We provide a general framework which covers in particular both the density-based and distribution-function-based divergence approaches.
We deduce new concepts of dependence between random variables, as alternatives to the celebrated mutual information.
arXiv Detail & Related papers (2022-03-02T04:24:13Z)
- Nonparametric Functional Analysis of Generalized Linear Models Under Nonlinear Constraints [0.0]
This article introduces a novel nonparametric methodology for Generalized Linear Models.
It combines the strengths of the binary regression and latent variable formulations for categorical data.
It extends recently published parametric versions of the methodology and generalizes it.
arXiv Detail & Related papers (2021-10-11T04:49:59Z)
- Bayesian Quadrature on Riemannian Data Manifolds [79.71142807798284]
A principled way to model nonlinear geometric structure inherent in data is provided.
However, these operations are typically computationally demanding.
In particular, we focus on Bayesian quadrature (BQ) to numerically compute integrals over normal laws.
We show that by leveraging both prior knowledge and an active exploration scheme, BQ significantly reduces the number of required evaluations.
arXiv Detail & Related papers (2021-02-12T17:38:04Z)
- Marginal likelihood computation for model selection and hypothesis testing: an extensive review [66.37504201165159]
This article provides a comprehensive study of the state-of-the-art of the topic.
We highlight limitations, benefits, connections and differences among the different techniques.
Problems and possible solutions with the use of improper priors are also described.
arXiv Detail & Related papers (2020-05-17T18:31:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.