MESSY Estimation: Maximum-Entropy based Stochastic and Symbolic densitY
Estimation
- URL: http://arxiv.org/abs/2306.04120v2
- Date: Sat, 10 Feb 2024 05:33:21 GMT
- Title: MESSY Estimation: Maximum-Entropy based Stochastic and Symbolic densitY
Estimation
- Authors: Tony Tohme, Mohsen Sadr, Kamal Youcef-Toumi, Nicolas G.
Hadjiconstantinou
- Abstract summary: MESSY estimation is a Maximum-Entropy based Stochastic and Symbolic densitY estimation method.
We construct a gradient-based drift-diffusion process that connects samples of the unknown distribution function to a guess symbolic expression.
We find that the addition of a symbolic search for basis functions improves the accuracy of the estimation at a reasonable additional computational cost.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce MESSY estimation, a Maximum-Entropy based Stochastic and
Symbolic densitY estimation method. The proposed approach recovers probability
density functions symbolically from samples using moments of a gradient flow in
which the ansatz serves as the driving force. In particular, we construct a
gradient-based drift-diffusion process that connects samples of the unknown
distribution function to a guess symbolic expression. We then show that when
the guess distribution has the maximum entropy form, the parameters of this
distribution can be found efficiently by solving a linear system of equations
constructed using the moments of the provided samples. Furthermore, we use
symbolic regression to explore the space of smooth functions and find optimal
basis functions for the exponent of the maximum entropy functional leading to
good conditioning. The cost of the proposed method for each set of selected
basis functions is linear in the number of samples and quadratic in the
number of basis functions. However, the underlying acceptance/rejection
procedure for finding optimal and well-conditioned bases adds to the
computational cost. We validate the proposed MESSY estimation method against
other benchmark methods for the case of a bi-modal and a discontinuous density,
as well as a density at the limit of physical realizability. We find that the
addition of a symbolic search for basis functions improves the accuracy of the
estimation at a reasonable additional computational cost. Our results suggest
that the proposed method outperforms existing density recovery methods in the
limit of a small to moderate number of samples by providing a low-bias and
tractable symbolic description of the unknown density at a reasonable
computational cost.
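To make the moment-matching idea behind maximum-entropy estimation concrete, here is a minimal sketch. It is a simplified illustration, not the paper's gradient-flow linear solve or its symbolic basis search: the monomial basis, Gaussian test data, and generic root-finder are all assumptions. The exponent coefficients of a max-ent ansatz exp(lam · phi(x)) are fitted so that the model's moments match the sample moments.

```python
import numpy as np
from scipy.optimize import root

rng = np.random.default_rng(0)
samples = rng.normal(1.0, 0.5, 4000)  # stand-in for samples of the unknown density

def phi(x):
    # monomial basis functions for the exponent of the max-ent ansatz
    return np.vstack([x, x**2])

target = phi(samples).mean(axis=1)    # sample moments E[phi(X)]

xs = np.linspace(-4.0, 6.0, 4001)     # quadrature grid covering the support
dx = xs[1] - xs[0]

def moment_residual(lam):
    # density f(x) proportional to exp(lam . phi(x)); moments via Riemann sum
    logw = lam @ phi(xs)
    w = np.exp(logw - logw.max())     # shift the exponent for numerical stability
    Z = w.sum() * dx
    model = (phi(xs) * w).sum(axis=1) * dx / Z
    return model - target

lam = root(moment_residual, x0=np.zeros(2)).x
# For Gaussian data the max-ent solution with this basis is the Gaussian itself,
# so lam should land near [mu/sigma^2, -1/(2 sigma^2)] = [4, -2].
```

With a quadratic exponent the recovered density is exactly the Gaussian family, which makes the result easy to check; richer bases (as explored by the symbolic search in the paper) would require the conditioning safeguards the abstract describes.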
Related papers
- Maximum a Posteriori Estimation for Linear Structural Dynamics Models Using Bayesian Optimization with Rational Polynomial Chaos Expansions (arXiv, 2024-08-07)
  We propose an extension to an existing sparse Bayesian learning approach for MAP estimation. We introduce a Bayesian optimization approach that adaptively enriches the experimental design. By combining the sparsity-inducing learning procedure with the experimental design, we effectively reduce the number of model evaluations.
- Dynamical Measure Transport and Neural PDE Solvers for Sampling (arXiv, 2024-07-10)
  We treat sampling from a probability density as transporting a tractable density function to the target. We employ physics-informed neural networks (PINNs) to approximate the solutions of the corresponding partial differential equations (PDEs). PINNs allow simulation- and discretization-free optimization and can be trained very efficiently.
- Learning Unnormalized Statistical Models via Compositional Optimization (arXiv, 2023-06-13)
  Noise-contrastive estimation (NCE) formulates the objective as the logistic loss of the real data against artificial noise. In this paper, we study a direct approach to optimizing the negative log-likelihood of unnormalized models.
- Monte Carlo Neural PDE Solver for Learning PDEs via Probabilistic Representation (arXiv, 2023-02-10)
  We propose a Monte Carlo PDE solver for training unsupervised neural solvers. We use the PDEs' probabilistic representation, which regards macroscopic phenomena as ensembles of random particles. Our experiments on convection-diffusion, Allen-Cahn, and Navier-Stokes equations demonstrate significant improvements in accuracy and efficiency.
- Statistical Efficiency of Score Matching: The View from Isoperimetry (arXiv, 2022-10-03)
  We show a tight connection between the statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated. We formalize these results both in the sample regime and in the finite regime.
- Probability flow solution of the Fokker-Planck equation (arXiv, 2022-06-09)
  We introduce an alternative scheme based on integrating an ordinary differential equation that describes the flow of probability. Unlike the stochastic dynamics, this equation deterministically pushes samples from the initial density onto samples from the solution at any later time. Our approach builds on recent advances in score-based diffusion for generative modeling.
- A Non-Classical Parameterization for Density Estimation Using Sample Moments (arXiv, 2022-01-13)
  We propose a non-classical parametrization for density estimation using sample moments. The proposed estimator is the first in the literature for which the power moments up to an arbitrary even order exactly match the sample moments.
- Sensing Cox Processes via Posterior Sampling and Positive Bases (arXiv, 2021-10-21)
  We study adaptive sensing of point processes, a widely used model from spatial statistics. We model the intensity function as a sample from a truncated Gaussian process, represented in a specially constructed positive basis. Our adaptive sensing algorithms use Langevin dynamics and are based on posterior sampling (Cox-Thompson) and top-two posterior sampling (Top2) principles.
- Manifold learning-based polynomial chaos expansions for high-dimensional surrogate models (arXiv, 2021-07-21)
  We introduce a manifold learning-based method for uncertainty quantification (UQ). The proposed method achieves highly accurate approximations, which ultimately leads to significant acceleration of UQ tasks.
- Conditional Density Estimation via Weighted Logistic Regressions (arXiv, 2020-10-21)
  We propose a novel parametric conditional density estimation method by showing the connection between the general density and the likelihood function of inhomogeneous point process models. The maximum likelihood estimates can be obtained via weighted logistic regressions, and the computation can be significantly relaxed by combining a block-wise alternating scheme with local case-control sampling.
- Fast approximations in the homogeneous Ising model for use in scene analysis (arXiv, 2017-12-06)
  We provide accurate approximations that make it possible to numerically calculate quantities needed in inference. We show that our approximation formulae are scalable and unfazed by the size of the Markov random field. The practical import of our approximation formulae is illustrated by performing Bayesian inference in a functional magnetic resonance imaging activation detection experiment, and in likelihood ratio testing for anisotropy in the spatial patterns of yearly increases in pistachio tree yields.
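To make the noise-contrastive estimation objective from the "Learning Unnormalized Statistical Models" entry above concrete, here is a minimal sketch. The 1-D Gaussian model, the choice of noise distribution, and the generic optimizer are illustrative assumptions, not details from that paper: an unnormalized model is fitted by logistic classification of real data against samples from a known noise density, with the normalizing constant learned as an extra parameter.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
data = rng.normal(1.0, 1.0, 5000)    # samples from the "unknown" density
noise = rng.normal(0.0, 2.0, 5000)   # samples from a known, tractable noise density

def log_noise(x):
    return norm.logpdf(x, 0.0, 2.0)

def log_model(x, th):
    # unnormalized model: log f(x) = -0.5 * exp(logp) * (x - m)^2 + c,
    # where c plays the role of a learned log-normalizer, as in NCE
    m, logp, c = th
    return -0.5 * np.exp(logp) * (x - m) ** 2 + c

def nce_loss(th):
    # logistic loss of classifying data (label 1) against noise (label 0)
    gd = log_model(data, th) - log_noise(data)
    gn = log_model(noise, th) - log_noise(noise)
    return np.logaddexp(0.0, -gd).mean() + np.logaddexp(0.0, gn).mean()

res = minimize(nce_loss, x0=np.zeros(3))
m_hat, logp_hat, c_hat = res.x
# For N(1, 1) data the estimates should approach m = 1, precision exp(logp) = 1,
# and c near the true log-normalizer -0.5 * log(2*pi).
```

Because the noise density is known in closed form, the learned constant c is identifiable, which is exactly the property that lets NCE sidestep computing the partition function.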
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.