A Metropolis-Adjusted Langevin Algorithm for Sampling Jeffreys Prior
- URL: http://arxiv.org/abs/2504.06372v2
- Date: Tue, 15 Apr 2025 13:25:02 GMT
- Title: A Metropolis-Adjusted Langevin Algorithm for Sampling Jeffreys Prior
- Authors: Yibo Shi, Braghadeesh Lakshminarayanan, Cristian R. Rojas
- Abstract summary: Inference and estimation are fundamental aspects of statistics, system identification and machine learning. Jeffreys prior is an appealing uninformative prior because it offers two important benefits. We propose a general sampling scheme using the Metropolis-Adjusted Langevin Algorithm.
- Score: 5.500172106704342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inference and estimation are fundamental aspects of statistics, system identification and machine learning. For most inference problems, prior knowledge is available on the system to be modeled, and Bayesian analysis is a natural framework to impose such prior information in the form of a prior distribution. However, in many situations, coming up with a fully specified prior distribution is not easy, as prior knowledge might be too vague, so practitioners prefer to use a prior distribution that is as 'ignorant' or 'uninformative' as possible, in the sense of not imposing subjective beliefs, while still supporting reliable statistical analysis. Jeffreys prior is an appealing uninformative prior because it offers two important benefits: (i) it is invariant under any re-parameterization of the model, and (ii) it encodes the intrinsic geometric structure of the parameter space through the Fisher information matrix, which in turn enhances the diversity of parameter samples. Despite these benefits, drawing samples from Jeffreys prior is a challenging task. In this paper, we propose a general sampling scheme using the Metropolis-Adjusted Langevin Algorithm that enables sampling of parameter values from Jeffreys prior, and provide numerical illustrations of our approach through several examples.
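To make the idea in the abstract concrete, here is a minimal sketch of a Metropolis-Adjusted Langevin Algorithm targeting a Jeffreys prior. It is not the authors' implementation: it assumes a toy Bernoulli model, chosen because its Jeffreys prior is known in closed form as Beta(1/2, 1/2), and it samples in the unconstrained logit coordinate phi = log(theta / (1 - theta)), where the transformed density is proportional to sqrt(theta (1 - theta)). The function names (`log_target`, `grad_log_target`, `mala`) and the step size are hypothetical choices made for illustration.

```python
import numpy as np


def log_target(phi):
    # Log Jeffreys density for a Bernoulli model, expressed in phi = logit(theta),
    # up to an additive constant: 0.5*log(theta) + 0.5*log(1 - theta).
    return -0.5 * (np.logaddexp(0.0, -phi) + np.logaddexp(0.0, phi))


def grad_log_target(phi):
    # Derivative of log_target with respect to phi, which simplifies to 0.5 - theta.
    theta = 1.0 / (1.0 + np.exp(-phi))
    return 0.5 - theta


def mala(n_samples, step=4.0, phi0=0.0, seed=0):
    """Run a scalar MALA chain; returns the phi samples and the acceptance rate."""
    rng = np.random.default_rng(seed)
    phi = phi0
    samples = np.empty(n_samples)
    accepted = 0
    for i in range(n_samples):
        # Langevin proposal: gradient drift plus Gaussian noise.
        mean_fwd = phi + 0.5 * step * grad_log_target(phi)
        prop = mean_fwd + np.sqrt(step) * rng.standard_normal()
        # Mean of the reverse proposal, needed for the Metropolis correction.
        mean_bwd = prop + 0.5 * step * grad_log_target(prop)
        log_q_fwd = -((prop - mean_fwd) ** 2) / (2.0 * step)
        log_q_bwd = -((phi - mean_bwd) ** 2) / (2.0 * step)
        log_alpha = log_target(prop) + log_q_bwd - log_target(phi) - log_q_fwd
        if np.log(rng.uniform()) < log_alpha:
            phi = prop
            accepted += 1
        samples[i] = phi
    return samples, accepted / n_samples


if __name__ == "__main__":
    phi_samples, rate = mala(40000, seed=1)
    # Map back to theta in (0, 1) after discarding a burn-in period.
    theta = 1.0 / (1.0 + np.exp(-phi_samples[2000:]))
    print("acceptance rate:", rate)
    print("mean, variance of theta:", theta.mean(), theta.var())
```

For this toy target the sample mean and variance of theta can be compared against the Beta(1/2, 1/2) values of 0.5 and 0.125. In models where the Fisher information has no closed form, the log-density and its gradient would have to be computed or approximated by other means, which this sketch does not address.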
Related papers
- BAPE: Learning an Explicit Bayes Classifier for Long-tailed Visual Recognition [78.70453964041718]
Current deep learning algorithms usually solve for the optimal classifier by implicitly estimating the posterior probabilities. This simple methodology has been proven effective for meticulously balanced academic benchmark datasets. However, it is not applicable to the long-tailed data distributions in the real world. This paper presents a novel approach (BAPE) that provides a more precise theoretical estimation of the data distributions.
arXiv Detail & Related papers (2025-06-29T15:12:50Z) - Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors [14.453654853392619]
We establish in-expectation and tail bounds on the generalization error of representation learning type algorithms.
We propose a systematic approach to simultaneously learning a data-dependent Gaussian mixture prior and using it as a regularizer.
arXiv Detail & Related papers (2025-02-21T15:43:31Z) - An Iterative Bayesian Approach for System Identification based on Linear Gaussian Models [70.75865595489691]
We tackle the problem of system identification, where we select inputs, observe the corresponding outputs from the true system, and optimize the parameters of our model to best fit the data. Our approach only requires input-output data from the system and first-order information of the model with respect to the parameters.
arXiv Detail & Related papers (2025-01-28T01:57:51Z) - Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP).
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z) - Tackling the Problem of Distributional Shifts: Correcting Misspecified, High-Dimensional Data-Driven Priors for Inverse Problems [39.58317527488534]
In astrophysical applications, it is often difficult or even impossible to acquire independent and identically distributed samples from the underlying data-generating process of interest. We propose addressing this issue by iteratively updating the population-level distributions by retraining the model with posterior samples from different sets of observations. We show that, starting from a misspecified prior distribution, the updated distribution becomes progressively closer to the underlying population-level distribution.
arXiv Detail & Related papers (2024-07-24T22:39:27Z) - Neural information field filter [0.0]
We introduce the neural information field filter, a Bayesian state and parameter estimation method for high-dimensional nonlinear dynamical systems. We parameterize the time evolution state path using the span of a finite linear basis. Designing an expressive yet simple linear basis before knowing the true state path is crucial for inference accuracy but challenging.
arXiv Detail & Related papers (2024-07-23T14:18:26Z) - Informed Spectral Normalized Gaussian Processes for Trajectory Prediction [0.0]
We propose a novel regularization-based continual learning method for SNGPs.
Our proposal builds upon well-established methods and requires no rehearsal memory or parameter expansion.
We apply our informed SNGP model to the trajectory prediction problem in autonomous driving by integrating prior drivability knowledge.
arXiv Detail & Related papers (2024-03-18T17:05:24Z) - Variational autoencoder with weighted samples for high-dimensional non-parametric adaptive importance sampling [0.0]
We extend the existing framework to the case of weighted samples by introducing a new objective function.
In order to add flexibility to the model and to be able to learn multimodal distributions, we consider a learnable prior distribution.
We exploit the proposed procedure in existing adaptive importance sampling algorithms to draw points from a target distribution and to estimate a rare event probability in high dimension.
arXiv Detail & Related papers (2023-10-13T15:40:55Z) - Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network [59.79008107609297]
We propose in this paper to approximate the joint posterior over the structure of a Bayesian Network.
We use a single GFlowNet whose sampling policy follows a two-phase process.
Since the parameters are included in the posterior distribution, this leaves more flexibility for the local probability models.
arXiv Detail & Related papers (2023-05-30T19:16:44Z) - The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models [56.31310344616837]
Thompson sampling (TS) has been known for its outstanding empirical performance supported by theoretical guarantees across various reward models.
This study explores the impact of selecting noninformative priors, offering insights into the performance of TS when dealing with new models that lack theoretical understanding.
arXiv Detail & Related papers (2023-02-28T08:42:42Z) - Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z) - Far from Asymptopia [0.0]
Inference from limited data requires a notion of measure on parameter space, most explicit in the Bayesian framework as a prior.
Here we demonstrate that Jeffreys prior, the best-known uninformative choice, introduces enormous bias when applied to typical scientific models.
We present results on a principled choice of measure which avoids this issue, leading to unbiased inference in complex models.
arXiv Detail & Related papers (2022-05-06T16:23:12Z) - Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning in large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z) - Variational Inference with NoFAS: Normalizing Flow with Adaptive Surrogate for Computationally Expensive Models [7.217783736464403]
Use of sampling-based approaches such as Markov chain Monte Carlo may become intractable when each likelihood evaluation is computationally expensive.
New approaches combining variational inference with normalizing flow are characterized by a computational cost that grows only linearly with the dimensionality of the latent variable space.
We propose Normalizing Flow with Adaptive Surrogate (NoFAS), an optimization strategy that alternatively updates the normalizing flow parameters and the weights of a neural network surrogate model.
arXiv Detail & Related papers (2021-08-28T14:31:45Z) - Sparse Bayesian Deep Learning for Dynamic System Identification [14.040914364617418]
This paper proposes a sparse Bayesian treatment of deep neural networks (DNNs) for system identification.
The proposed Bayesian approach offers a principled way to alleviate the challenges by marginal likelihood/model evidence approximation.
The effectiveness of the proposed Bayesian approach is demonstrated on several linear and nonlinear systems identification benchmarks.
arXiv Detail & Related papers (2021-07-27T16:09:48Z) - System identification using Bayesian neural networks with nonparametric noise models [0.0]
We propose a nonparametric approach for system identification in discrete time nonlinear random dynamical systems.
A Gibbs sampler for posterior inference is proposed and its effectiveness is illustrated in simulated and real time series.
arXiv Detail & Related papers (2021-04-25T09:49:50Z) - Multivariate Deep Evidential Regression [77.34726150561087]
A new approach with uncertainty-aware neural networks shows promise over traditional deterministic methods.
We discuss three issues with a proposed solution to extract aleatoric and epistemic uncertainties from regression-based neural networks.
arXiv Detail & Related papers (2021-04-13T12:20:18Z) - Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in the presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
arXiv Detail & Related papers (2021-03-01T21:14:33Z) - Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z) - Variational Nonlinear System Identification [0.8793721044482611]
This paper considers parameter estimation for nonlinear state-space models, which is an important but challenging problem.
We employ a variational inference (VI) approach, which is a principled method that has deep connections to maximum likelihood estimation.
This VI approach ultimately provides estimates of the model as solutions to an optimisation problem, which is deterministic, tractable and can be solved using standard optimisation tools.
arXiv Detail & Related papers (2020-12-08T05:43:50Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.