Bayesian inference via sparse Hamiltonian flows
- URL: http://arxiv.org/abs/2203.05723v1
- Date: Fri, 11 Mar 2022 02:36:59 GMT
- Title: Bayesian inference via sparse Hamiltonian flows
- Authors: Naitong Chen, Zuheng Xu, Trevor Campbell
- Abstract summary: A Bayesian coreset is a small, weighted subset of data that replaces the full dataset during Bayesian inference.
Current methods tend to be slow, require a secondary inference step after coreset construction, and do not provide bounds on the data marginal evidence.
We introduce a new method -- sparse Hamiltonian flows -- that addresses all three of these challenges.
- Score: 16.393322369105864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   A Bayesian coreset is a small, weighted subset of data that replaces the full
dataset during Bayesian inference, with the goal of reducing computational
cost. Although past work has shown empirically that there often exists a
coreset with low inferential error, efficiently constructing such a coreset
remains a challenge. Current methods tend to be slow, require a secondary
inference step after coreset construction, and do not provide bounds on the
data marginal evidence. In this work, we introduce a new method -- sparse
Hamiltonian flows -- that addresses all three of these challenges. The method
involves first subsampling the data uniformly, and then optimizing a
Hamiltonian flow parametrized by coreset weights and including periodic
momentum quasi-refreshment steps. Theoretical results show that the method
enables an exponential compression of the dataset in a representative model,
and that the quasi-refreshment steps reduce the KL divergence to the target.
Real and synthetic experiments demonstrate that sparse Hamiltonian flows
provide accurate posterior approximations with significantly reduced runtime
compared with competing dynamical-system-based inference methods.
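
The pipeline described in the abstract (uniform subsampling, then a Hamiltonian flow driven by a weighted coreset posterior with periodic momentum quasi-refreshment) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation and rests on several assumptions: the model is a toy Gaussian location model (prior theta ~ N(0, I), likelihood x ~ N(theta, I)), the coreset weights are fixed at N/M rather than optimized, the step size and step count are hand-picked for roughly N = 1000 data points, and the quasi-refreshment is a stand-in that deterministically standardizes the momentum across the particle batch. All function names are hypothetical.

```python
import numpy as np

def coreset_grad_log_post(theta, core_x, weights):
    """Gradient of the weighted (coreset) log-posterior for the toy model
    prior theta ~ N(0, I), likelihood x_m ~ N(theta, I) (an assumption)."""
    # theta: (S, D) particles; core_x: (M, D) coreset points; weights: (M,)
    return -theta + weights @ core_x - weights.sum() * theta

def leapfrog(theta, rho, core_x, weights, step, n_steps):
    """Standard leapfrog integration of Hamiltonian dynamics that target
    the coreset posterior; only the M coreset points are touched."""
    for _ in range(n_steps):
        rho = rho + 0.5 * step * coreset_grad_log_post(theta, core_x, weights)
        theta = theta + step * rho
        rho = rho + 0.5 * step * coreset_grad_log_post(theta, core_x, weights)
    return theta, rho

def quasi_refresh(rho):
    """Illustrative stand-in for a quasi-refreshment step: standardize each
    momentum coordinate across the particle batch (no fresh randomness)."""
    return (rho - rho.mean(axis=0)) / rho.std(axis=0)

def sparse_flow(data, coreset_size, n_particles=2000, n_refresh=5,
                step=0.005, n_steps=10, seed=0):
    """Subsample uniformly, then push particles through alternating
    leapfrog blocks and quasi-refreshment steps."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    # Step 1: uniform subsample of the data, without replacement.
    idx = rng.choice(n, size=coreset_size, replace=False)
    core_x = data[idx]
    # Fixed weights N/M (assumption; the method optimizes these instead).
    weights = np.full(coreset_size, n / coreset_size)
    # Step 2: start from a standard normal reference in position and momentum.
    theta = rng.standard_normal((n_particles, d))
    rho = rng.standard_normal((n_particles, d))
    for _ in range(n_refresh):
        theta, rho = leapfrog(theta, rho, core_x, weights, step, n_steps)
        rho = quasi_refresh(rho)
    return theta, weights, idx
```

Calling `sparse_flow(data, coreset_size=20)` on an `(N, D)` array returns the final particles together with the coreset weights and indices. In this toy model the coreset posterior is Gaussian, so the particle mean can be compared directly with `weights @ data[idx] / (1 + weights.sum())`. The default `step` is only stable when `step * sqrt(1 + N)` stays well below 2, so it would need retuning for other dataset sizes.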
 
      
Related papers
- Symmetric Rank-One Quasi-Newton Methods for Deep Learning Using Cubic Regularization [0.5120567378386615]
 First-order descent and other first-order variants, such as Adam and AdaGrad, are commonly used in the field of deep learning.
However, these methods do not exploit curvature information.
Quasi-Newton methods re-use previously computed low Hessian approximations.
 arXiv  Detail & Related papers  (2025-02-17T20:20:11Z)
- Straightness of Rectified Flow: A Theoretical Insight into Wasserstein Convergence [54.580605276017096]
 Diffusion models have emerged as a powerful tool for image generation and denoising.
Recently, Liu et al. designed a novel alternative generative model, Rectified Flow (RF).
RF aims to learn straight flow trajectories from noise to data using a sequence of convex optimization problems.
 arXiv  Detail & Related papers  (2024-10-19T02:36:11Z)
- Conditional Lagrangian Wasserstein Flow for Time Series Imputation [3.914746375834628]
 We propose a novel method for time series imputation called Conditional Lagrangian Wasserstein Flow.
The proposed method leverages the (conditional) optimal transport theory to learn the probability flow in a simulation-free manner.
The experimental results on real-world datasets show that the proposed method achieves competitive performance on time series imputation.
 arXiv  Detail & Related papers  (2024-10-10T02:46:28Z)
- Bayesian Circular Regression with von Mises Quasi-Processes [57.88921637944379]
 In this work we explore a family of expressive and interpretable distributions over circle-valued random functions.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Gibbs sampling.
We present experiments applying this model to the prediction of wind directions and the percentage of the running gait cycle as a function of joint angles.
 arXiv  Detail & Related papers  (2024-06-19T01:57:21Z)
- An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning [55.467047686093025]
 A common approach to alleviate such forgetting is to rehearse samples from prior tasks during fine-tuning.
We propose a sampling scheme, mix-cd, that prioritizes rehearsal of "collateral damage" samples.
Our approach is computationally efficient, easy to implement, and outperforms several leading continual learning methods in compute-constrained settings.
 arXiv  Detail & Related papers  (2024-02-12T22:32:12Z)
- A Metalearned Neural Circuit for Nonparametric Bayesian Inference [4.767884267554628]
 Most applications of machine learning to classification assume a closed set of balanced classes.
This is at odds with the real world, where class occurrence statistics often follow a long-tailed power-law distribution.
We present a method for extracting the inductive bias from a nonparametric Bayesian model and transferring it to an artificial neural network.
 arXiv  Detail & Related papers  (2023-11-24T16:43:17Z)
- Bayesian Pseudo-Coresets via Contrastive Divergence [5.479797073162603]
 We introduce a novel approach for constructing pseudo-coresets by utilizing contrastive divergence.
It eliminates the need for approximations in the pseudo-coreset construction process.
We conduct extensive experiments on multiple datasets, demonstrating its superiority over existing BPC techniques.
 arXiv  Detail & Related papers  (2023-03-20T17:13:50Z)
- Decomposed Diffusion Sampler for Accelerating Large-Scale Inverse Problems [64.29491112653905]
 We propose a novel and efficient diffusion sampling strategy that synergistically combines the diffusion sampling and Krylov subspace methods.
Specifically, we prove that if the tangent space at a sample denoised by Tweedie's formula forms a Krylov subspace, then CG with the denoised data ensures that the data consistency update remains in the tangent space.
Our proposed method achieves more than 80 times faster inference time than the previous state-of-the-art method.
 arXiv  Detail & Related papers  (2023-03-10T07:42:49Z)
- Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
 We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
 arXiv  Detail & Related papers  (2022-11-30T05:33:29Z)
- On Divergence Measures for Bayesian Pseudocoresets [28.840995981326028]
 A Bayesian pseudocoreset is a small synthetic dataset for which the posterior over parameters approximates that of the original dataset.
This paper casts two representative dataset distillation algorithms as approximations to methods for constructing pseudocoresets.
We provide a unifying view of such divergence measures in Bayesian pseudocoreset construction.
 arXiv  Detail & Related papers  (2022-10-12T13:45:36Z)
- Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence [69.65563161962245]
 We consider minimizing a smooth and strongly convex objective function using a Newton method.
We show that there exists a universal weighted averaging scheme that transitions to local convergence at an optimal stage.
 arXiv  Detail & Related papers  (2022-04-20T07:14:21Z)
- Deep Equilibrium Optical Flow Estimation [80.80992684796566]
 Recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms.
These RNNs impose large computation and memory overheads, and are not directly trained to model such stable estimation.
We propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer.
 arXiv  Detail & Related papers  (2022-04-18T17:53:44Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
 In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
 arXiv  Detail & Related papers  (2020-06-04T21:51:21Z)
- Nonparametric Bayesian volatility learning under microstructure noise [2.812395851874055]
 We study the problem of learning the volatility under market microstructure noise.
Specifically, we consider noisy discrete time observations from a differential equation.
We develop a novel computational method to learn the diffusion coefficient of the equation.
 arXiv  Detail & Related papers  (2018-05-15T07:32:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     