Targeted stochastic gradient Markov chain Monte Carlo for hidden Markov models with rare latent states
- URL: http://arxiv.org/abs/1810.13431v3
- Date: Thu, 25 Jul 2024 10:21:32 GMT
- Title: Targeted stochastic gradient Markov chain Monte Carlo for hidden Markov models with rare latent states
- Authors: Rihui Ou, Deborshee Sen, Alexander L Young, David B Dunson
- Abstract summary: Markov chain Monte Carlo (MCMC) algorithms for hidden Markov models often rely on the forward-backward sampler.
This makes them computationally slow as the length of the time series increases, motivating the development of sub-sampling-based approaches.
We propose a targeted sub-sampling approach that over-samples observations corresponding to rare latent states when calculating the gradient of parameters associated with them.
- Score: 48.705095800341944
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Markov chain Monte Carlo (MCMC) algorithms for hidden Markov models often rely on the forward-backward sampler. This makes them computationally slow as the length of the time series increases, motivating the development of sub-sampling-based approaches. These approximate the full posterior by using small random subsequences of the data at each MCMC iteration within stochastic gradient MCMC. In the presence of imbalanced data resulting from rare latent states, subsequences often exclude rare latent state data, leading to inaccurate inference and prediction/detection of rare events. We propose a targeted sub-sampling (TASS) approach that over-samples observations corresponding to rare latent states when calculating the stochastic gradient of parameters associated with them. TASS uses an initial clustering of the data to construct subsequence weights that reduce the variance in gradient estimation. This leads to improved sampling efficiency, in particular in settings where the rare latent states correspond to extreme observations. We demonstrate substantial gains in predictive and inferential accuracy on real and synthetic examples.
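The variance-reduction idea in the abstract can be illustrated with a small sketch. This is not the authors' TASS implementation: it ignores the HMM dependence structure and the forward-backward step, uses a simple threshold as a stand-in for the initial clustering, and uses a toy Gaussian gradient for the rare-state mean. The names `make_subsequence_probs`, `estimator_variance`, `boost`, and `mu_rare` are hypothetical. It only shows that sampling subsequences with non-uniform probabilities, and dividing each sampled gradient by its sampling probability, keeps the estimator unbiased while lowering its variance when the rare-state subsequences are over-sampled.

```python
import numpy as np

def make_subsequence_probs(labels, length, rare_label, boost):
    """Sampling probability for each non-overlapping subsequence.

    Subsequences containing at least one point from the rare cluster get
    `boost` times the raw weight of the others (an illustrative choice).
    """
    n_sub = len(labels) // length
    blocks = labels[: n_sub * length].reshape(n_sub, length)
    has_rare = (blocks == rare_label).any(axis=1)
    raw = np.where(has_rare, boost, 1.0)
    return raw / raw.sum()

def estimator_variance(g, probs):
    """Exact variance of the single-draw estimator g[i] / probs[i], i ~ probs."""
    return np.sum(g ** 2 / probs) - g.sum() ** 2

rng = np.random.default_rng(0)
n, length = 10_000, 50
n_sub = n // length

# Toy series: a common state near 0 and a rare state near 8 (about 1% of points).
is_rare = rng.random(n) < 0.01
y = np.where(is_rare, rng.normal(8.0, 1.0, n), rng.normal(0.0, 1.0, n))

# Stand-in for the initial clustering: threshold the observations.
labels = (y > 4.0).astype(int)

# Toy per-subsequence gradient of a unit-variance Gaussian log-likelihood
# with respect to the rare-state mean, evaluated at mu_rare (assumed value).
mu_rare = 7.0
contrib = np.where(labels == 1, y - mu_rare, 0.0)
g = contrib.reshape(n_sub, length).sum(axis=1)

uniform = np.full(n_sub, 1.0 / n_sub)
targeted = make_subsequence_probs(labels, length, rare_label=1, boost=10.0)

var_uniform = estimator_variance(g, uniform)
var_targeted = estimator_variance(g, targeted)

# Monte Carlo version of the importance-weighted estimator of g.sum().
idx = rng.choice(n_sub, size=2000, p=targeted)
estimate = np.mean(g[idx] / targeted[idx])

print(f"full gradient:      {g.sum():.2f}")
print(f"weighted estimate:  {estimate:.2f}")
print(f"variance (uniform): {var_uniform:.1f}")
print(f"variance (targeted):{var_targeted:.1f}")
```

In this toy setting the gradient is zero on subsequences without rare-cluster points, so shifting sampling probability toward the remaining subsequences always lowers the variance. In a real HMM the gradient terms overlap across states, and the weights would need to be chosen more carefully.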
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Stochastic Localization via Iterative Posterior Sampling [2.1383136715042417]
We consider a general localization framework and introduce an explicit class of observation processes, associated with flexible denoising schedules.
We provide a complete methodology, Stochastic Localization via Iterative Posterior Sampling (SLIPS), to obtain approximate samples of these dynamics and, as a byproduct, samples from the target distribution.
We illustrate the benefits and applicability of SLIPS on several benchmarks of multi-modal distributions, including mixtures in increasing dimensions, logistic regression, and a high-dimensional field system from statistical mechanics.
arXiv Detail & Related papers (2024-02-16T15:28:41Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Imputing Missing Observations with Time Sliced Synthetic Minority Oversampling Technique [0.3973560285628012]
We present a simple yet novel time series imputation technique with the goal of constructing an irregular time series that is uniform across every sample in a data set.
We fix a grid defined by the midpoints of non-overlapping bins (dubbed "slices") of observation times and ensure that each sample has values for all of the features at that given time.
This allows one to both impute fully missing observations to allow uniform time series classification across the entire data and, in special cases, to impute individually missing features.
arXiv Detail & Related papers (2022-01-14T19:23:24Z) - A fast asynchronous MCMC sampler for sparse Bayesian inference [10.535140830570256]
We propose a very fast approximate Markov Chain Monte Carlo (MCMC) sampling framework that is applicable to a large class of sparse Bayesian inference problems.
We show that in high-dimensional linear regression problems, the Markov chain generated by the proposed algorithm admits an invariant distribution that correctly recovers the main signal.
arXiv Detail & Related papers (2021-08-14T02:20:49Z) - Partially Hidden Markov Chain Linear Autoregressive model: inference and forecasting [0.0]
Time series subject to changes in regime have attracted much interest in domains such as econometrics, finance, and meteorology.
We present a novel model which addresses the intermediate case: (i) state processes associated with such time series are modelled by Partially Hidden Markov Chains (PHMCs).
We propose a hidden state inference procedure and a forecasting function that take into account the observed states when existing.
arXiv Detail & Related papers (2021-02-24T22:12:05Z) - Finite-Time Analysis of Stochastic Gradient Descent under Markov Randomness [27.027583559295365]
Stochastic gradient descent (SGD) is used in reinforcement learning and machine learning.
We show that SGD converges nearly at the same rate with Markovian gradient samples as with independent gradient samples.
arXiv Detail & Related papers (2020-03-24T17:06:40Z) - Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z) - Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via Non-uniform Subsampling of Gradients [54.90670513852325]
We propose a non-uniform subsampling scheme to improve the sampling accuracy.
EWSG is designed so that a non-uniform gradient-MCMC method mimics the statistical behavior of a batch-gradient-MCMC method.
In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index.
arXiv Detail & Related papers (2020-02-20T18:56:18Z)