Change-point Detection and Segmentation of Discrete Data using Bayesian
Context Trees
- URL: http://arxiv.org/abs/2203.04341v1
- Date: Tue, 8 Mar 2022 19:03:21 GMT
- Authors: Valentinian Lungu, Ioannis Papageorgiou, Ioannis Kontoyiannis
- Abstract summary: Building on the recently introduced Bayesian Context Trees (BCT) framework, the distributions of different segments in a discrete time series are described as variable-memory Markov chains.
Inference for the presence and location of change-points is then performed via Markov chain Monte Carlo sampling.
Results on both simulated and real-world data indicate that the proposed methodology performs better than or as well as state-of-the-art techniques.
- Score: 7.090165638014331
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A new Bayesian modelling framework is introduced for piece-wise homogeneous
variable-memory Markov chains, along with a collection of effective algorithmic
tools for change-point detection and segmentation of discrete time series.
Building on the recently introduced Bayesian Context Trees (BCT) framework, the
distributions of different segments in a discrete time series are described as
variable-memory Markov chains. Inference for the presence and location of
change-points is then performed via Markov chain Monte Carlo sampling. The key
observation that facilitates effective sampling is that, using one of the BCT
algorithms, the prior predictive likelihood of the data can be computed
exactly, integrating out all the models and parameters in each segment. This
makes it possible to sample directly from the posterior distribution of the
number and location of the change-points, leading to accurate estimates and
providing a natural quantitative measure of uncertainty in the results.
Estimates of the actual model in each segment can also be obtained, at
essentially no additional computational cost. Results on both simulated and
real-world data indicate that the proposed methodology performs better than or
as well as state-of-the-art techniques.
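The sampling idea in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the BCT algorithm: it assumes a single change-point, a uniform prior on its location, and memory-zero (i.i.d.) segments whose symbol probabilities are integrated out under a Dirichlet(1/2, ..., 1/2) prior. In the paper, that closed-form marginal is replaced by the prior predictive likelihood computed by one of the BCT algorithms. All function names and parameters here (`log_marginal`, `log_post`, `sample_change_point`, `step`) are hypothetical.

```python
import math
import random
from collections import Counter


def log_marginal(segment, m, a=0.5):
    """Log prior predictive likelihood of an i.i.d. segment over m symbols,
    with the symbol probabilities integrated out under Dirichlet(a, ..., a).
    Assumption: a memory-zero stand-in for the per-segment BCT computation."""
    n = len(segment)
    out = math.lgamma(m * a) - math.lgamma(n + m * a)
    for c in Counter(segment).values():
        out += math.lgamma(c + a) - math.lgamma(a)
    return out


def log_post(x, tau, m):
    """Unnormalised log posterior of a change-point at tau (uniform prior):
    the two segments x[:tau] and x[tau:] are scored independently."""
    return log_marginal(x[:tau], m) + log_marginal(x[tau:], m)


def sample_change_point(x, m, n_iter=5000, step=10, seed=0):
    """Random-walk Metropolis-Hastings over the change-point location."""
    rng = random.Random(seed)
    n = len(x)
    tau = n // 2
    cur = log_post(x, tau, m)
    samples = []
    for _ in range(n_iter):
        prop = tau + rng.randint(-step, step)  # symmetric proposal
        if 1 <= prop <= n - 1:  # out-of-range proposals are rejected
            new = log_post(x, prop, m)
            if rng.random() < math.exp(min(0.0, new - cur)):
                tau, cur = prop, new
        samples.append(tau)
    return samples


# Toy binary sequence whose distribution changes at position 200.
x = [0] * 200 + [1] * 100
samples = sample_change_point(x, m=2)
estimate = Counter(samples).most_common(1)[0][0]  # posterior mode of the samples
```

Because the segment likelihoods are available exactly, the chain only has to explore change-point locations, and the spread of `samples` gives a direct measure of uncertainty about the location.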
Related papers
- Amortised Inference in Neural Networks for Small-Scale Probabilistic
Meta-Learning [41.85464593920907]
A global inducing point variational approximation for BNNs is based on using a set of inducing inputs to construct a series of conditional distributions.
Our key insight is that these inducing inputs can be replaced by the actual data, such that the variational distribution consists of a set of approximate likelihoods for each datapoint.
By training this inference network across related datasets, we can meta-learn Bayesian inference over task-specific BNNs.
arXiv Detail & Related papers (2023-10-24T12:34:25Z)
- Delta-AI: Local objectives for amortized inference in sparse graphical models [64.5938437823851]
We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs).
Our approach is based on the observation that when the sampling of variables in a PGM is seen as a sequence of actions taken by an agent, sparsity of the PGM enables local credit assignment in the agent's policy learning objective.
We illustrate $\Delta$-AI's effectiveness for sampling from synthetic PGMs and training latent variable models with sparse factor structure.
arXiv Detail & Related papers (2023-10-03T20:37:03Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z)
- SwISS: A Scalable Markov chain Monte Carlo Divide-and-Conquer Strategy [1.6114012813668934]
Divide-and-conquer strategies for Monte Carlo algorithms are an increasingly popular approach to making Bayesian inference scalable to large data sets.
We propose SwISS: Sub-posteriors with Inflation, Scaling and Shifting; a new approach for recombining the sub-posterior samples.
We prove that our transformation is optimal across a natural set of affine transformations and illustrate the efficacy of SwISS against competing algorithms on synthetic and real-world data sets.
arXiv Detail & Related papers (2022-08-08T12:02:18Z)
- Bayesian Structure Learning with Generative Flow Networks [85.84396514570373]
In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) from data.
Recently, a class of probabilistic models called Generative Flow Networks (GFlowNets) has been introduced as a general framework for generative modeling.
We show that our approach, called DAG-GFlowNet, provides an accurate approximation of the posterior over DAGs.
arXiv Detail & Related papers (2022-02-28T15:53:10Z)
- Unsupervised Change Detection using DRE-CUSUM [14.73895038690252]
DRE-CUSUM is an unsupervised density-ratio estimation (DRE) based approach to determine statistical changes in time-series data.
We present a theoretical justification as well as accuracy guarantees which show that the proposed statistic can reliably detect statistical changes.
We experimentally show the superiority of DRE-CUSUM using both synthetic and real-world datasets over existing state-of-the-art unsupervised algorithms.
arXiv Detail & Related papers (2022-01-27T17:25:42Z)
- An Embedded Model Estimator for Non-Stationary Random Functions using
Multiple Secondary Variables [0.0]
This paper introduces the method and shows that it has consistency results that are similar in nature to those applying to geostatistical modelling and to Quantile Random Forests.
The algorithm works by estimating a conditional distribution for the target variable at each target location.
arXiv Detail & Related papers (2020-11-09T00:14:24Z)
- Kernel learning approaches for summarising and combining posterior
similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
- Feature Transformation Ensemble Model with Batch Spectral Regularization
for Cross-Domain Few-Shot Classification [66.91839845347604]
We propose an ensemble prediction model by performing diverse feature transformations after a feature extraction network.
We use a batch spectral regularization term to suppress the singular values of the feature matrix during pre-training to improve the generalization ability of the model.
The proposed model can then be fine tuned in the target domain to address few-shot classification.
arXiv Detail & Related papers (2020-05-18T05:31:04Z)
- Generalization of Change-Point Detection in Time Series Data Based on
Direct Density Ratio Estimation [1.929039244357139]
We show how existing algorithms can be generalized using various binary classification and regression models.
The algorithms are tested on several synthetic and real-world datasets.
arXiv Detail & Related papers (2020-01-17T15:45:38Z)