An Evaluation of Real-time Adaptive Sampling Change Point Detection Algorithm using KCUSUM
- URL: http://arxiv.org/abs/2402.10291v2
- Date: Fri, 5 Apr 2024 03:09:07 GMT
- Title: An Evaluation of Real-time Adaptive Sampling Change Point Detection Algorithm using KCUSUM
- Authors: Vijayalakshmi Saravanan, Perry Siehien, Shinjae Yoo, Hubertus Van Dam, Thomas Flynn, Christopher Kelly, Khaled Z Ibrahim,
- Abstract summary: We introduce the Kernel-based Cumulative Sum (KCUSUM) algorithm, a non-parametric extension of the traditional Cumulative Sum (CUSUM) method.
KCUSUM splits itself by comparing incoming samples directly with reference samples and computes a statistic grounded in the Maximum Mean Discrepancy (MMD) non-parametric framework.
We discuss real-world use cases from scientific simulations such as NWChem CODAR and protein folding data, demonstrating KCUSUM's practical effectiveness in online change point detection.
- Score: 4.610597418629838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting abrupt changes in real-time data streams from scientific simulations presents a challenging task, demanding the deployment of accurate and efficient algorithms. Identifying change points in live data stream involves continuous scrutiny of incoming observations for deviations in their statistical characteristics, particularly in high-volume data scenarios. Maintaining a balance between sudden change detection and minimizing false alarms is vital. Many existing algorithms for this purpose rely on known probability distributions, limiting their feasibility. In this study, we introduce the Kernel-based Cumulative Sum (KCUSUM) algorithm, a non-parametric extension of the traditional Cumulative Sum (CUSUM) method, which has gained prominence for its efficacy in online change point detection under less restrictive conditions. KCUSUM splits itself by comparing incoming samples directly with reference samples and computes a statistic grounded in the Maximum Mean Discrepancy (MMD) non-parametric framework. This approach extends KCUSUM's pertinence to scenarios where only reference samples are available, such as atomic trajectories of proteins in vacuum, facilitating the detection of deviations from the reference sample without prior knowledge of the data's underlying distribution. Furthermore, by harnessing MMD's inherent random-walk structure, we can theoretically analyze KCUSUM's performance across various use cases, including metrics like expected delay and mean runtime to false alarms. Finally, we discuss real-world use cases from scientific simulations such as NWChem CODAR and protein folding data, demonstrating KCUSUM's practical effectiveness in online change point detection.
Related papers
- Asymptotically Optimal Change Detection for Unnormalized Pre- and Post-Change Distributions [65.38208224389027]
This paper addresses the problem of detecting changes when only unnormalized pre- and post-change distributions are accessible.
Our approach is based on the estimation of the Cumulative Sum statistics, which is known to produce optimal performance.
arXiv Detail & Related papers (2024-10-18T17:13:29Z) - Reproduction of scan B-statistic for kernel change-point detection algorithm [10.49860279555873]
Change-point detection has garnered significant attention due to its broad range of applications.
In this paper, we reproduce a recently proposed online change-point detection algorithm based on an efficient kernel-based scan B-statistic.
Our numerical experiments demonstrate that the scan B-statistic consistently delivers superior performance.
arXiv Detail & Related papers (2024-08-23T15:12:31Z) - Partially-Observable Sequential Change-Point Detection for Autocorrelated Data via Upper Confidence Region [12.645304808491309]
We propose a detection scheme called adaptive upper confidence region with state space model (AUCRSS) for sequential change point detection.
A partially-observable Kalman filter algorithm is developed for online inference of SSM, and accordingly, a change point detection scheme based on a generalized likelihood ratio test is analyzed.
arXiv Detail & Related papers (2024-03-30T02:32:53Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Neural network-based CUSUM for online change-point detection [17.098858682219866]
We introduce a neural network CUSUM (NN-CUSUM) for online change-point detection.
We present a general theoretical condition when the trained neural networks can perform change-point detection.
The strong performance of NN-CUSUM is demonstrated in detecting change-point in high-dimensional data.
arXiv Detail & Related papers (2022-10-31T16:47:11Z) - Unsupervised Change Detection using DRE-CUSUM [14.73895038690252]
DRE-CUSUM is an unsupervised density-ratio estimation (DRE) based approach to determine statistical changes in time-series data.
We present a theoretical justification as well as accuracy guarantees which show that the proposed statistic can reliably detect statistical changes.
We experimentally show the superiority of DRE-CUSUM using both synthetic and real-world datasets over existing state-of-the-art unsupervised algorithms.
arXiv Detail & Related papers (2022-01-27T17:25:42Z) - Convolutional generative adversarial imputation networks for
spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GANs) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method as Con Conval Generative Adversarial Imputation Nets (Conv-GAIN)
arXiv Detail & Related papers (2021-11-03T03:50:48Z) - Partially Observable Online Change Detection via Smooth-Sparse
Decomposition [16.8028358824706]
We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities.
On the one hand, the detection scheme should be able to deal with partially observable data and meanwhile have efficient detection power for sparse changes.
In this paper, we propose a novel detection scheme called CDSSD. In particular, it describes the structure of high dimensional data with sparse changes by smooth-sparse decomposition.
arXiv Detail & Related papers (2020-09-22T16:03:04Z) - Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z) - Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on interplay between the deterministic convergence rate of the algorithm at the population level, and its degree of (instability) when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z) - A Robust Functional EM Algorithm for Incomplete Panel Count Data [66.07942227228014]
We propose a functional EM algorithm to estimate the counting process mean function under a missing completely at random assumption (MCAR)
The proposed algorithm wraps several popular panel count inference methods, seamlessly deals with incomplete counts and is robust to misspecification of the Poisson process assumption.
We illustrate the utility of the proposed algorithm through numerical experiments and an analysis of smoking cessation data.
arXiv Detail & Related papers (2020-03-02T20:04:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.