Fast likelihood-based change point detection
- URL: http://arxiv.org/abs/2301.08892v1
- Date: Sat, 21 Jan 2023 05:13:35 GMT
- Title: Fast likelihood-based change point detection
- Authors: Nikolaj Tatti
- Abstract summary: We use a likelihood ratio between two models as a measure for indicating change.
Finding the optimal split can be done in $O(n)$ time, where $n$ is the number of entries since the last change point.
We propose an approximation scheme that yields a $(1 - \epsilon)$ approximation in $O(\epsilon^{-1} \log^2 n)$ time.
- Score: 0.949290364986276
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Change point detection plays a fundamental role in many real-world
applications, where the goal is to analyze and monitor the behaviour of a data
stream. In this paper, we study change detection in binary streams. To this
end, we use a likelihood ratio between two models as a measure for indicating
change. The first model is a single Bernoulli variable, while the second model
divides the stored data into two segments and models each segment with its own
Bernoulli variable. Finding the optimal split can be done in $O(n)$ time, where
$n$ is the number of entries since the last change point. This is too expensive
for large $n$. To combat this we propose an approximation scheme that yields
$(1 - \epsilon)$ approximation in $O(\epsilon^{-1} \log^2 n)$ time. The
speed-up consists of several steps: First we reduce the number of possible
candidates by adopting a known result from segmentation problems. We then show
that for fixed Bernoulli parameters we can find the optimal change point in
logarithmic time. Finally, we show how to construct a candidate list of size
$O(\epsilon^{-1} \log n)$ for model parameters. We demonstrate empirically the
approximation quality and the running time of our algorithm, showing that we
can gain a significant speed-up with a minimal average loss in optimality.
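To make the exact baseline concrete, below is a minimal Python sketch of the $O(n)$ scan the abstract describes: it compares the maximized log-likelihood of a single Bernoulli model against the best two-segment split. This is an illustration under stated assumptions, not the authors' implementation; the names `seg_loglik` and `best_split_llr` are invented for this example.

```python
import math

def seg_loglik(ones: int, total: int) -> float:
    # Maximized Bernoulli log-likelihood of a binary segment with
    # `ones` ones out of `total` entries; 0 * log(0) is taken as 0.
    if total == 0 or ones == 0 or ones == total:
        return 0.0
    p = ones / total
    return ones * math.log(p) + (total - ones) * math.log(1.0 - p)

def best_split_llr(stream):
    # Exact O(n) scan over the entries since the last change point:
    # compare the best two-segment Bernoulli model against a single
    # Bernoulli model, returning the largest log-likelihood ratio
    # and the split position that attains it.
    x = list(stream)
    n, total_ones = len(x), sum(x)
    single = seg_loglik(total_ones, n)
    best_llr, best_split = 0.0, None
    left_ones = 0
    for i in range(1, n):            # left segment is x[:i]
        left_ones += x[i - 1]
        two_seg = (seg_loglik(left_ones, i)
                   + seg_loglik(total_ones - left_ones, n - i))
        if two_seg - single > best_llr:
            best_llr, best_split = two_seg - single, i
    return best_llr, best_split

# Example: a stream whose rate jumps from ~0.1 to ~0.8 after entry 6.
llr, split = best_split_llr([0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1])
print(llr, split)  # a large ratio at (or near) the true change point
```

In an online detector, a change would be flagged once this ratio exceeds a threshold; the paper's contribution is to replace this linear scan with a $(1 - \epsilon)$ approximation computed in $O(\epsilon^{-1} \log^2 n)$ time over a small candidate list.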
Related papers
- Nearly Optimal Differentially Private ReLU Regression [18.599299269974498]
We investigate one of the most fundamental non-convex learning problems, ReLU regression, in the Differential Privacy (DP) model.
We show that it is possible to achieve an upper bound of $\tilde{O}(\frac{d^2}{N^2 \varepsilon^2})$.
arXiv Detail & Related papers (2025-03-08T02:09:47Z) - Data subsampling for Poisson regression with pth-root-link [53.63838219437508]
We develop and analyze data subsampling techniques for Poisson regression.
In particular, we consider the Poisson generalized linear model with identity- and square-root-link functions.
arXiv Detail & Related papers (2024-10-30T10:09:05Z) - Transfer Learning for Latent Variable Network Models [18.31057192626801]
We study transfer learning for estimation in latent variable network models.
We show that if the latent variables are shared, then vanishing error is possible.
Our algorithm achieves $o(1)$ error and does not assume a parametric form on the source or target networks.
arXiv Detail & Related papers (2024-06-05T16:33:30Z) - Turnstile $\ell_p$ leverage score sampling with applications [56.403488578703865]
We develop a novel algorithm for sampling rows $a_i$ of a matrix $A \in \mathbb{R}^{n \times d}$, proportional to their $\ell_p$ norm, when $A$ is presented in a turnstile data stream.
Our algorithm not only returns the set of sampled row indexes, it also returns slightly perturbed rows $\tilde{a}_i \approx a_i$, and approximates their sampling probabilities up to $\varepsilon$ relative error.
For logistic regression, our framework yields the first algorithm that achieves a $
arXiv Detail & Related papers (2024-06-01T07:33:41Z) - A statistical perspective on algorithm unrolling models for inverse
problems [2.7163621600184777]
In inverse problems where the conditional distribution of the observation $\mathbf{y}$ given the latent variable of interest $\mathbf{x}$ is known, we consider algorithm unrolling.
We show that the unrolling depth needed for the optimal statistical performance of GDNs is of order $\log(n)/\log(\varrho_n^{-1})$, where $n$ is the sample size.
We also show that when the negative log-density of the latent variable $\mathbf{x}$ has a simple proximal operator, then a GDN unrolled at depth $
arXiv Detail & Related papers (2023-11-10T20:52:20Z) - Data Structures for Density Estimation [66.36971978162461]
Given a sublinear (in $n$) number of samples from $p$, our main result is the first data structure that identifies $v_i$ in time sublinear in $k$.
We also give an improved version of the algorithm of Acharya et al. that reports $v_i$ in time linear in $k$.
arXiv Detail & Related papers (2023-06-20T06:13:56Z) - A Constant-per-Iteration Likelihood Ratio Test for Online Changepoint
Detection for Exponential Family Models [1.376408511310322]
FOCuS is introduced for detecting changes in mean in Gaussian data, reducing the per-iteration cost to $O(\log T)$.
We show how we can adaptively perform the maximisation step of the algorithm so that we need only maximise the test statistic over a small subset of these possible locations.
arXiv Detail & Related papers (2023-02-09T16:24:12Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning
Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Optimal network online change point localisation [73.93301212629231]
We study the problem of online network change point detection.
In this setting, a collection of independent Bernoulli networks is collected sequentially, and the underlying distribution changes when a change point occurs.
The goal is to detect the change point as quickly as possible, if it exists, subject to a constraint on the number or probability of false alarms.
arXiv Detail & Related papers (2021-01-14T07:24:39Z) - SoS Degree Reduction with Applications to Clustering and Robust Moment
Estimation [3.6042575355093907]
We develop a general framework to significantly reduce the degree of sum-of-squares proofs by introducing new variables.
We use it to speed up previous algorithms based on sum-of-squares for two important estimation problems, clustering and robust moment estimation.
arXiv Detail & Related papers (2021-01-05T13:49:59Z) - Fair and Representative Subset Selection from Data Streams [4.53279507109072]
We consider the setting where data items in the stream belong to one of several disjoint groups.
We propose efficient algorithms for the fairness-aware variant of the streaming submodular problem.
arXiv Detail & Related papers (2020-10-09T07:49:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.