Fast likelihood-based change point detection
- URL: http://arxiv.org/abs/2301.08892v1
- Date: Sat, 21 Jan 2023 05:13:35 GMT
- Title: Fast likelihood-based change point detection
- Authors: Nikolaj Tatti
- Abstract summary: We use a likelihood ratio between two models as a measure for indicating change.
Finding the optimal split can be done in $O(n)$ time, where $n$ is the number of entries since the last change point.
We propose an approximation scheme that yields a $(1 - \epsilon)$ approximation in $O(\epsilon^{-1} \log^2 n)$ time.
- Score: 0.949290364986276
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Change point detection plays a fundamental role in many real-world
applications, where the goal is to analyze and monitor the behaviour of a data
stream. In this paper, we study change detection in binary streams. To this
end, we use a likelihood ratio between two models as a measure for indicating
change. The first model is a single Bernoulli variable, while the second model
divides the stored data into two segments and models each segment with its own
Bernoulli variable. Finding the optimal split can be done in $O(n)$ time, where
$n$ is the number of entries since the last change point. This is too expensive
for large $n$. To combat this we propose an approximation scheme that yields
$(1 - \epsilon)$ approximation in $O(\epsilon^{-1} \log^2 n)$ time. The
speed-up consists of several steps: First we reduce the number of possible
candidates by adopting a known result from segmentation problems. We then show
that for fixed Bernoulli parameters we can find the optimal change point in
logarithmic time. Finally, we show how to construct a candidate list of size
$O(\epsilon^{-1} \log n)$ for model parameters. We demonstrate empirically the
approximation quality and the running time of our algorithm, showing that we
can gain a significant speed-up with a minimal average loss in optimality.
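To make the exact baseline concrete, below is a minimal Python sketch of the $O(n)$ scan the abstract describes: it compares the maximized log-likelihood of a single Bernoulli model against the best two-segment split. This is an illustration under stated assumptions, not the authors' implementation; the names `seg_loglik` and `best_split_llr` are invented for this example.

```python
import math

def seg_loglik(ones: int, total: int) -> float:
    # Maximized Bernoulli log-likelihood of a binary segment with
    # `ones` ones out of `total` entries; 0 * log(0) is taken as 0.
    if total == 0 or ones == 0 or ones == total:
        return 0.0
    p = ones / total
    return ones * math.log(p) + (total - ones) * math.log(1.0 - p)

def best_split_llr(stream):
    # Exact O(n) scan over the entries since the last change point:
    # compare the best two-segment Bernoulli model against a single
    # Bernoulli model, returning the largest log-likelihood ratio
    # and the split position that attains it.
    x = list(stream)
    n, total_ones = len(x), sum(x)
    single = seg_loglik(total_ones, n)
    best_llr, best_split = 0.0, None
    left_ones = 0
    for i in range(1, n):            # left segment is x[:i]
        left_ones += x[i - 1]
        two_seg = (seg_loglik(left_ones, i)
                   + seg_loglik(total_ones - left_ones, n - i))
        if two_seg - single > best_llr:
            best_llr, best_split = two_seg - single, i
    return best_llr, best_split

# Example: a stream whose rate jumps from ~0.1 to ~0.8 after entry 6.
llr, split = best_split_llr([0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1])
print(llr, split)  # a large ratio at (or near) the true change point
```

In an online detector, a change would be flagged once this ratio exceeds a threshold; the paper's contribution is to replace this linear scan with a $(1 - \epsilon)$ approximation computed in $O(\epsilon^{-1} \log^2 n)$ time over a small candidate list.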
Related papers
- Nearly Optimal Differentially Private ReLU Regression [18.599299269974498]
We investigate one of the most fundamental non-convex learning problems, ReLU regression, in the Differential Privacy (DP) model.
We show that it is possible to achieve an upper bound of $\tilde{O}(\frac{d^2}{N^2 \varepsilon^2})$.
arXiv Detail & Related papers (2025-03-08T02:09:47Z) - Data subsampling for Poisson regression with pth-root-link [53.63838219437508]
We develop and analyze data subsampling techniques for Poisson regression.
In particular, we consider the Poisson generalized linear model with identity- and square-root-link functions.
arXiv Detail & Related papers (2024-10-30T10:09:05Z) - Transfer Learning for Latent Variable Network Models [18.31057192626801]
We study transfer learning for estimation in latent variable network models.
We show that if the latent variables are shared, then vanishing error is possible.
Our algorithm achieves $o(1)$ error and does not assume a parametric form on the source or target networks.
arXiv Detail & Related papers (2024-06-05T16:33:30Z) - Turnstile $\ell_p$ leverage score sampling with applications [56.403488578703865]
We develop a novel algorithm for sampling rows $a_i$ of a matrix $A \in \mathbb{R}^{n \times d}$, proportional to their $\ell_p$ norm, when $A$ is presented in a turnstile data stream.
Our algorithm not only returns the set of sampled row indexes, it also returns slightly perturbed rows $\tilde{a}_i \approx a_i$, and approximates their sampling probabilities up to $\varepsilon$ relative error.
For logistic regression, our framework yields the first algorithm that achieves a $
arXiv Detail & Related papers (2024-06-01T07:33:41Z) - A statistical perspective on algorithm unrolling models for inverse
problems [2.7163621600184777]
In inverse problems where the conditional distribution of the observation $\mathbf{y}$ given the latent variable of interest $\mathbf{x}$ is known, we consider algorithm unrolling.
We show that the unrolling depth needed for the optimal statistical performance of GDNs is of order $\log(n)/\log(\varrho_n^{-1})$, where $n$ is the sample size.
We also show that when the negative log-density of the latent variable $\mathbf{x}$ has a simple proximal operator, then a GDN unrolled at depth $
arXiv Detail & Related papers (2023-11-10T20:52:20Z) - Data Structures for Density Estimation [66.36971978162461]
Given a sublinear (in $n$) number of samples from $p$, our main result is the first data structure that identifies $v_i$ in time sublinear in $k$.
We also give an improved version of the algorithm of Acharya et al. that reports $v_i$ in time linear in $k$.
arXiv Detail & Related papers (2023-06-20T06:13:56Z) - A Constant-per-Iteration Likelihood Ratio Test for Online Changepoint
Detection for Exponential Family Models [1.376408511310322]
FOCuS is introduced for detecting changes in mean in Gaussian data, reducing the per-iteration cost to $O(\log T)$.
We show how we can adaptively perform the maximisation step of the algorithm so that we need only maximise the test statistic over a small subset of these possible locations.
arXiv Detail & Related papers (2023-02-09T16:24:12Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning
Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Optimal network online change point localisation [73.93301212629231]
We study the problem of online network change point detection.
In this setting, a collection of independent Bernoulli networks is collected sequentially, and the underlying distribution changes when a change point occurs.
The goal is to detect the change point as quickly as possible, if it exists, subject to a constraint on the number or probability of false alarms.
arXiv Detail & Related papers (2021-01-14T07:24:39Z) - SoS Degree Reduction with Applications to Clustering and Robust Moment
Estimation [3.6042575355093907]
We develop a general framework to significantly reduce the degree of sum-of-squares proofs by introducing new variables.
We use it to speed up previous algorithms based on sum-of-squares for two important estimation problems, clustering and robust moment estimation.
arXiv Detail & Related papers (2021-01-05T13:49:59Z) - Fair and Representative Subset Selection from Data Streams [4.53279507109072]
We consider the setting where data items in the stream belong to one of several disjoint groups.
We propose efficient algorithms for the fairness-aware variant of the streaming submodular problem.
arXiv Detail & Related papers (2020-10-09T07:49:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.