Related papers: Online Heavy-tailed Change-point detection

Online Heavy-tailed Change-point detection

URL: http://arxiv.org/abs/2306.09548v2
Date: Mon, 3 Jul 2023 17:56:15 GMT
Title: Online Heavy-tailed Change-point detection
Authors: Abishek Sankararaman and Balakrishnan (Murali) Narayanaswamy
Abstract summary: We present an algorithm based on clipped Gradient Descent (SGD), that works even if we only assume that the second moment of the data generating process is bounded. We derive guarantees on worst-case, finite-sample false-positive rate (FPR) over the family of all distributions with bounded second moment. Our method is the first OCPD algorithm that guarantees finite-sample FPR, even if the data is high dimensional and the underlying distributions are heavy-tailed.
Score: 6.7643284029102295
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study algorithms for online change-point detection (OCPD), where samples that are potentially heavy-tailed, are presented one at a time and a change in the underlying mean must be detected as early as possible. We present an algorithm based on clipped Stochastic Gradient Descent (SGD), that works even if we only assume that the second moment of the data generating process is bounded. We derive guarantees on worst-case, finite-sample false-positive rate (FPR) over the family of all distributions with bounded second moment. Thus, our method is the first OCPD algorithm that guarantees finite-sample FPR, even if the data is high dimensional and the underlying distributions are heavy-tailed. The technical contribution of our paper is to show that clipped-SGD can estimate the mean of a random vector and simultaneously provide confidence bounds at all confidence values. We combine this robust estimate with a union bound argument and construct a sequential change-point algorithm with finite-sample FPR guarantees. We show empirically that our algorithm works well in a variety of situations, whether the underlying data are heavy-tailed, light-tailed, high dimensional or discrete. No other algorithm achieves bounded FPR theoretically or empirically, over all settings we study simultaneously.

Related papers

Benchmarking Fraud Detectors on Private Graph Data [70.4654745317714]
Currently, many types of fraud are managed in part by automated detection algorithms that operate over graphs.<n>We consider the scenario where a data holder wishes to outsource development of fraud detectors to third parties.<n>Third parties submit their fraud detectors to the data holder, who evaluates these algorithms on a private dataset and then publicly communicates the results.<n>We propose a realistic privacy attack on this system that allows an adversary to de-anonymize individuals' data based only on the evaluation results.
arXiv Detail & Related papers (2025-07-30T03:20:15Z)
Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget [55.938644481736446]
We introduce a novel algorithm for best feasible arm identification that guarantees an exponential decay in the error probability.<n>We validate our algorithm through comprehensive empirical evaluations across various problem instances with different levels of complexity.
arXiv Detail & Related papers (2025-06-03T02:56:26Z)
Optimal Online Change Detection via Random Fourier Features [0.0]
This article studies the problem of online non-parametric change point detection in multivariate data streams.<n>We introduce a sequential testing procedure based on random Fourier features, running with logarithmic time complexity per observation.<n>We prove strong theoretical guarantees on the algorithm's performance, including information-theoretic bounds demonstrating that the detection delay is optimal in the minimax sense.
arXiv Detail & Related papers (2025-05-23T12:02:13Z)
Inference for an Algorithmic Fairness-Accuracy Frontier [0.9147443443422864]
We provide a consistent estimator for a theoretical fairness-accuracy frontier put forward by Liang, Lu and Mu (2023) We propose inference methods to test hypotheses that have received much attention in the fairness literature. We show that the estimated support function converges to a tight process as the sample size increases.
arXiv Detail & Related papers (2024-02-14T00:56:09Z)
Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data. In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
QuTE: decentralized multiple testing on sensor networks with false discovery rate control [130.7122910646076]
This paper designs methods for decentralized multiple hypothesis testing on graphs equipped with provable guarantees on the false discovery rate (FDR) We consider the setting where distinct agents reside on the nodes of an undirected graph, and each agent possesses p-values corresponding to one or more hypotheses local to its node. Each agent must individually decide whether to reject one or more of its local hypotheses by only communicating with its neighbors, with the joint aim that the global FDR over the entire graph must be controlled at a predefined level.
arXiv Detail & Related papers (2022-10-09T19:48:39Z)
Differentially Private Federated Learning via Inexact ADMM with Multiple Local Updates [0.0]
We develop a DP inexact alternating direction method of multipliers algorithm with multiple local updates for federated learning. We show that our algorithm provides $barepsilon$-DP for every iteration, where $barepsilon$ is a privacy budget controlled by the user. We demonstrate that our algorithm reduces the testing error by at most $31%$ compared with the existing DP algorithm, while achieving the same level of data privacy.
arXiv Detail & Related papers (2022-02-18T19:58:47Z)
Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation [92.3161051419884]
We study reinforcement learning with linear function approximation. Existing algorithms only have high-probability regret and/or Probably Approximately Correct (PAC) sample complexity guarantees. We propose a new algorithm called FLUTE, which enjoys uniform-PAC convergence to the optimal policy with high probability.
arXiv Detail & Related papers (2021-06-22T08:48:56Z)
Differentially Private Federated Learning via Inexact ADMM [0.0]
Differential privacy (DP) techniques can be applied to the federated learning model to protect data privacy against inference attacks. We develop a DP inexact alternating direction method of multipliers algorithm that solves a sequence of trust-region subproblems. Our algorithm reduces the testing error by at most $22%$ compared with the existing DP algorithm, while achieving the same level of data privacy.
arXiv Detail & Related papers (2021-06-11T02:28:07Z)
Fast and Robust Online Inference with Stochastic Gradient Descent via Random Scaling [0.9806910643086042]
We develop a new method of online inference for a vector of parameters estimated by the Polyak-Rtupper averaging procedure of gradient descent algorithms. Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem.
arXiv Detail & Related papers (2021-06-06T15:38:37Z)
T-SCI: A Two-Stage Conformal Inference Algorithm with Guaranteed Coverage for Cox-MLP [13.4379473119565]
We propose two algorithms for recovering guaranteed coverage in censored data. First, we revisit weighted conformal inference and introduce a new non-conformity score based on partial likelihood. We then propose a two-stage algorithm emphT-SCI, where we run WCCI in the first stage and apply quantile conformal inference to calibrate the results in the second stage.
arXiv Detail & Related papers (2021-03-08T05:42:05Z)
On the Practicality of Differential Privacy in Federated Learning by Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for its privacy protection when training machine learning models among distributed clients collaboratively. Recent studies have pointed out that the naive FL is susceptible to gradient leakage attacks. Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
arXiv Detail & Related papers (2021-01-11T19:43:12Z)
Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model. The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
Determinantal Point Processes in Randomized Numerical Linear Algebra [80.27102478796613]
Numerical Linear Algebra (RandNLA) uses randomness to develop improved algorithms for matrix problems that arise in scientific computing, data science, machine learning, etc. Recent work has uncovered deep and fruitful connections between DPPs and RandNLA which lead to new guarantees and improved algorithms.
arXiv Detail & Related papers (2020-05-07T00:39:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.