The Cost of Learning under Multiple Change Points
- URL: http://arxiv.org/abs/2602.11406v1
- Date: Wed, 11 Feb 2026 22:16:20 GMT
- Title: The Cost of Learning under Multiple Change Points
- Authors: Tomer Gafni, Garud Iyengar, Assaf Zeevi,
- Abstract summary: We show that classical methods may exhibit catastrophic failure (high regret) due to a phenomenon we refer to as endogenous confounding.<n>We propose a new class of learning algorithms dubbed Anytime Tracking CUSUM (ATC)<n>These are horizon-free online algorithms that implement a selective detection principle.
- Score: 18.874638937505434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider an online learning problem in environments with multiple change points. In contrast to the single change point problem that is widely studied using classical "high confidence" detection schemes, the multiple change point environment presents new learning-theoretic and algorithmic challenges. Specifically, we show that classical methods may exhibit catastrophic failure (high regret) due to a phenomenon we refer to as endogenous confounding. To overcome this, we propose a new class of learning algorithms dubbed Anytime Tracking CUSUM (ATC). These are horizon-free online algorithms that implement a selective detection principle, balancing the need to ignore "small" (hard-to-detect) shifts, while reacting "quickly" to significant ones. We prove that the performance of a properly tuned ATC algorithm is nearly minimax-optimal; its regret is guaranteed to closely match a novel information-theoretic lower bound on the achievable performance of any learning algorithm in the multiple change point problem. Experiments on synthetic as well as real-world data validate the aforementioned theoretical findings.
Related papers
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely Cascaded Forward (CaFo) algorithm, which does not rely on BP optimization as that in FF.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z) - Smoothed Online Learning for Prediction in Piecewise Affine Systems [43.64498536409903]
This paper builds on the recently developed smoothed online learning framework.
It provides the first algorithms for prediction and simulation in piecewise affine systems.
arXiv Detail & Related papers (2023-01-26T15:54:14Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Algorithms that Approximate Data Removal: New Results and Limitations [2.6905021039717987]
We study the problem of deleting user data from machine learning models trained using empirical risk minimization.
We develop an online unlearning algorithm that is both computationally and memory efficient.
arXiv Detail & Related papers (2022-09-25T17:20:33Z) - Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective [67.45111837188685]
Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data.
We experimentally analyze neural network models trained by CIL algorithms using various evaluation protocols in representation learning.
arXiv Detail & Related papers (2022-06-16T11:44:11Z) - Simple Stochastic and Online Gradient DescentAlgorithms for Pairwise
Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair instances.
Online descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple and online descent to methods for pairwise learning.
arXiv Detail & Related papers (2021-11-23T18:10:48Z) - Online estimation and control with optimal pathlength regret [52.28457815067461]
A natural goal when designing online learning algorithms is to bound the regret of the algorithm in terms of the temporal variation of the input sequence.
Data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits.
arXiv Detail & Related papers (2021-10-24T22:43:15Z) - AlterSGD: Finding Flat Minima for Continual Learning by Alternative
Training [11.521519687645428]
We propose a simple yet effective optimization method, called AlterSGD, to search for a flat minima in the loss landscape.
We prove that such a strategy can encourage the optimization to converge to a flat minima.
We verify AlterSGD on continual learning benchmark for semantic segmentation and the empirical results show that we can significantly mitigate the forgetting.
arXiv Detail & Related papers (2021-07-13T01:43:51Z) - SID: Incremental Learning for Anchor-Free Object Detection via Selective
and Inter-Related Distillation [16.281712605385316]
Incremental learning requires a model to continually learn new tasks from streaming data.
Traditional fine-tuning of a well-trained deep neural network on a new task will dramatically degrade performance on the old task.
We propose a novel incremental learning paradigm called Selective and Inter-related Distillation (SID)
arXiv Detail & Related papers (2020-12-31T04:12:06Z) - Comparative Analysis of Extreme Verification Latency Learning Algorithms [3.3439097577935213]
This paper is a comprehensive survey and comparative analysis of some of the EVL algorithms to point out the weaknesses and strengths of different approaches.
This work is a very first effort to provide a review of some of the existing algorithms in this field to the research community.
arXiv Detail & Related papers (2020-11-26T16:34:56Z) - ARCADe: A Rapid Continual Anomaly Detector [25.34227775187408]
We address a novel learning problem of continual anomaly detection (CAD)
We propose ARCADe, an approach to train neural networks to be robust against the major challenges of this new learning problem.
The results of our experiments on three datasets show that ARCADe substantially outperforms baselines from the continual learning and anomaly detection literature.
arXiv Detail & Related papers (2020-08-10T11:59:32Z) - Reparameterized Variational Divergence Minimization for Stable Imitation [57.06909373038396]
We study the extent to which variations in the choice of probabilistic divergence may yield more performant ILO algorithms.
We contribute a re parameterization trick for adversarial imitation learning to alleviate the challenges of the promising $f$-divergence minimization framework.
Empirically, we demonstrate that our design choices allow for ILO algorithms that outperform baseline approaches and more closely match expert performance in low-dimensional continuous-control tasks.
arXiv Detail & Related papers (2020-06-18T19:04:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.