Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis
- URL: http://arxiv.org/abs/2210.12235v1
- Date: Fri, 21 Oct 2022 20:30:26 GMT
- Title: Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis
- Authors: Xianyang Zhang and Trisha Dawn
- Abstract summary: One common approach to detecting change-points is minimizing a cost function over possible numbers and locations of change-points.
This paper introduces a new sequential method (SE) that can be coupled with gradient descent (SeGD) and quasi-Newton's method (SeN) to find the cost value effectively.
- Score: 0.348097307252416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One common approach to detecting change-points is minimizing a cost function
over possible numbers and locations of change-points. The framework includes
several well-established procedures, such as the penalized likelihood and
minimum description length. Such an approach requires finding the cost value
repeatedly over different segments of the data set, which can be time-consuming
when (i) the data sequence is long and (ii) obtaining the cost value involves
solving a non-trivial optimization problem. This paper introduces a new
sequential method (SE) that can be coupled with gradient descent (SeGD) and
quasi-Newton's method (SeN) to find the cost value effectively. The core idea
is to update the cost value using the information from previous steps without
re-optimizing the objective function. The new method is applied to change-point
detection in generalized linear models and penalized regression. Numerical
studies show that the new approach can be orders of magnitude faster than the
Pruned Exact Linear Time (PELT) method without sacrificing estimation accuracy.
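To make the core idea concrete, below is a minimal Python sketch (not the authors' implementation) of the penalized search F(t) = min_s F(s) + C(y[s:t]) + beta with a Gaussian mean-change cost, where each candidate segment's cost is updated with one cheap parameter update per new observation instead of being re-optimized from scratch. For squared-error loss this sequential update happens to be exact; for GLM or penalized-regression costs, SeGD/SeN would instead take approximate gradient or quasi-Newton steps. The function name, step-size convention, toy data, and penalty choice are illustrative assumptions, and the PELT-style pruning is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy data with mean shifts at t = 100 and t = 200 (illustration only)
y = np.concatenate([rng.normal(0, 1, 100),
                    rng.normal(3, 1, 100),
                    rng.normal(-1, 1, 100)])

def optimal_partition(y, beta):
    """Penalized change-point search F(t) = min_s F(s) + C(y[s:t]) + beta.

    Segment costs are updated sequentially as segments grow (one cheap
    update per new observation) instead of re-solving
    min_theta sum_i (y_i - theta)^2 for every candidate segment.
    """
    n = len(y)
    F = np.full(n + 1, np.inf)
    F[0] = -beta
    last = np.zeros(n + 1, dtype=int)
    theta = np.zeros(n)   # theta[s]: running parameter estimate for y[s:t]
    cost = np.zeros(n)    # cost[s]: running cost of segment y[s:t]
    for t in range(1, n + 1):
        for s in range(t):
            k = t - s                     # current length of segment y[s:t]
            if k == 1:
                theta[s], cost[s] = y[t - 1], 0.0
            else:
                resid = y[t - 1] - theta[s]
                theta[s] += resid / k     # sequential (SGD-style) update
                cost[s] += resid * resid * (k - 1) / k
            if F[s] + cost[s] + beta < F[t]:
                F[t] = F[s] + cost[s] + beta
                last[t] = s
    cps, t = [], n                        # backtrack change-point locations
    while t > 0:
        if last[t] > 0:
            cps.append(last[t])
        t = last[t]
    return sorted(cps)

print(optimal_partition(y, beta=3 * np.log(len(y))))  # roughly [100, 200]
```

The point of the sequential update is that each (s, t) pair costs O(1): no inner optimization problem is ever re-run, which is where the claimed speed-up over exhaustively re-optimized costs comes from.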
Related papers
- Efficient Low-rank Identification via Accelerated Iteratively Reweighted Nuclear Norm Minimization [8.879403568685499]
We introduce an adaptive updating strategy for the smoothing parameters.
This adaptive behavior transforms the algorithm into one that effectively solves a smooth problem after a few iterations.
We prove the global convergence of the proposed algorithm, guaranteeing that every accumulation point of the iterates is a critical point.
arXiv Detail & Related papers (2024-06-22T02:37:13Z)
- Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric likelihood-ratio estimation (OLRE) in the setting where pairs of i.i.d. observations $(x_t \sim p, x'_t \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
arXiv Detail & Related papers (2023-11-03T13:20:11Z)
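As a rough sketch of the Pearson-divergence objective behind this line of work, the snippet below fits a linear-in-features ratio model r(x) = w @ phi(x) by online stochastic gradient descent on J(w) = 0.5 E_q[r(x)^2] - E_p[r(x)], whose population minimizer is the ratio p/q. The Gaussian toy distributions, RBF features, and step-size schedule are assumptions made for illustration; the OLRE paper works with kernel estimators and provides the formal guarantees.

```python
import numpy as np

rng = np.random.default_rng(1)

def phi(x, centers, h=0.5):
    """RBF features; the ratio model is r(x) = w @ phi(x) ~ p(x)/q(x)."""
    return np.exp(-0.5 * ((x - centers) / h) ** 2)

centers = np.linspace(-4.0, 5.0, 25)
w = np.zeros_like(centers)

# toy streaming setting: p = N(0, 1), q = N(1, 1)
for t in range(1, 5001):
    x_p, x_q = rng.normal(0.0, 1.0), rng.normal(1.0, 1.0)  # x_t ~ p, x'_t ~ q
    fp, fq = phi(x_p, centers), phi(x_q, centers)
    # stochastic gradient of J(w) = 0.5 E_q[r^2] - E_p[r], minimized at r = p/q
    grad = (w @ fq) * fq - fp
    w -= (0.1 / np.sqrt(t)) * grad        # decaying online step size

print("r(0) estimate:", w @ phi(0.0, centers))  # true p(0)/q(0) = e^0.5 ~ 1.65
```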
- Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes [80.89852729380425]
We propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde{O}(d\sqrt{H^3K})$.
Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest.
arXiv Detail & Related papers (2022-12-12T18:58:59Z)
- An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification [97.28167655721766]
We propose a novel accelerated doubly stochastic gradient descent (ADSGD) method for sparsity-regularized loss minimization problems.
We first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity.
arXiv Detail & Related papers (2022-08-11T22:27:22Z)
- Momentum-Based Policy Gradient with Second-Order Information [40.51117836892182]
We propose a variance-reduced policy-gradient method, called SHARP, which incorporates second-order information into gradient descent.
Unlike most previous work, our proposed algorithm does not require importance sampling, which can compromise the advantage of the variance-reduction process.
Our extensive experimental evaluations show the effectiveness of the proposed algorithm on various control tasks and its advantage over the state of the art in practice.
arXiv Detail & Related papers (2022-05-17T11:56:50Z)
- Continuous-Time Meta-Learning with Forward Mode Differentiation [65.26189016950343]
We introduce Continuous Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.
Treating the learning process as an ODE offers the notable advantage that the length of the adaptation trajectory is continuous, rather than a fixed number of discrete gradient steps.
We show empirically its efficiency in terms of runtime and memory usage, and we illustrate its effectiveness on a range of few-shot image classification problems.
arXiv Detail & Related papers (2022-03-02T22:35:58Z)
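A stripped-down sketch of the continuous-time view in COMLN: adaptation follows the gradient flow d(theta)/dt = -grad L(theta), integrated numerically, so the trajectory length T is a continuous quantity rather than a discrete count of gradient steps. The Euler integrator and the quadratic toy task are illustrative assumptions; the forward-mode differentiation that COMLN uses to meta-learn through the flow is omitted here.

```python
import numpy as np

def adapt_gradient_flow(theta0, grad_fn, T, dt=0.01):
    """Adaptation as the gradient flow d(theta)/dt = -grad L(theta),
    integrated with forward Euler; the horizon T is continuous and need
    not be a multiple of dt (unlike a discrete count of gradient steps)."""
    theta, t = np.asarray(theta0, dtype=float).copy(), 0.0
    while t < T:
        step = min(dt, T - t)             # handle a fractional final step
        theta = theta - step * grad_fn(theta)
        t += step
    return theta

# toy task: L(theta) = 0.5 * ||theta - target||^2, so grad L = theta - target
target = np.array([1.0, -2.0])
adapted = adapt_gradient_flow([0.0, 0.0], lambda th: th - target, T=2.5)
print(adapted)  # flows toward target: theta(T) = (1 - e^{-T}) * target
```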
- Bolstering Stochastic Gradient Descent with Model Building [0.0]
The stochastic gradient descent method and its variants constitute the core optimization algorithms that achieve good convergence rates for machine learning problems.
We propose an alternative approach to line search by using a new algorithm based on forward step model building.
We show that the proposed algorithm achieves faster convergence and better generalization in well-known test problems.
arXiv Detail & Related papers (2021-11-13T06:54:36Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
- Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms [69.45237691598774]
We study the problem of least squares linear regression where the data-points are dependent and are sampled from a Markov chain.
We establish sharp information-theoretic minimax lower bounds for this problem in terms of the mixing time $\tau_{\mathsf{mix}}$.
We propose an algorithm based on experience replay, a popular reinforcement learning technique, that achieves a significantly better error rate.
arXiv Detail & Related papers (2020-06-16T04:26:50Z)
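The experience-replay idea in this entry can be sketched directly: rather than taking an SGD step on each freshly arriving, highly correlated sample from the chain, store incoming samples in a buffer and take steps on uniformly drawn past samples, which weakens the temporal correlation. The AR(1) covariate chain, buffer size, and step size below are illustrative assumptions, not the paper's exact algorithm or rates.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(2)

# Markovian covariates: a slowly mixing AR(1) chain (illustration only)
n, d, rho = 20000, 3, 0.95
w_true = np.array([1.0, -0.5, 2.0])
x = np.zeros((n, d))
for t in range(1, n):
    x[t] = rho * x[t - 1] + np.sqrt(1 - rho**2) * rng.normal(size=d)
y = x @ w_true + 0.1 * rng.normal(size=n)

# SGD with an experience-replay buffer: gradient steps use uniformly
# drawn past samples instead of the (highly correlated) newest one
buf, w = deque(maxlen=500), np.zeros(d)
for t in range(n):
    buf.append(t)
    i = buf[rng.integers(len(buf))]       # uniform draw from the buffer
    err = x[i] @ w - y[i]
    w -= 0.01 * err * x[i]                # least-squares SGD step

print("replay estimate:", np.round(w, 2), "  true:", w_true)
```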
- Optimizing generalization on the train set: a novel gradient-based framework to train parameters and hyperparameters simultaneously [0.0]
Generalization is a central problem in Machine Learning.
We present a novel approach based on a new measure of risk that allows us to develop fully automatic procedures for generalization.
arXiv Detail & Related papers (2020-06-11T18:04:36Z)
- Resolving learning rates adaptively by locating Stochastic Non-Negative Associated Gradient Projection Points using line searches [0.0]
Learning rates in neural network training are currently determined prior to training, using expensive manual or automated tuning.
This study proposes gradient-only line searches to resolve the learning rate for neural network training algorithms.
arXiv Detail & Related papers (2020-01-15T03:08:07Z)
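To make the gradient-only idea concrete, the sketch below resolves a step size along a descent direction by bisecting for the point where the directional derivative changes sign from negative to non-negative, the kind of point (an SNN-GPP in the paper's terminology) these line searches target. The deterministic quadratic, bracket width, and tolerance are illustrative assumptions; the paper's setting uses stochastic mini-batch gradients.

```python
import numpy as np

def gradient_only_line_search(grad_fn, theta, d, a_max=10.0, tol=1e-3):
    """Bisect for a step size where the directional derivative
    grad(theta + a * d) @ d changes sign from negative to non-negative."""
    if grad_fn(theta + a_max * d) @ d < 0:
        return a_max                      # still descending at the bracket end
    lo, hi = 0.0, a_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if grad_fn(theta + mid * d) @ d < 0:
            lo = mid                      # still descending: step further
        else:
            hi = mid                      # ascending: step back
    return 0.5 * (lo + hi)

# toy quadratic L(theta) = 0.5 * theta @ A @ theta, so grad L = A @ theta
A = np.diag([1.0, 10.0])
grad = lambda th: A @ th
theta = np.array([1.0, 1.0])
d = -grad(theta)                          # steepest-descent direction
print(gradient_only_line_search(grad, theta, d))  # exact minimizer ~ 0.1009
```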