Linear time dynamic programming for the exact path of optimal models selected from a finite set
- URL: http://arxiv.org/abs/2003.02808v1
- Date: Thu, 5 Mar 2020 18:16:58 GMT
- Title: Linear time dynamic programming for the exact path of optimal models selected from a finite set
- Authors: Toby Hocking, Joseph Vargovich
- Abstract summary: We show that a dynamic programming algorithm can be used to compute the breakpoints in linear time.
Empirical results on changepoint detection problems demonstrate the improved accuracy and speed.
- Score: 0.38073142980732994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many learning algorithms are formulated in terms of finding model parameters
which minimize a data-fitting loss function plus a regularizer. When the
regularizer involves the l0 pseudo-norm, the resulting regularization path
consists of a finite set of models. The fastest existing algorithm for
computing the breakpoints in the regularization path is quadratic in the number
of models, so it scales poorly to high dimensional problems. We provide new
formal proofs that a dynamic programming algorithm can be used to compute the
breakpoints in linear time. Empirical results on changepoint detection problems
demonstrate the improved accuracy and speed relative to grid search and the
previous quadratic time algorithm.
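To make the setting concrete, the following is a minimal Python sketch (not the authors' implementation) of the kind of linear-time breakpoint computation the abstract describes: each model i has a loss L_i and a complexity k_i, the model selected at penalty lambda minimizes L_i + lambda * k_i, and a single stack-based pass over models sorted by decreasing complexity recovers the penalty intervals on which each model is optimal. The function name and input conventions are assumptions for the example.

```python
import numpy as np

def model_selection_breakpoints(loss, complexity):
    """Sketch of a linear-time exact model-selection path.

    Assumes models are sorted by strictly decreasing complexity and
    strictly increasing loss.  The model chosen at penalty lam minimizes
    loss[i] + lam * complexity[i]; the optimum is piecewise constant in
    lam, and one stack-based pass finds every breakpoint.
    """
    loss = np.asarray(loss, dtype=float)
    complexity = np.asarray(complexity, dtype=float)
    # stack entries: (model index, smallest penalty at which it is optimal)
    stack = [(0, 0.0)]
    for i in range(1, len(loss)):
        crossing = 0.0
        while stack:
            j, lam_j = stack[-1]
            # penalty at which model i starts beating model j
            crossing = (loss[i] - loss[j]) / (complexity[j] - complexity[i])
            if crossing <= lam_j:
                stack.pop()      # model j is never strictly optimal
            else:
                break
        stack.append((i, crossing if stack else 0.0))
    return stack

# Example: four models; model 1 is never optimal for any penalty.
print(model_selection_breakpoints([10.0, 12.0, 13.0, 20.0], [5, 3, 2, 1]))
# [(0, 0.0), (2, 1.0), (3, 7.0)]
```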
Related papers
- Towards Learning High-Precision Least Squares Algorithms with Sequence Models [42.217390215093516]
We show that softmax Transformers struggle to perform high-precision multiplications.
We identify limitations present in existing architectures and training procedures.
For the first time, we demonstrate the ability to train to near machine precision.
arXiv Detail & Related papers (2025-03-15T23:25:11Z)
- Efficient Numerical Algorithm for Large-Scale Damped Natural Gradient Descent [7.368877979221163]
We propose a new algorithm for efficiently solving the damped Fisher matrix in large-scale scenarios where the number of parameters significantly exceeds the number of available samples.
Our algorithm is based on Cholesky decomposition and is generally applicable. Benchmark results show that the algorithm is significantly faster than existing methods.
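The paper's algorithm is not reproduced here; as a rough illustration of the regime it targets (far more parameters than samples), the sketch below applies one damped natural-gradient step by reducing the solve to a small sample-space system that is factorized with a Cholesky decomposition via the Woodbury identity. The empirical-Fisher construction F = J^T J / n, the function name, and the use of SciPy are assumptions for the example.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def damped_natural_gradient_step(J, g, damping):
    """Illustrative damped natural-gradient step (not the paper's algorithm).

    J: (n, p) matrix of per-sample gradients, with p >> n.
    g: (p,) mean gradient.  Empirical Fisher: F = J.T @ J / n.
    Returns (F + damping * I)^{-1} @ g without forming the p x p matrix,
    using the Woodbury identity and a Cholesky solve on an n x n system.
    """
    n = J.shape[0]
    gram = J @ J.T / n + damping * np.eye(n)   # small n x n matrix
    c_and_low = cho_factor(gram)               # Cholesky factorization
    correction = J.T @ cho_solve(c_and_low, J @ g / n)
    return (g - correction) / damping
```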
arXiv Detail & Related papers (2023-10-26T16:46:13Z)
- Fast Screening Rules for Optimal Design via Quadratic Lasso Reformulation [0.135975510645475]
In this work, we derive safe screening rules that can be used to discard inessential samples.
The new tests are much faster to compute, especially for problems involving a parameter space of high dimension.
We show how an existing homotopy algorithm to compute the regularization path of the lasso method can be reparametrized with respect to the squared $\ell_1$-penalty.
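The rules derived in this paper (for discarding samples in optimal design) are not reproduced here; purely to illustrate what a safe screening test looks like, the sketch below implements the well-known Gap Safe sphere rule for discarding lasso features, a different rule from the screening literature. All names and conventions are illustrative.

```python
import numpy as np

def gap_safe_screen_lasso(X, y, beta, lam):
    """Gap Safe sphere test for the lasso (a generic screening rule, not the
    rules of the paper above, which discard samples in optimal design).

    Returns a boolean mask of features that can be safely discarded: their
    optimal coefficient is guaranteed to be zero at this lambda.
    """
    r = y - X @ beta                               # primal residual
    theta = r / max(lam, np.max(np.abs(X.T @ r)))  # dual-feasible point
    primal = 0.5 * r @ r + lam * np.abs(beta).sum()
    dual = 0.5 * y @ y - 0.5 * lam ** 2 * np.sum((theta - y / lam) ** 2)
    radius = np.sqrt(2.0 * max(primal - dual, 0.0)) / lam
    col_norms = np.linalg.norm(X, axis=0)
    return np.abs(X.T @ theta) + radius * col_norms < 1.0
```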
arXiv Detail & Related papers (2023-10-13T08:10:46Z)
- Best-Subset Selection in Generalized Linear Models: A Fast and Consistent Algorithm via Splicing Technique [0.6338047104436422]
Best subset selection has been widely regarded as the Holy Grail of problems of this type.
We propose and illustrate an algorithm for best subset recovery under mild conditions.
Our implementation achieves approximately a fourfold speedup compared to popular variable selection toolkits.
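The splicing algorithm itself is not sketched here; for contrast, the snippet below shows the exhaustive best-subset baseline that such algorithms are designed to avoid, fitting every size-k subset by least squares (the paper addresses generalized linear models, but the combinatorial structure is the same). Names are illustrative.

```python
import itertools
import numpy as np

def best_subset_brute_force(X, y, k):
    """Exhaustive best-subset selection (exponential in p); splicing-style
    algorithms aim to recover the same support without this enumeration."""
    best_rss, best_support = np.inf, None
    for support in itertools.combinations(range(X.shape[1]), k):
        cols = X[:, support]
        beta, *_ = np.linalg.lstsq(cols, y, rcond=None)
        rss = float(np.sum((y - cols @ beta) ** 2))
        if rss < best_rss:
            best_rss, best_support = rss, support
    return best_support, best_rss
```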
arXiv Detail & Related papers (2023-08-01T03:11:31Z)
- Learning the Positions in CountSketch [49.57951567374372]
We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem.
In this work, we propose the first learning-based algorithms that also optimize the locations of the non-zero entries.
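For context, here is a small NumPy sketch of a plain (non-learned) CountSketch used in the sketch-and-solve pattern for least squares; in the learned variant this entry describes, the positions of the non-zero entries would be optimized rather than drawn at random. Names and dimensions are illustrative.

```python
import numpy as np

def countsketch(A, m, seed=0):
    """Apply an m x n CountSketch matrix S to A (n x d): every row of A is
    hashed to one bucket with a random sign, so S has one non-zero per column.
    A learned CountSketch would choose the bucket positions instead."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    buckets = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, buckets, signs[:, None] * A)   # accumulate signed rows
    return SA

# Sketch-and-solve least squares: min_x ||Ax - b|| approximated on the sketch.
rng = np.random.default_rng(1)
A = rng.normal(size=(5000, 20))
b = A @ rng.normal(size=20) + 0.01 * rng.normal(size=5000)
SAb = countsketch(np.column_stack([A, b]), m=200)   # sketch A and b together
x_hat, *_ = np.linalg.lstsq(SAb[:, :-1], SAb[:, -1], rcond=None)
```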
arXiv Detail & Related papers (2023-06-11T07:28:35Z)
- Online Learning Under A Separable Stochastic Approximation Framework [20.26530917721778]
We propose an online learning algorithm for a class of machine learning models under a separable approximation framework.
We show that the proposed algorithm produces more robust and better test performance than other popular learning algorithms.
arXiv Detail & Related papers (2023-05-12T13:53:03Z)
- Implicit Parameter-free Online Learning with Truncated Linear Models [51.71216912089413]
Parameter-free algorithms are online learning algorithms that do not require setting learning rates.
We propose new parameter-free algorithms that can take advantage of truncated linear models through a new update that has an "implicit" flavor.
Based on a novel decomposition of the regret, the new update is efficient, requires only one gradient at each step, never overshoots the minimum of the truncated model, and retains the favorable parameter-free properties.
arXiv Detail & Related papers (2022-03-19T13:39:49Z)
- Continuation Path with Linear Convergence Rate [18.405645120971496]
Path-following algorithms are frequently used in composite optimization problems where a series of subproblems are solved sequentially.
We present a primal-dual analysis of the path-following algorithm and determine how accurately each subproblem should be solved to guarantee a linear convergence rate on a target problem.
Considering optimization with a sparsity-inducing penalty, we analyze the change of the active sets with respect to the regularization parameter.
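The paper's primal-dual analysis is not reproduced here; the sketch below only illustrates the path-following pattern it studies: lasso subproblems are solved for a decreasing sequence of regularization parameters, each warm-started at the previous solution, with a fixed inner-iteration budget standing in for the accuracy schedule the paper derives. The ISTA inner solver and all names are assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def continuation_path(X, y, lambdas, inner_iters=200):
    """Warm-started continuation for 0.5 * ||X b - y||^2 + lam * ||b||_1.
    The paper analyzes how accurately each subproblem must be solved;
    here a fixed number of ISTA steps is used instead."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2     # 1 / Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    path = []
    for lam in sorted(lambdas, reverse=True):  # large penalty -> small penalty
        for _ in range(inner_iters):
            grad = X.T @ (X @ beta - y)
            beta = soft_threshold(beta - step * grad, step * lam)
        path.append((lam, beta.copy()))
    return path
```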
arXiv Detail & Related papers (2021-12-09T18:42:13Z)
- Online estimation and control with optimal pathlength regret [52.28457815067461]
A natural goal when designing online learning algorithms is to bound the regret of the algorithm in terms of the temporal variation of the input sequence.
Data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits.
arXiv Detail & Related papers (2021-10-24T22:43:15Z)
- Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime.
In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem.
We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
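The runtime-oriented, censoring-aware algorithms of this paper are not reproduced here; the sketch below only shows the basic setup as a toy UCB-style selector in which each observed runtime is capped at the timeout (censored feedback) and normalized to a bounded loss. The `run` callback and all names are assumptions.

```python
import math

def ucb_algorithm_selection(run, n_algorithms, instances, timeout):
    """Toy UCB-style online algorithm selection under censored runtimes.
    run(a, instance, timeout) is assumed to return the observed runtime of
    algorithm a, already capped at timeout when it does not finish."""
    counts = [0] * n_algorithms
    loss_sums = [0.0] * n_algorithms
    for t, instance in enumerate(instances, start=1):
        def lower_confidence(a):
            if counts[a] == 0:
                return float("-inf")          # try every algorithm once
            mean = loss_sums[a] / counts[a]
            return mean - math.sqrt(2.0 * math.log(t) / counts[a])
        a = min(range(n_algorithms), key=lower_confidence)
        runtime = min(run(a, instance, timeout), timeout)   # censored observation
        counts[a] += 1
        loss_sums[a] += runtime / timeout                   # loss in [0, 1]
    return counts, loss_sums
```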
arXiv Detail & Related papers (2021-09-13T18:10:52Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
- Accelerated Message Passing for Entropy-Regularized MAP Inference [89.15658822319928]
Maximum a posteriori (MAP) inference in discrete-valued random fields is a fundamental problem in machine learning.
Due to the difficulty of this problem, linear programming (LP) relaxations are commonly used to derive specialized message passing algorithms.
We present randomized methods for accelerating these algorithms by leveraging techniques that underlie classical accelerated gradient methods.
arXiv Detail & Related papers (2020-07-01T18:43:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.