Related papers: SONIA: A Symmetric Blockwise Truncated Optimization Algorithm

SONIA: A Symmetric Blockwise Truncated Optimization Algorithm

URL: http://arxiv.org/abs/2006.03949v1
Date: Sat, 6 Jun 2020 19:28:14 GMT
Title: SONIA: A Symmetric Blockwise Truncated Optimization Algorithm
Authors: Majid Jahani, Mohammadreza Nazari, Rachael Tappenden, Albert S. Berahas, Martin Tak\'a\v{c}
Abstract summary: This work presents a new algorithm for empirical risk. The algorithm bridges the gap between first- and second-order search methods by computing a second-order search-type update in one subspace, coupled with a scaled steepest descent step in the Theoretical complement.
Score: 2.9923891863939938
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This work presents a new algorithm for empirical risk minimization. The algorithm bridges the gap between first- and second-order methods by computing a search direction that uses a second-order-type update in one subspace, coupled with a scaled steepest descent step in the orthogonal complement. To this end, partial curvature information is incorporated to help with ill-conditioning, while simultaneously allowing the algorithm to scale to the large problem dimensions often encountered in machine learning applications. Theoretical results are presented to confirm that the algorithm converges to a stationary point in both the strongly convex and nonconvex cases. A stochastic variant of the algorithm is also presented, along with corresponding theoretical guarantees. Numerical results confirm the strengths of the new approach on standard machine learning problems.

Related papers

Robust empirical risk minimization via Newton's method [9.797319790710711]
A new variant of Newton's method for empirical risk minimization is studied. The gradient and Hessian of the objective function are replaced by robust estimators. An algorithm for obtaining robust Newton directions based on the conjugate gradient method is also proposed.
arXiv Detail & Related papers (2023-01-30T18:54:54Z)
Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithm that empowers various applications. This paper develops a scalable first-order optimization-based method that computes optimal transport to within $varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation [11.345796608258434]
We provide performance guarantees for a variant of simulation-based policy iteration for controlling Markov decision processes. We analyze two algorithms; the first algorithm involves a least squares approach where a new set of weights associated with feature vectors is obtained via at least squares at each iteration. The second algorithm involves a two-time-scale approximation algorithm taking several steps of gradient descent towards the least squares solution.
arXiv Detail & Related papers (2022-10-13T20:16:19Z)
Fast Projected Newton-like Method for Precision Matrix Estimation under Total Positivity [15.023842222803058]
Current algorithms are designed using the block coordinate descent method or the proximal point algorithm. We propose a novel algorithm based on the two-metric projection method, incorporating a carefully designed search direction and variable partitioning scheme. Experimental results on synthetic and real-world datasets demonstrate that our proposed algorithm provides a significant improvement in computational efficiency compared to the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-03T14:39:10Z)
Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network. We design two optimal algorithms that attain these lower bounds. We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z)
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step. Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
Accelerated Message Passing for Entropy-Regularized MAP Inference [89.15658822319928]
Maximum a posteriori (MAP) inference in discrete-valued random fields is a fundamental problem in machine learning. Due to the difficulty of this problem, linear programming (LP) relaxations are commonly used to derive specialized message passing algorithms. We present randomized methods for accelerating these algorithms by leveraging techniques that underlie classical accelerated gradient.
arXiv Detail & Related papers (2020-07-01T18:43:32Z)
Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching. We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems. We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms. Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.