Coordinate Linear Variance Reduction for Generalized Linear Programming
- URL: http://arxiv.org/abs/2111.01842v4
- Date: Thu, 6 Apr 2023 21:55:13 GMT
- Title: Coordinate Linear Variance Reduction for Generalized Linear Programming
- Authors: Chaobing Song, Cheuk Yin Lin, Stephen J. Wright, Jelena Diakonikolas
- Abstract summary: We show that the linear structure in the problem can be used to design an efficient, scalable first-order algorithm.
CLVR yields improved complexity results for (GLP) that depend on the max row norm of the linear constraint matrix in (GLP) rather than the spectral norm.
- Score: 27.365677554732304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a class of generalized linear programs (GLP) in a large-scale
setting, which includes a simple, possibly nonsmooth convex regularizer and
simple convex set constraints. By reformulating (GLP) as an equivalent
convex-concave min-max problem, we show that the linear structure in the
problem can be used to design an efficient, scalable first-order algorithm, to
which we give the name \emph{Coordinate Linear Variance Reduction}
(\textsc{clvr}; pronounced "clever"). \textsc{clvr} yields improved complexity
results for (GLP) that depend on the max row norm of the linear constraint
matrix in (GLP) rather than the spectral norm. When the regularization terms
and constraints are separable, \textsc{clvr} admits an efficient lazy update
strategy that makes its complexity bounds scale with the number of nonzero
elements of the linear constraint matrix in (GLP) rather than the matrix
dimensions. On the other hand, for the special case of linear programs, by
exploiting sharpness, we propose a restart scheme for \textsc{clvr} to obtain
empirical linear convergence. Then we show that Distributionally Robust
Optimization (DRO) problems with ambiguity sets based on both $f$-divergence
and Wasserstein metrics can be reformulated as (GLPs) by introducing sparsely
connected auxiliary variables. We complement our theoretical guarantees with
numerical experiments that verify our algorithm's practical effectiveness, in
terms of wall-clock time and number of data passes.
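To make the per-iteration structure described above concrete, the sketch below implements a simplified randomized dual-coordinate primal-dual update for the saddle-point reformulation $\min_{x \ge 0} \max_y \; c^\top x + y^\top(b - Ax)$ of a standard-form LP, with lazy maintenance of $A^\top y$. This is only an illustration of the sparsity-aware bookkeeping behind the nnz-dependent bounds; it is not the authors' \textsc{clvr} algorithm (which adds variance reduction, extrapolation, and particular step-size rules), and the function name, step sizes, and uniform row sampling are illustrative assumptions.
```python
import numpy as np
import scipy.sparse as sp

def lazy_coord_primal_dual(A, b, c, n_iters=100_000, tau=1e-2, sigma=1e-2, seed=0):
    """Simplified sketch: randomized dual-coordinate primal-dual updates for
    min_{x >= 0} max_y  c^T x + y^T (b - A x),  with A given in CSR format.
    Illustrates lazy (sparsity-aware) bookkeeping only; not the CLVR method."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x, y = np.zeros(n), np.zeros(m)
    z = np.zeros(n)                          # running copy of A^T y, updated lazily
    for _ in range(n_iters):
        i = rng.integers(m)                  # sample one dual coordinate (one row of A)
        lo, hi = A.indptr[i], A.indptr[i + 1]
        cols, vals = A.indices[lo:hi], A.data[lo:hi]
        # dual ascent on y_i: the gradient b_i - a_i^T x touches only nnz(a_i) entries of x
        delta = sigma * (b[i] - vals @ x[cols])
        y[i] += delta
        z[cols] += delta * vals              # lazy update of A^T y on the row's support
        # projected primal descent on the same support; the partial gradient is c - A^T y
        x[cols] = np.maximum(0.0, x[cols] - tau * (c[cols] - z[cols]))
    return x, y

if __name__ == "__main__":
    # tiny synthetic instance (illustrative only)
    rng = np.random.default_rng(1)
    A = sp.random(200, 500, density=0.01, format="csr", random_state=1)
    x_true = np.maximum(0.0, rng.standard_normal(500))
    b, c = A @ x_true, rng.random(500)
    x_hat, y_hat = lazy_coord_primal_dual(A, b, c)
```
Because $A$ is stored in CSR format, each iteration reads and writes only the nonzeros of the sampled row, which is the access pattern underlying complexity bounds that scale with the number of nonzero elements of the constraint matrix rather than its dimensions.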
Related papers
- Recovering Simultaneously Structured Data via Non-Convex Iteratively
Reweighted Least Squares [0.8702432681310401]
We propose a new algorithm for recovering data that adheres to multiple, heterogeneous low-dimensional structures from linear observations.
We show that the IRLS method performs favorably in identifying such simultaneously structured data from linear measurements.
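For orientation, the following is a generic, textbook-style IRLS sketch for the basic sparse-recovery problem $\min \|x\|_1$ s.t. $Ax = y$; it is not the non-convex, simultaneously structured variant proposed in this paper, and the smoothing schedule, iteration count, and full-row-rank assumption on $A$ are illustrative choices.
```python
import numpy as np

def irls_l1(A, y, n_iters=50, eps=1e-3):
    """Textbook IRLS sketch for  min ||x||_1  s.t.  A x = y  (A assumed full row rank).
    Each iteration solves the weighted least-squares problem  min x^T W x  s.t.  A x = y
    with W = diag(1 / (|x_i| + eps)), whose closed form is
    x = W^{-1} A^T (A W^{-1} A^T)^{-1} y."""
    x = np.linalg.lstsq(A, y, rcond=None)[0]     # minimum-norm least-squares start
    for _ in range(n_iters):
        w_inv = np.abs(x) + eps                  # entries of diag(W)^{-1}
        AWinv = A * w_inv                        # A @ diag(w_inv) via broadcasting
        x = w_inv * (A.T @ np.linalg.solve(AWinv @ A.T, y))
        eps = max(0.5 * eps, 1e-10)              # gradually tighten the smoothing
    return x
```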
arXiv Detail & Related papers (2023-06-08T06:35:47Z) - Constrained Optimization via Exact Augmented Lagrangian and Randomized
Iterative Sketching [55.28394191394675]
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonconvex optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LIBSVM, and a PDE-constrained problem.
arXiv Detail & Related papers (2023-05-28T06:33:37Z) - An inexact LPA for DC composite optimization and application to matrix completions with outliers [5.746154410100363]
This paper concerns a class of composite optimization problems.
By leveraging the composite structure, we provide a condition for the potential function to have the KL property of $1/2$ at the iterate sequence.
arXiv Detail & Related papers (2023-03-29T16:15:34Z) - High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
arXiv Detail & Related papers (2022-02-25T16:35:26Z) - Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z) - Smooth Bilevel Programming for Sparse Regularization [5.177947445379688]
Iteratively reweighted least square (IRLS) is a popular approach to solve sparsity-enforcing regression problems in machine learning.
We show how a surprisingly simple reparametrization of IRLS, coupled with a bilevel resolution, achieves top performance on a wide range of sparsity-enforcing regularization problems.
arXiv Detail & Related papers (2021-06-02T19:18:22Z) - Hybrid Trilinear and Bilinear Programming for Aligning Partially
Overlapping Point Sets [85.71360365315128]
In many applications, we need algorithms which can align partially overlapping point sets and are invariant to the corresponding transformation, building on the robust point matching (RPM) algorithm.
We first show that the objective is a cubic polynomial function. We then utilize the convex envelopes of trilinear and bilinear monomials to derive its lower bound.
We next develop a branch-and-bound (BnB) algorithm which only branches over the transformation variables and runs efficiently.
arXiv Detail & Related papers (2021-01-19T04:24:23Z) - Conditional gradient methods for stochastically constrained convex
minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
arXiv Detail & Related papers (2020-07-07T21:26:35Z) - Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
arXiv Detail & Related papers (2020-06-10T15:00:09Z)