Coordinate Methods for Matrix Games
- URL: http://arxiv.org/abs/2009.08447v1
- Date: Thu, 17 Sep 2020 17:55:03 GMT
- Title: Coordinate Methods for Matrix Games
- Authors: Yair Carmon, Yujia Jin, Aaron Sidford, Kevin Tian
- Abstract summary: We develop primal-dual coordinate methods for solving bilinear saddle-point problems of the form $\min_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} y^\top A x$.
Our methods push existing sublinear methods towards their limits in terms of per-iteration complexity and sample complexity.
- Score: 41.821881312775496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop primal-dual coordinate methods for solving bilinear saddle-point
problems of the form $\min_{x \in \mathcal{X}} \max_{y\in\mathcal{Y}} y^\top A
x$ which contain linear programming, classification, and regression as special
cases. Our methods push existing fully stochastic sublinear methods and
variance-reduced methods towards their limits in terms of per-iteration
complexity and sample complexity. We obtain nearly-constant per-iteration
complexity by designing efficient data structures leveraging Taylor
approximations to the exponential and a binomial heap. We improve sample
complexity via low-variance gradient estimators using dynamic sampling
distributions that depend on both the iterates and the magnitude of the matrix
entries.
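As an illustration of the low-variance estimator idea, here is a minimal sketch under our own assumptions (the function name and the $\ell_2$-norm weighting are ours, not the paper's exact construction): estimate the $y$-side gradient $Ax$ from a single sampled column, with a sampling distribution that mixes iterate mass and matrix-entry magnitudes.

```python
import numpy as np

def sampled_gradient_y(A, x, rng):
    """One-sample unbiased estimator of the y-side gradient A @ x.

    Illustrative only: column j is drawn with probability proportional to
    x_j * ||A[:, j]||_2, so indices carrying more iterate mass or larger
    matrix entries are sampled more often. Assumes x is entrywise
    nonnegative (e.g., a point in the simplex). The paper's actual
    estimators and supporting data structures are more refined.
    """
    col_norms = np.linalg.norm(A, axis=0)   # ||A[:, j]||_2 for each column j
    weights = x * col_norms                 # iterate mass times matrix magnitude
    p = weights / weights.sum()
    j = rng.choice(A.shape[1], p=p)
    return A[:, j] * (x[j] / p[j])          # E[estimate] = A @ x

# Usage: rng = np.random.default_rng(0); g = sampled_gradient_y(A, x, rng)
```

Since $\mathbb{E}[A_{:,j}\, x_j / p_j] = \sum_j A_{:,j} x_j = Ax$, the estimate is unbiased, and the non-uniform $p$ lowers its variance relative to uniform sampling when column magnitudes vary.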
Our runtime bounds improve upon those of existing primal-dual methods by a
factor depending on sparsity measures of the $m$ by $n$ matrix $A$. For
example, when rows and columns have constant $\ell_1/\ell_2$ norm ratios, we
offer improvements by a factor of $m+n$ in the fully stochastic setting and
$\sqrt{m+n}$ in the variance-reduced setting. We apply our methods to
computational geometry problems, i.e. minimum enclosing ball, maximum inscribed
ball, and linear regression, and obtain improved complexity bounds. For linear
regression with an elementwise nonnegative matrix, our guarantees improve on
exact gradient methods by a factor of $\sqrt{\mathrm{nnz}(A)/(m+n)}$.
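To read the last factor concretely (a direct substitution, for illustration): a dense elementwise nonnegative matrix has $\mathrm{nnz}(A) = mn$, so the improvement over exact gradient methods is $\sqrt{mn/(m+n)}$; for a square matrix with $m = n = 10^4$ this is $\sqrt{n/2} \approx 71$.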
Related papers
- Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method [10.742859956268655]
This paper aims to recover the intrinsic model parameters given the leverage scores gradient.
We specifically scrutinize the inversion of the leverage score gradient, denoted as $g(x)$.
arXiv Detail & Related papers (2024-08-21T01:39:42Z) - Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization [49.58290066287418]
We propose a novel method named Multi-block-Single-probe Variance Reduced (MSVR) estimator to alleviate the complexity of compositional problems.
Our results improve upon prior ones in several aspects, including the order of sample complexities and the dependence on the strong convexity parameter.
arXiv Detail & Related papers (2022-07-18T12:03:26Z) - Hardness and Algorithms for Robust and Sparse Optimization [17.842787715567436]
We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression.
Specifically, the sparse linear regression problem seeks a $k$-sparse vector $x \in \mathbb{R}^d$ to minimize $\|Ax - b\|$.
The robust linear regression problem seeks a set $S$ that ignores at most $k$ rows and a vector $x$ to minimize $\|(Ax - b)_S\|$.
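To make the two objectives concrete, a minimal sketch (the function names and the choice of $\ell_2$ norm are ours, for illustration only):

```python
import numpy as np

def sparse_objective(A, b, x, k):
    """||Ax - b||_2 when x is k-sparse; +inf marks an infeasible x."""
    if np.count_nonzero(x) > k:
        return np.inf
    return np.linalg.norm(A @ x - b)

def robust_objective(A, b, x, k):
    """||(Ax - b)_S||_2 where, for a fixed x, the best set S drops the
    k rows with the largest absolute residuals."""
    r = np.abs(A @ x - b)
    kept = np.sort(r)[: len(r) - k]   # keep all but the k largest residuals
    return np.linalg.norm(kept)
```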
arXiv Detail & Related papers (2022-06-29T01:40:38Z) - Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity [121.83085611327654]
We study structured convex optimization problems with additive objective $r := p + q$, where $r$ is $\mu$-strongly convex.
We propose a method that solves such problems through communication between a master node and agents combined with local computation calls.
The proposed bound is much sharper than the $\mathcal{O}(\sqrt{L_q/\mu})$ rate of existing methods.
arXiv Detail & Related papers (2022-05-30T14:28:02Z) - Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming [53.63469275932989]
We consider online statistical inference of constrained nonlinear optimization problems.
We apply a Stochastic Sequential Quadratic Programming (StoSQP) method to solve these problems.
arXiv Detail & Related papers (2022-05-27T00:34:03Z) - Fast and Near-Optimal Diagonal Preconditioning [46.240079312553796]
We show how to best improve $\mathbf{A}$'s condition number by left or right diagonal rescaling.
We give a solver for structured mixed packing and covering semidefinite programs which computes a constant-factor optimal scaling for $\mathbf{A}$ in $\widetilde{O}(\mathrm{nnz}(\mathbf{A}) \cdot \mathrm{poly}(\kappa^\star))$ time.
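As a point of reference for what a diagonal rescaling does, here is a simple classical baseline (our own sketch, not the paper's SDP-based solver):

```python
import numpy as np

def jacobi_condition_numbers(A):
    """Condition number of a symmetric positive-definite A before and
    after the classic Jacobi rescaling D^{-1/2} A D^{-1/2}, D = diag(A).

    Baseline only: the paper's solver instead finds a scaling within a
    constant factor of the best achievable condition number kappa*.
    """
    d = np.sqrt(np.diag(A))
    A_jac = A / np.outer(d, d)   # entry (i, j) becomes A_ij / sqrt(A_ii * A_jj)
    return np.linalg.cond(A), np.linalg.cond(A_jac)
```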
arXiv Detail & Related papers (2020-08-04T17:53:28Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is ...
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
arXiv Detail & Related papers (2020-06-15T21:55:34Z)