Sparse Convex Optimization via Adaptively Regularized Hard Thresholding
- URL: http://arxiv.org/abs/2006.14571v1
- Date: Thu, 25 Jun 2020 17:16:21 GMT
- Title: Sparse Convex Optimization via Adaptively Regularized Hard Thresholding
- Authors: Kyriakos Axiotis and Maxim Sviridenko
- Abstract summary: We present a new Adaptively Regularized Hard Thresholding (ARHT) algorithm that makes significant progress on this problem.
We also provide a new analysis of OMP with Replacement (OMPR) for general $f$, under the condition $s > s^* \frac{\kappa^2}{4}$.
- Score: 17.60502131429094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of Sparse Convex Optimization is to optimize a convex function $f$
under a sparsity constraint $s\leq s^*\gamma$, where $s^*$ is the target number
of non-zero entries in a feasible solution (sparsity) and $\gamma\geq 1$ is an
approximation factor. There has been a lot of work to analyze the sparsity
guarantees of various algorithms (LASSO, Orthogonal Matching Pursuit (OMP),
Iterative Hard Thresholding (IHT)) in terms of the Restricted Condition Number
$\kappa$. The best known algorithms guarantee to find an approximate solution
of value $f(x^*)+\epsilon$ with the sparsity bound of $\gamma =
O\left(\kappa\min\left\{\log \frac{f(x^0)-f(x^*)}{\epsilon},
\kappa\right\}\right)$, where $x^*$ is the target solution. We present a new
Adaptively Regularized Hard Thresholding (ARHT) algorithm that makes
significant progress on this problem by bringing the bound down to
$\gamma=O(\kappa)$, which has been shown to be tight for a general class of
algorithms including LASSO, OMP, and IHT. This is achieved without significant
sacrifice in the runtime efficiency compared to the fastest known algorithms.
We also provide a new analysis of OMP with Replacement (OMPR) for general $f$,
under the condition $s > s^* \frac{\kappa^2}{4}$, which yields Compressed
Sensing bounds under the Restricted Isometry Property (RIP). When compared to
other Compressed Sensing approaches, it has the advantage of providing a strong
tradeoff between the RIP condition and the solution sparsity, while working for
any general function $f$ that meets the RIP condition.
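To make the hard-thresholding idea concrete, below is a minimal, illustrative sketch (not the paper's exact ARHT procedure): an IHT-style loop on a least-squares objective with a ridge term whose weight is shrunk whenever progress stalls. The choice of objective, step size, and shrinkage schedule for `lam` are assumptions made for this example only.

```python
# Minimal, illustrative sketch of hard thresholding with an adaptive
# regularizer, in the spirit of ARHT as described in the abstract above.
# The least-squares objective, step size, and the rule for shrinking the
# regularization weight `lam` are assumptions for this example only.
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x; zero out the rest."""
    out = np.zeros_like(x)
    if s > 0:
        keep = np.argsort(np.abs(x))[-s:]
        out[keep] = x[keep]
    return out

def regularized_iht(A, b, s, lam=1.0, shrink=0.5, n_iters=300):
    """Approximately minimize 0.5*||Ax - b||^2 over s-sparse x, taking projected
    gradient steps on the ridge-regularized objective and relaxing lam whenever
    the unregularized objective stops improving."""
    x = np.zeros(A.shape[1])
    lip = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient of f
    prev_obj = np.inf
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b) + lam * x   # gradient of f(x) + 0.5*lam*||x||^2
        x = hard_threshold(x - grad / (lip + lam), s)
        obj = 0.5 * np.linalg.norm(A @ x - b) ** 2
        if obj > prev_obj - 1e-10:           # progress stalled: weaken the regularizer
            lam *= shrink
        prev_obj = obj
    return x

# Tiny synthetic sparse-recovery example.
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 120))
x_true = np.zeros(120)
x_true[:5] = rng.standard_normal(5)
b = A @ x_true
x_hat = regularized_iht(A, b, s=10)
print("residual norm:", np.linalg.norm(A @ x_hat - b))
```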
Related papers
- Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error [25.777423855881878]
We give a novel elimination-based algorithm showing that one can obtain an $O\left(H\epsilon\right)$-optimal policy.
We complement our upper bound with an $\widetilde{\Omega}\left(H\epsilon\right)$-optimality lower bound, giving a complete picture of this problem.
arXiv Detail & Related papers (2024-07-18T15:58:04Z) - Lower Bounds on the Worst-Case Complexity of Efficient Global
Optimization [11.523746174066702]
We derive a unified lower bound for the complexity of efficient global optimization in terms of the metric entropy of a ball in its corresponding reproducing kernel Hilbert space (RKHS).
We show that this lower bound nearly matches the upper bound attained by non-adaptive search algorithms for the commonly used squared exponential kernel and the Matérn kernel.
arXiv Detail & Related papers (2022-09-20T11:57:13Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed confidence setting under a generative model.
The lower bound, as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Iterative Hard Thresholding with Adaptive Regularization: Sparser
Solutions Without Sacrificing Runtime [17.60502131429094]
We propose a simple modification to the iterative hard thresholding algorithm, which recovers asymptotically sparser solutions as a function of the condition number.
Our proposed algorithm, regularized IHT, returns a solution with sparsity $O(s\kappa)$.
Our algorithm significantly improves in runtime over ARHT, which also finds a solution of sparsity $O(s\kappa)$.
arXiv Detail & Related papers (2022-04-11T19:33:15Z) - Nearly Optimal Policy Optimization with Stable at Any Time Guarantee [53.155554415415445]
The regret bound of the policy-based method in Shani et al. (2020) is only $\tilde{O}(\sqrt{SAH^3K} + \sqrt{AH^4K})$, where $S$ is the number of states, $A$ is the number of actions, $H$ is the horizon, and $K$ is the number of episodes; there is a $\sqrt{SH}$ gap compared with the information theoretic lower bound $\tilde{\Omega}(\sqrt{SAH^3K})$.
arXiv Detail & Related papers (2021-12-21T01:54:17Z) - Gaussian Process Bandit Optimization with Few Batches [49.896920704012395]
We introduce a batch algorithm inspired by finite-arm bandit algorithms.
We show that it achieves the cumulative regret upper bound $O^\ast(\sqrt{T\gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$.
In addition, we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches.
arXiv Detail & Related papers (2021-10-15T00:54:04Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Local Search Algorithms for Rank-Constrained Convex Optimization [7.736462653684946]
We propose greedy and local search algorithms for rank-constrained convex optimization.
We show that if the rank-restricted condition number of $R$ is $\kappa$, then a solution $A$ with rank $O\left(r^*\cdot\min\left\{\kappa\log\frac{R(\mathbf{0})-R(A^*)}{\epsilon}, \kappa^2\right\}\right)$ and $R(A) \leq R(A^*) + \epsilon$ can be recovered, where $A^*$ is an optimal solution of rank $r^*$.
arXiv Detail & Related papers (2021-01-15T18:52:02Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Agnostic Q-learning with Function Approximation in Deterministic
Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
We study the problem of agnostic $Q$-learning with function approximation in deterministic systems.
We show that if $\delta = O\left(\rho/\sqrt{\dim_E}\right)$, then one can find the optimal policy using $O\left(\dim_E\right)$ trajectories.
arXiv Detail & Related papers (2020-02-17T18:41:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.