A Newton Frank-Wolfe Method for Constrained Self-Concordant Minimization
- URL: http://arxiv.org/abs/2002.07003v1
- Date: Mon, 17 Feb 2020 15:28:31 GMT
- Title: A Newton Frank-Wolfe Method for Constrained Self-Concordant Minimization
- Authors: Deyi Liu, Volkan Cevher, Quoc Tran-Dinh
- Abstract summary: We demonstrate how to scalably solve a class of constrained self-concordant minimization problems using linear minimization oracles (LMO) over the constraint set.
We prove that the number of LMO calls of our method is nearly the same as that of the Frank-Wolfe method in the L-smooth case.
- Score: 60.90222082871258
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We demonstrate how to scalably solve a class of constrained self-concordant
minimization problems using linear minimization oracles (LMO) over the
constraint set. We prove that the number of LMO calls of our method is nearly
the same as that of the Frank-Wolfe method in the L-smooth case. Specifically,
our Newton Frank-Wolfe method uses $\mathcal{O}(\epsilon^{-\nu})$ LMO's, where
$\epsilon$ is the desired accuracy and $\nu:= 1 + o(1)$. In addition, we
demonstrate how our algorithm can exploit the improved variants of the
LMO-based schemes, including away-steps, to attain linear convergence rates. We
also provide numerical evidence with portfolio design with the competitive
ratio, D-optimal experimental design, and logistic regression with the elastic
net where Newton Frank-Wolfe outperforms the state-of-the-art.
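The LMO-based scheme the abstract refers to can be illustrated with a minimal sketch of the classical Frank-Wolfe loop (not the paper's Newton variant) over the probability simplex, where one LMO call reduces to selecting the best vertex. All function names below are illustrative, not from the paper.

```python
import numpy as np

def lmo_simplex(grad):
    """LMO over the probability simplex: argmin_{s in simplex} <grad, s>.
    The minimizer is the vertex (standard basis vector) at the smallest
    gradient coordinate, so one LMO call costs only O(n)."""
    s = np.zeros_like(grad)
    s[np.argmin(grad)] = 1.0
    return s

def frank_wolfe(grad_f, x0, num_iters=5000):
    """Vanilla Frank-Wolfe: exactly one LMO call per iteration."""
    x = x0.copy()
    for k in range(num_iters):
        s = lmo_simplex(grad_f(x))       # the only oracle access to the set
        gamma = 2.0 / (k + 2.0)          # classical open-loop step size
        x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Example: minimize f(x) = 0.5 * ||x - c||^2 over the simplex; since c lies
# in the simplex, the iterates approach c itself.
c = np.array([0.2, 0.5, 0.3])
x_star = frank_wolfe(lambda x: x - c, np.array([1.0, 0.0, 0.0]))
```

Each iteration touches the constraint set only through `lmo_simplex`, which is why the paper counts complexity in LMO calls rather than projections.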
Related papers
- Settling the Sample Complexity of Online Reinforcement Learning [92.02082223856479]
We show how to achieve minimax-optimal regret without incurring any burn-in cost.
We extend our theory to unveil the influence of problem-dependent quantities such as the optimal value/cost and certain variances.
arXiv Detail & Related papers (2023-07-25T15:42:11Z)
- A Multistep Frank-Wolfe Method [2.806911268410107]
We study the zig-zagging phenomenon in the Frank-Wolfe method as an artifact of discretization.
We propose multistep Frank-Wolfe variants where the truncation errors decay as $O(\Delta^p)$, where $p$ is the method's order.
arXiv Detail & Related papers (2022-10-14T21:12:01Z)
- Using Taylor-Approximated Gradients to Improve the Frank-Wolfe Method for Empirical Risk Minimization [1.4504054468850665]
For Empirical Risk Minimization, we present a Frank-Wolfe variant that uses Taylor-approximated gradients, with computational guarantees.
We show that our methods exhibit very significant speedups on real-world binary classification datasets.
We also present a novel adaptive step-size approach for which we have computational guarantees.
arXiv Detail & Related papers (2022-08-30T00:08:37Z)
- Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z)
- Scalable Frank-Wolfe on Generalized Self-concordant Functions via Simple Steps [66.88729048402082]
Generalized self-concordance is a key property present in the objective function of many learning problems.
We show improved convergence rates for various common cases, e.g., when the feasible region under consideration is uniformly convex or polyhedral.
arXiv Detail & Related papers (2021-05-28T15:26:36Z)
- RNN Training along Locally Optimal Trajectories via Frank-Wolfe Algorithm [50.76576946099215]
We propose a novel and efficient training method for RNNs that iteratively seeks a local minimum on the loss surface within a small region.
Surprisingly, even with the additional cost, the method's overall training cost is empirically observed to be lower than that of back-propagation.
arXiv Detail & Related papers (2020-10-12T01:59:18Z)
- Projection Efficient Subgradient Method and Optimal Nonsmooth Frank-Wolfe Method [54.93433440034386]
We find a feasible $\epsilon$-suboptimal solution using only $O(\epsilon^{-1})$ PO calls and optimal $O(\epsilon^{-2})$ FO calls.
Our experiments confirm that these methods achieve significant speedups over the state-of-the-art, for a problem with costly PO and LMO calls.
arXiv Detail & Related papers (2020-10-05T08:16:56Z)
- Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization [30.980431848476925]
We propose an algorithm for constrained smooth finite-sum minimization with a generalized linear prediction/structure.
The proposed method is simple to implement, does not require step-size tuning, and has a constant per-iteration cost independent of the dataset size.
We provide an implementation of all considered methods in an open-source package.
arXiv Detail & Related papers (2020-02-27T00:47:21Z)
- Self-Concordant Analysis of Frank-Wolfe Algorithms [3.3598755777055374]
In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function having unbounded curvature.
We use the theory of SC functions to provide a new adaptive step size for FW methods and prove a global convergence rate of $O(1/k)$ after $k$ iterations.
arXiv Detail & Related papers (2020-02-11T11:30:33Z)
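The adaptive step sizes discussed in the self-concordant Frank-Wolfe literature above can be approximated, very schematically, by a backtracking (Armijo) line search: on a self-concordant objective like $-\sum_i \log x_i$, whose curvature blows up at the boundary, shrinking the step keeps iterates strictly feasible. This is a generic sketch under that assumption, not the SC-specific step-size rule from the paper.

```python
import numpy as np

def lmo_simplex(grad):
    """LMO over the probability simplex: best vertex for the linearized objective."""
    s = np.zeros_like(grad)
    s[np.argmin(grad)] = 1.0
    return s

def fw_backtracking(f, grad_f, x0, iters=1000):
    """Frank-Wolfe with Armijo backtracking: halve gamma until sufficient
    decrease holds, so iterates never reach the boundary where a
    self-concordant barrier objective diverges."""
    x = x0.copy()
    for _ in range(iters):
        g = grad_f(x)
        s = lmo_simplex(g)
        d = s - x
        gap = -np.dot(g, d)          # Frank-Wolfe duality gap
        if gap < 1e-8:               # stop when (near-)optimal
            break
        gamma = 0.99                 # never step all the way to a vertex
        while f(x + gamma * d) > f(x) - 0.5 * gamma * gap:  # Armijo test
            gamma *= 0.5
        x = x + gamma * d
    return x

# Self-concordant barrier over the simplex; its minimizer is the uniform point.
f = lambda x: -np.sum(np.log(x))
grad = lambda x: -1.0 / x
x = fw_backtracking(f, grad, np.array([0.7, 0.2, 0.1]))
```

The backtracking loop plays the role of an adaptive step size: no global smoothness constant is ever needed, which is exactly what unbounded curvature rules out for the fixed $2/(k+2)$ schedule.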
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.