Boosting as Frank-Wolfe
- URL: http://arxiv.org/abs/2209.10831v1
- Date: Thu, 22 Sep 2022 07:36:55 GMT
- Title: Boosting as Frank-Wolfe
- Authors: Ryotaro Mitsuboshi, Kohei Hatano, Eiji Takimoto
- Abstract summary: We propose a generic boosting scheme that combines the Frank-Wolfe algorithm and any secondary algorithm.
We show that the scheme retains the same convergence guarantee as ERLPBoost and C-ERLPBoost.
- Score: 0.6875312133832078
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Some boosting algorithms, such as LPBoost, ERLPBoost, and C-ERLPBoost, aim to
solve the soft margin optimization problem with the $\ell_1$-norm
regularization. LPBoost rapidly converges to an $\epsilon$-approximate solution
in practice, but it is known to take $\Omega(m)$ iterations in the worst case,
where $m$ is the sample size. On the other hand, ERLPBoost and C-ERLPBoost are
guaranteed to converge to an $\epsilon$-approximate solution in
$O(\frac{1}{\epsilon^2} \ln \frac{m}{\nu})$ iterations. However, the
computation cost per iteration is much higher than that of LPBoost.
To address this issue, we propose a generic boosting scheme that combines the
Frank-Wolfe algorithm and any secondary algorithm and switches one to the other
iteratively. We show that the scheme retains the same convergence guarantee as
ERLPBoost and C-ERLPBoost. One can incorporate any secondary algorithm to
improve practical performance. This scheme comes from a unified view of boosting
algorithms for soft margin optimization. More specifically, we show that
LPBoost, ERLPBoost, and C-ERLPBoost are instances of the Frank-Wolfe algorithm.
In experiments on real datasets, one of the instances of our scheme exploits
the better updates of the secondary algorithm and performs comparably with
LPBoost.
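For readers who want to map the abstract onto concrete steps, the following is a minimal sketch, not the authors' implementation. It assumes the standard dual view used throughout the LPBoost line of work: the $\ell_1$-regularized soft margin problem is usually stated as $\max_{\rho,\, w \in \Delta,\, \xi \ge 0}\ \rho - \frac{1}{\nu}\sum_{i=1}^{m}\xi_i$ subject to $y_i \sum_h w_h h(x_i) \ge \rho - \xi_i$ for all $i$, and its dual minimizes the maximum edge over the capped simplex $\{d : \sum_i d_i = 1,\ 0 \le d_i \le 1/\nu\}$. The sketch maintains such a distribution $d$ over the $m$ examples; each round it asks a weak learner for a hypothesis, then updates $d$ either by a Frank-Wolfe step on an entropy-regularized surrogate of the maximum edge or by whatever a plugged-in secondary algorithm proposes, keeping the proposal with the better objective value. The helper names (`capped_simplex_lmo`, `surrogate_objective`, `weak_learner`, `secondary_step`) and the particular smoothed surrogate are illustrative choices, not taken from the paper.

```python
import numpy as np


def capped_simplex_lmo(grad, nu):
    """Linear minimization oracle over {d : sum(d) = 1, 0 <= d_i <= 1/nu}:
    greedily assigns mass 1/nu to the coordinates with the smallest gradient."""
    q = np.zeros_like(grad)
    budget = 1.0
    for i in np.argsort(grad):          # smallest gradient entries first
        q[i] = min(1.0 / nu, budget)
        budget -= q[i]
        if budget <= 0.0:
            break
    return q


def surrogate_objective(d, margins, eta):
    """Entropy-regularized soft-max of the edges (an ERLPBoost-style surrogate).

    margins has one row per hypothesis; margins[j, i] = y_i * h_j(x_i)."""
    m = len(d)
    edges = margins @ d
    softmax_edges = np.log(np.sum(np.exp(eta * edges))) / eta
    rel_entropy = np.sum(d * np.log(np.clip(d * m, 1e-12, None)))
    return softmax_edges + rel_entropy / eta


def surrogate_gradient(d, margins, eta):
    m = len(d)
    edges = margins @ d
    p = np.exp(eta * (edges - edges.max()))
    p /= p.sum()                         # softmax weights over hypotheses
    return margins.T @ p + (np.log(np.clip(d * m, 1e-12, None)) + 1.0) / eta


def boost_fw(margin_fn, weak_learner, secondary_step, m, nu, eta, T):
    """Generic FW-based boosting sketch.

    margin_fn(h)         -> length-m vector (y_i * h(x_i))_i,
    weak_learner(d)      -> hypothesis with (approximately) maximal edge under d,
    secondary_step(d, W) -> any distribution in the capped simplex (for example,
                            an LPBoost-style update); all three are user-supplied.
    """
    d = np.full(m, 1.0 / m)              # uniform initial distribution
    margins = np.empty((0, m))           # one row per hypothesis found so far
    for t in range(1, T + 1):
        h = weak_learner(d)
        margins = np.vstack([margins, margin_fn(h)])

        # Frank-Wolfe proposal: LMO step with the standard 2/(t+2) step size.
        grad = surrogate_gradient(d, margins, eta)
        q = capped_simplex_lmo(grad, nu)
        d_fw = (1.0 - 2.0 / (t + 2)) * d + (2.0 / (t + 2)) * q

        # Secondary proposal from the plugged-in algorithm.
        d_sec = secondary_step(d, margins)

        # Keep whichever proposal attains the smaller surrogate objective.
        if surrogate_objective(d_fw, margins, eta) <= surrogate_objective(d_sec, margins, eta):
            d = d_fw
        else:
            d = d_sec
    return d, margins
```

With `secondary_step = lambda d, W: d` the loop degenerates to a plain Frank-Wolfe booster; plugging in a totally corrective solve over the capped simplex would mimic the LPBoost-style secondary updates the abstract alludes to. Either way, the switch keeps whichever proposal is better per round, which is how the scheme can exploit fast secondary updates without losing the Frank-Wolfe convergence guarantee.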
Related papers
- The Many Faces of Optimal Weak-to-Strong Learning [10.985323882432086]
We present a new and surprisingly simple Boosting algorithm that obtains a provably optimal sample complexity.
Our pilot empirical study suggests that our new algorithm might outperform previous algorithms on large data sets.
arXiv Detail & Related papers (2024-08-30T09:38:51Z) - How to Boost Any Loss Function [63.573324901948716]
We show that loss functions can be efficiently optimized with boosting.
We show that boosting can achieve a feat not yet known to be possible in the classical $0^{\text{th}}$ order setting.
arXiv Detail & Related papers (2024-07-02T14:08:23Z) - The Cost of Parallelizing Boosting [1.9235143628887907]
We study the cost of parallelizing weak-to-strong boosting algorithms for learning.
We show that even "slight" parallelization of boosting requires an exponential blow-up in the complexity of training.
arXiv Detail & Related papers (2024-02-23T07:03:52Z) - Boosting with Tempered Exponential Measures [40.961820572346426]
A popular ML algorithm, AdaBoost, can be derived from the dual of a relative entropy minimization problem.
We generalize this setup to the recently introduced tempered exponential measures (TEMs), where normalization is enforced on a specific power of the measure rather than on the measure itself.
Our algorithm, $t$-AdaBoost, recovers AdaBoost as a special case ($t=1$).
arXiv Detail & Related papers (2023-06-08T18:17:48Z) - Gaussian Process Bandit Optimization with Few Batches [49.896920704012395]
We introduce a batch algorithm inspired by finite-arm bandit algorithms.
We show that it achieves the cumulative regret upper bound $O^{\ast}(\sqrt{T\gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$.
In addition, we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches.
arXiv Detail & Related papers (2021-10-15T00:54:04Z) - BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications.
We provide a useful analysis framework for both the constrained and unconstrained optimization.
arXiv Detail & Related papers (2021-06-21T20:16:40Z) - Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve a complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z) - Revisiting Modified Greedy Algorithm for Monotone Submodular Maximization with a Knapsack Constraint [75.85952446237599]
We show that a modified greedy algorithm can achieve an approximation factor of $0.305$.
We derive a data-dependent upper bound on the optimum.
It can also be used to significantly improve the efficiency of such algorithms as branch and bound.
arXiv Detail & Related papers (2020-08-12T15:40:21Z) - On the Dual Formulation of Boosting Algorithms [92.74617630106559]
We show that the Lagrange dual problems of AdaBoost, LogitBoost, and soft-margin LPBoost with generalized hinge loss are all entropy maximization problems.
By looking at the dual problems of these boosting algorithms, we show that the success of boosting can be understood in terms of maintaining a better margin distribution.
arXiv Detail & Related papers (2009-01-23T02:14:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.