Multi-block Min-max Bilevel Optimization with Applications in Multi-task
Deep AUC Maximization
- URL: http://arxiv.org/abs/2206.00260v1
- Date: Wed, 1 Jun 2022 06:42:36 GMT
- Title: Multi-block Min-max Bilevel Optimization with Applications in Multi-task
Deep AUC Maximization
- Authors: Quanqi Hu, Yongjian Zhong, Tianbao Yang
- Abstract summary: We present a single-loop algorithm to tackle a multi-block min-max bilevel optimization problem.
We show that our algorithm can be used to tackle problems with hundreds of tasks.
- Score: 36.743632418094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study multi-block min-max bilevel optimization problems,
where the upper level is a non-convex strongly-concave minimax objective and the
lower level is a strongly convex objective, and there are multiple blocks of
dual variables and lower level problems. Due to the intertwined multi-block
min-max bilevel structure, the computational cost at each iteration could be
prohibitively high, especially with a large number of blocks. To tackle this
challenge, we present a single-loop randomized stochastic algorithm, which
requires updates for only a constant number of blocks at each iteration. Under
some mild assumptions on the problem, we establish its sample complexity of
$\mathcal{O}(1/\epsilon^4)$ for finding an $\epsilon$-stationary point. This
matches the optimal complexity for solving stochastic nonconvex optimization
under a general unbiased stochastic oracle model. Moreover, we provide two
applications of the proposed method in multi-task deep AUC (area under ROC
curve) maximization and multi-task deep partial AUC maximization. Experimental
results validate our theory and demonstrate the effectiveness of our method on
problems with hundreds of tasks.
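To make the block-sampling idea concrete, below is a minimal, self-contained Python sketch of a single-loop update that touches only a constant number of blocks per iteration. The toy quadratic objective, the step sizes, the sampling size B, and the naive hypergradient approximation are all illustrative assumptions for this sketch; it is not the paper's algorithm or analysis.

```python
# Conceptual sketch of randomized block sampling in a single-loop min-max
# bilevel update. The toy objective below is an illustrative assumption,
# NOT the paper's problem:
#   min_x (1/m) sum_i max_{y_i} [ y_i * c_i^T z_i*(x) - 0.5 * y_i^2 ],
#   where z_i*(x) = argmin_z 0.5 * ||z - A_i x||^2 (strongly convex lower level).
import numpy as np

rng = np.random.default_rng(0)
m, d = 100, 10                       # number of blocks (tasks), dimension
A = rng.standard_normal((m, d, d)) / np.sqrt(d)
c = rng.standard_normal((m, d))

x = rng.standard_normal(d)           # shared primal variable
y = np.zeros(m)                      # one dual variable per block
z = np.zeros((m, d))                 # one lower-level iterate per block

eta_x, eta_y, eta_z = 1e-2, 1e-1, 1e-1
B = 5                                # constant number of blocks updated per iteration

for t in range(2000):
    blocks = rng.choice(m, size=B, replace=False)
    grad_x = np.zeros(d)
    for i in blocks:
        # Lower-level step: one gradient-descent step on g_i(x, z) = 0.5*||z - A_i x||^2.
        z[i] -= eta_z * (z[i] - A[i] @ x)
        # Dual step: gradient ascent on f_i, which is strongly concave in y_i.
        y[i] += eta_y * (c[i] @ z[i] - y[i])
        # Crude hypergradient w.r.t. x: plug the current z_i into z_i*(x) = A_i x
        # and differentiate (exact only because this toy lower level is quadratic).
        grad_x += y[i] * (A[i].T @ c[i])
    # Averaging over the sampled blocks gives an unbiased estimate of the
    # gradient averaged over all m blocks.
    x -= eta_x * grad_x / B

print("final ||x|| =", np.linalg.norm(x))
```

The point of the sketch is only that the per-iteration cost stays constant in the total number of blocks m; the paper's algorithm relies on carefully constructed stochastic estimators to obtain the $\mathcal{O}(1/\epsilon^4)$ sample complexity stated above.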
Related papers
- A First-Order Multi-Gradient Algorithm for Multi-Objective Bi-Level Optimization [7.097069899573992]
We study the Multi-Objective Bi-Level Optimization (MOBLO) problem.
Existing gradient-based MOBLO algorithms need to compute the Hessian matrix.
We propose an efficient first-order multi-gradient method for MOBLO, called FORUM.
arXiv Detail & Related papers (2024-01-17T15:03:37Z) - Blockwise Stochastic Variance-Reduced Methods with Parallel Speedup for
Multi-Block Bilevel Optimization [43.74656748515853]
Non-convex multi-block bilevel optimization problems involve $m \gg 1$ lower level problems and have important applications in machine learning.
We aim to achieve three properties for our algorithm: (a) matching the state-of-the-art complexity of standard BO problems with a single block; (b) achieving parallel speedup by sampling $I$ samples for each sampled block per iteration; and (c) avoiding the computation of the inverse of a high-dimensional Hessian matrix estimator.
arXiv Detail & Related papers (2023-05-30T04:10:11Z) - A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk
Minimization [23.401092942765196]
We propose a bilevel extension of the celebrated SARAH algorithm.
We demonstrate that the algorithm requires $\mathcal{O}((n+m)^{\frac{1}{2}}\varepsilon^{-1})$ oracle calls to achieve $\varepsilon$-stationarity.
We provide a lower bound on the number of oracle calls required to get an approximate stationary point of the objective function of the bilevel problem.
arXiv Detail & Related papers (2023-02-17T09:04:18Z) - Functional Constrained Optimization for Risk Aversion and Sparsity
Control [7.561780884831967]
Risk and sparsity requirements need to be enforced simultaneously in many applications, e.g., in portfolio optimization, assortment planning, and radiation planning.
We propose a Level Conditional Gradient (LCG) method, a projection-free method that generates a sparse trajectory for these problems.
The method approximates the optimal value through a level-set scheme and calls an inner conditional gradient oracle (CGO) for solving the mini-max subproblems.
arXiv Detail & Related papers (2022-10-11T02:51:51Z) - Min-Max Bilevel Multi-objective Optimization with Applications in
Machine Learning [30.25074797092709]
This paper is the first to propose a min-max bilevel multi-objective optimization framework.
It highlights applications in representation learning and hyper-objective learning.
arXiv Detail & Related papers (2022-03-03T18:56:13Z) - A Constrained Optimization Approach to Bilevel Optimization with
Multiple Inner Minima [49.320758794766185]
We propose a new approach, which converts the bilevel problem into an equivalent constrained optimization problem that can then be solved by a primal-dual algorithm.
Such an approach enjoys several advantages, including (a) addressing the multiple-inner-minima challenge and (b) fully first-order efficiency without Jacobian computations.
arXiv Detail & Related papers (2022-03-01T18:20:01Z) - Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player to play against smooth algorithms deployed by the adversary.
Our algorithm is guaranteed to make monotonic progress (thus having no limit cycles) and to find an appropriate stationary point in a polynomial number of iterations.
arXiv Detail & Related papers (2021-06-02T22:03:36Z) - Randomized Stochastic Variance-Reduced Methods for Stochastic Bilevel
Optimization [62.87181271021217]
We consider non-convex stochastic bilevel optimization (SBO) problems that have many applications in machine learning.
This paper proposes fast randomized stochastic algorithms for non-convex SBO problems.
arXiv Detail & Related papers (2021-05-05T18:28:42Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) algorithm -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in the iterations caused by inexact solutions to the lower level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis
and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale stochastic approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z) - Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave
Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to their applications in deep learning (e.g., deep AUC maximization).
arXiv Detail & Related papers (2020-06-12T00:32:21Z)