Min-Max Bilevel Multi-objective Optimization with Applications in
Machine Learning
- URL: http://arxiv.org/abs/2203.01924v1
- Date: Thu, 3 Mar 2022 18:56:13 GMT
- Title: Min-Max Bilevel Multi-objective Optimization with Applications in
Machine Learning
- Authors: Alex Gu, Songtao Lu, Parikshit Ram, Lily Weng
- Abstract summary: This paper is the first to propose a min-max bilevel multi-objective optimization framework.
It highlights applications in representation learning and hyperparameter optimization.
- Score: 30.25074797092709
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper is the first to propose a generic min-max bilevel multi-objective
optimization framework, highlighting applications in representation learning
and hyperparameter optimization. In many machine learning applications such as
meta-learning, multi-task learning, and representation learning, a subset of
the parameters are shared by all the tasks, while each specific task has its
own set of additional parameters. By leveraging recent advances in
nonconvex min-max optimization, we propose a gradient descent-ascent bilevel
optimization algorithm (MORBiT) that extracts a set of shared
parameters that is robust across all tasks and further overcomes the
distributional shift between training and testing tasks. Theoretical analyses
show that MORBiT converges to the first-order stationary point at a rate of
$\mathcal{O}(\sqrt{n}K^{-2/5})$ for a class of nonconvex problems, where $K$
denotes the total number of iterations and $n$ denotes the number of tasks.
Overall, we formulate a min-max bilevel multi-objective optimization problem,
provide a single loop two-timescale algorithm with convergence rate guarantees,
and show theoretical bounds on the generalization abilities of the optimizer.
Experimental results on sinusoid regression and representation learning
showcase the superiority of MORBiT over state-of-the-art methods, validating
our convergence and generalization results.
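Concretely, a min-max bilevel multi-objective problem of this form can be sketched as follows (the notation below is an illustrative assumption, not quoted from the paper):

```latex
\min_{x \in \mathcal{X}} \; \max_{i \in [n]} \; f_i\bigl(x, y_i^*(x)\bigr)
\qquad \text{s.t.} \qquad
y_i^*(x) \in \operatorname*{arg\,min}_{y_i} \; g_i(x, y_i), \quad i = 1, \dots, n,
```

where $x$ collects the parameters shared across the $n$ tasks, $y_i$ are the task-specific lower-level parameters, $f_i$ and $g_i$ are the upper- and lower-level losses of task $i$, and the max over tasks enforces robustness to the worst-performing task.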
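To make the single-loop, two-timescale structure concrete, here is a minimal Python sketch on toy quadratic tasks. The objectives, step sizes, and the exponentiated-gradient ascent on simplex weights (a common smoothing of the max over tasks) are all illustrative assumptions, and the sketch omits the implicit hypergradient correction that MORBiT's analysis relies on; it is a structural illustration, not the paper's algorithm.

```python
# Hedged sketch of a single-loop, two-timescale gradient descent-ascent
# scheme for a min-max bilevel multi-objective problem. All objectives,
# step sizes, and the simplex-weighted reformulation of the max over
# tasks are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
n_tasks, dim = 4, 3

# Toy data defining per-task objectives (illustrative assumptions).
B = rng.standard_normal((n_tasks, dim, dim))   # lower-level targets: y ~ B_i x
c = rng.standard_normal((n_tasks, dim))        # upper-level targets
mu = 0.1                                       # regularization weight


def f(i, x, y):
    """Upper-level objective of task i."""
    return 0.5 * np.sum((y - c[i]) ** 2) + 0.5 * mu * np.sum(x ** 2)


def grad_f_x(i, x, y):
    """Direct partial gradient of f_i in x (the full MORBiT update would
    also add an implicit hypergradient term through y_i*(x))."""
    return mu * x


def grad_g_y(i, x, y):
    """Lower-level gradient in y for g_i(x, y) = 0.5 * ||y - B_i x||^2."""
    return y - B[i] @ x


x = np.zeros(dim)
ys = np.zeros((n_tasks, dim))
lam = np.full(n_tasks, 1.0 / n_tasks)          # simplex weights for the max
alpha, beta, gamma = 0.5, 0.05, 0.05           # two timescales: alpha >> beta

for k in range(500):
    # Faster timescale: one lower-level descent step per task.
    for i in range(n_tasks):
        ys[i] -= alpha * grad_g_y(i, x, ys[i])
    # Slower timescale: descent on the shared x under the current weights...
    x -= beta * sum(lam[i] * grad_f_x(i, x, ys[i]) for i in range(n_tasks))
    # ...and mirror (exponentiated-gradient) ascent on the task weights.
    vals = np.array([f(i, x, ys[i]) for i in range(n_tasks)])
    lam *= np.exp(gamma * vals)
    lam /= lam.sum()

print("worst-task objective:", vals.max(), "weights:", np.round(lam, 3))
```

The key design point mirrored here is that the lower-level variables run on a faster timescale (larger step size) than the shared parameters and task weights, which is what lets a single loop track the lower-level solutions $y_i^*(x)$.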
Related papers
- Global Optimization of Gaussian Process Acquisition Functions Using a Piecewise-Linear Kernel Approximation [2.3342885570554652]
We introduce a piecewise-linear approximation for Gaussian process kernels and a corresponding MIQP representation for acquisition functions.
We empirically demonstrate the framework on synthetic functions, constrained benchmarks, and hyperparameter tuning tasks.
arXiv Detail & Related papers (2024-10-22T10:56:52Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least square support vector machines (LSSVMs).
arXiv Detail & Related papers (2023-08-30T14:28:26Z) - Pareto Manifold Learning: Tackling multiple tasks via ensembles of
single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution.
We propose Pareto Manifold Learning, an ensembling method in weight space.
arXiv Detail & Related papers (2022-10-18T11:20:54Z) - Decentralized Gossip-Based Stochastic Bilevel Optimization over
Communication Networks [42.76623191830371]
We propose a gossip-based distributed bilevel optimization algorithm.
Agents can solve both the networked inner-level and outer-level problems in a single timescale.
Our algorithm achieves state-of-the-art efficiency and test accuracy.
arXiv Detail & Related papers (2022-06-22T06:38:54Z) - Multi-block Min-max Bilevel Optimization with Applications in Multi-task
Deep AUC Maximization [36.743632418094]
We present a single-loop algorithm to tackle a multi-block min-max bilevel optimization problem.
We show that our algorithm can be used to tackle problems with hundreds of tasks.
arXiv Detail & Related papers (2022-06-01T06:42:36Z) - Bilevel Optimization for Machine Learning: Algorithm Design and
Convergence Analysis [12.680169619392695]
This thesis provides a comprehensive convergence rate analysis for bilevel optimization algorithms.
For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms.
We then develop accelerated bilevel algorithms, for which we provide sharper convergence analysis with relaxed assumptions.
arXiv Detail & Related papers (2021-07-31T22:05:47Z) - Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player to play against smooth algorithms deployed by the adversary.
Our algorithm is guaranteed to make monotonic progress (having no limit cycles) and to find an appropriate stationary point in a polynomial number of gradient evaluations.
arXiv Detail & Related papers (2021-06-02T22:03:36Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep neural networks and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stochastic bilevel optimizer named stocBiO, featuring a sample-efficient hypergradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)