Min-Max Bilevel Multi-objective Optimization with Applications in
Machine Learning
- URL: http://arxiv.org/abs/2203.01924v1
- Date: Thu, 3 Mar 2022 18:56:13 GMT
- Title: Min-Max Bilevel Multi-objective Optimization with Applications in
Machine Learning
- Authors: Alex Gu, Songtao Lu, Parikshit Ram, Lily Weng
- Abstract summary: This paper is the first to propose a min-max bilevel multi-objective optimization framework.
It highlights applications in representation learning and hyperparameter optimization.
- Score: 30.25074797092709
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper is the first to propose a generic min-max bilevel multi-objective
optimization framework, highlighting applications in representation learning
and hyperparameter optimization. In many machine learning applications such as
meta-learning, multi-task learning, and representation learning, a subset of
the parameters are shared by all the tasks, while each specific task has its
own set of additional parameters. By leveraging recent advances in
nonconvex min-max optimization, we propose a gradient descent-ascent bilevel
optimization algorithm (MORBiT) that extracts a set of shared
parameters that is robust across all tasks and further overcomes the
distributional shift between training and testing tasks. Theoretical analyses
show that MORBiT converges to the first-order stationary point at a rate of
$\mathcal{O}(\sqrt{n}K^{-2/5})$ for a class of nonconvex problems, where $K$
denotes the total number of iterations and $n$ denotes the number of tasks.
Overall, we formulate a min-max bilevel multi-objective optimization problem,
provide a single loop two-timescale algorithm with convergence rate guarantees,
and show theoretical bounds on the generalization abilities of the optimizer.
Experimental results on sinusoid regression and representation learning
showcase the superiority of MORBiT over state-of-the-art methods, validating
our convergence and generalization results.
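Concretely, a min-max bilevel multi-objective problem of this form can be sketched as follows (the notation below is an illustrative assumption, not quoted from the paper):

```latex
\min_{x \in \mathcal{X}} \; \max_{i \in [n]} \; f_i\bigl(x, y_i^*(x)\bigr)
\qquad \text{s.t.} \qquad
y_i^*(x) \in \operatorname*{arg\,min}_{y_i} \; g_i(x, y_i), \quad i = 1, \dots, n,
```

where $x$ collects the parameters shared across the $n$ tasks, $y_i$ are the task-specific lower-level parameters, $f_i$ and $g_i$ are the upper- and lower-level losses of task $i$, and the max over tasks enforces robustness to the worst-performing task.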
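To make the single-loop, two-timescale structure concrete, here is a minimal Python sketch on toy quadratic tasks. The objectives, step sizes, and the exponentiated-gradient ascent on simplex weights (a common smoothing of the max over tasks) are all illustrative assumptions, and the sketch omits the implicit hypergradient correction that MORBiT's analysis relies on; it is a structural illustration, not the paper's algorithm.

```python
# Hedged sketch of a single-loop, two-timescale gradient descent-ascent
# scheme for a min-max bilevel multi-objective problem. All objectives,
# step sizes, and the simplex-weighted reformulation of the max over
# tasks are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
n_tasks, dim = 4, 3

# Toy data defining per-task objectives (illustrative assumptions).
B = rng.standard_normal((n_tasks, dim, dim))   # lower-level targets: y ~ B_i x
c = rng.standard_normal((n_tasks, dim))        # upper-level targets
mu = 0.1                                       # regularization weight


def f(i, x, y):
    """Upper-level objective of task i."""
    return 0.5 * np.sum((y - c[i]) ** 2) + 0.5 * mu * np.sum(x ** 2)


def grad_f_x(i, x, y):
    """Direct partial gradient of f_i in x (the full MORBiT update would
    also add an implicit hypergradient term through y_i*(x))."""
    return mu * x


def grad_g_y(i, x, y):
    """Lower-level gradient in y for g_i(x, y) = 0.5 * ||y - B_i x||^2."""
    return y - B[i] @ x


x = np.zeros(dim)
ys = np.zeros((n_tasks, dim))
lam = np.full(n_tasks, 1.0 / n_tasks)          # simplex weights for the max
alpha, beta, gamma = 0.5, 0.05, 0.05           # two timescales: alpha >> beta

for k in range(500):
    # Faster timescale: one lower-level descent step per task.
    for i in range(n_tasks):
        ys[i] -= alpha * grad_g_y(i, x, ys[i])
    # Slower timescale: descent on the shared x under the current weights...
    x -= beta * sum(lam[i] * grad_f_x(i, x, ys[i]) for i in range(n_tasks))
    # ...and mirror (exponentiated-gradient) ascent on the task weights.
    vals = np.array([f(i, x, ys[i]) for i in range(n_tasks)])
    lam *= np.exp(gamma * vals)
    lam /= lam.sum()

print("worst-task objective:", vals.max(), "weights:", np.round(lam, 3))
```

The key design point mirrored here is that the lower-level variables run on a faster timescale (larger step size) than the shared parameters and task weights, which is what lets a single loop track the lower-level solutions $y_i^*(x)$.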
Related papers
- Global Optimization of Gaussian Process Acquisition Functions Using a Piecewise-Linear Kernel Approximation [2.3342885570554652]
We introduce a piecewise-linear approximation for Gaussian process kernels and a corresponding MIQP representation for acquisition functions.
We empirically demonstrate the framework on synthetic functions, constrained benchmarks, and hyperparameter tuning tasks.
arXiv Detail & Related papers (2024-10-22T10:56:52Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least square support vector machines (LSSVMs).
arXiv Detail & Related papers (2023-08-30T14:28:26Z) - Pareto Manifold Learning: Tackling multiple tasks via ensembles of
single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution.
We propose Pareto Manifold Learning, an ensembling method in weight space.
arXiv Detail & Related papers (2022-10-18T11:20:54Z) - Decentralized Gossip-Based Stochastic Bilevel Optimization over
Communication Networks [42.76623191830371]
We propose a gossip-based distributed bilevel optimization algorithm.
Agents can solve both the networked inner-level and outer-level problems in a single timescale.
Our algorithm achieves state-of-the-art efficiency and test accuracy.
arXiv Detail & Related papers (2022-06-22T06:38:54Z) - Multi-block Min-max Bilevel Optimization with Applications in Multi-task
Deep AUC Maximization [36.743632418094]
We present a single-loop algorithm to tackle a multi-block min-max bilevel optimization problem.
We show that our algorithm can be used to tackle problems with hundreds of tasks.
arXiv Detail & Related papers (2022-06-01T06:42:36Z) - Bilevel Optimization for Machine Learning: Algorithm Design and
Convergence Analysis [12.680169619392695]
This thesis provides a comprehensive convergence rate analysis for bilevel optimization algorithms.
For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms.
We then develop accelerated bilevel algorithms, for which we provide sharper convergence analysis with relaxed assumptions.
arXiv Detail & Related papers (2021-07-31T22:05:47Z) - Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player to play against smooth algorithms deployed by the adversary.
Our algorithm is guaranteed to make monotonic progress (having no limit cycles) and to find an appropriate stationary point in a polynomial number of gradient evaluations.
arXiv Detail & Related papers (2021-06-02T22:03:36Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep neural networks and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stochastic bilevel optimizer named stocBiO, featuring a sample-efficient hypergradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)