Provably Convergent Federated Trilevel Learning
- URL: http://arxiv.org/abs/2312.11835v2
- Date: Mon, 22 Jan 2024 02:48:48 GMT
- Title: Provably Convergent Federated Trilevel Learning
- Authors: Yang Jiao, Kai Yang, Tiancheng Wu, Chengtao Jian, Jianwei Huang
- Abstract summary: Trilevel learning, also called trilevel optimization (TLO), has been recognized as a powerful modelling tool for decision process.
This paper proposes an asynchronous federated trilevel optimization method to solve TLO problems.
- Score: 19.15328400013401
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trilevel learning, also called trilevel optimization (TLO), has been
recognized as a powerful modelling tool for hierarchical decision process and
widely applied in many machine learning applications, such as robust neural
architecture search, hyperparameter optimization, and domain adaptation.
Tackling TLO problems has presented a great challenge due to their nested
decision-making structure. In addition, existing works on TLO face the
following key challenges: 1) they all focus on the non-distributed setting,
which may lead to privacy breach; 2) they do not offer any non-asymptotic
convergence analysis which characterizes how fast an algorithm converges. To
address the aforementioned challenges, this paper proposes an asynchronous
federated trilevel optimization method to solve TLO problems. The proposed
method utilizes $\mu$-cuts to construct a hyper-polyhedral approximation for
the TLO problem and solve it in an asynchronous manner. We demonstrate that the
proposed $\mu$-cuts are applicable to not only convex functions but also a wide
range of non-convex functions that meet the $\mu$-weakly convex assumption.
Furthermore, we theoretically analyze the non-asymptotic convergence rate for
the proposed method by showing its iteration complexity to obtain
$\epsilon$-stationary point is upper bounded by
$\mathcal{O}(\frac{1}{\epsilon^2})$. Extensive experiments on real-world
datasets have been conducted to elucidate the superiority of the proposed
method, e.g., it has a faster convergence rate with a maximum acceleration of
approximately 80$\%$.
Related papers
- Stochastic Momentum Methods for Non-smooth Non-Convex Finite-Sum Coupled Compositional Optimization [64.99236464953032]
We propose a new state-of-the-art complexity of $O(/epsilon)$ for finding an (nearly) $'level KKT solution.<n>By applying our hinge-of-the-art complexity of $O(/epsilon)$ for finding an (nearly) $'level KKT solution, we achieve a new state-of-the-art complexity of $O(/epsilon)$ for finding an (nearly) $'level KKT solution.
arXiv Detail & Related papers (2025-06-03T06:31:59Z) - Learning based convex approximation for constrained parametric optimization [11.379408842026981]
We propose an input neural network (ICNN)-based self-supervised learning framework to solve constrained optimization problems.<n>We provide rigorous convergence analysis, showing that the framework converges to a Karush-Kuhn-Tucker (KKT) approximation point of the original problem.<n>Our approach achieves a superior balance among accuracy, feasibility, and computational efficiency.
arXiv Detail & Related papers (2025-05-07T00:33:14Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized server (DFL) eliminates reliance on client-client architecture.<n>Non-smooth regularization is often incorporated into machine learning tasks.<n>We propose a novel novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - Unlocking TriLevel Learning with Level-Wise Zeroth Order Constraints: Distributed Algorithms and Provable Non-Asymptotic Convergence [9.95894026392039]
An effective distributed trilevel zeroth order learning framework DTZO is proposed in this work.
DTZO is versatile and can be adapted to a wide range of (grey-box) TLL problems with partial zeroth order constraints.
Extensive experiments have been conducted to demonstrate and validate the superior performance of the proposed DTZO.
arXiv Detail & Related papers (2024-12-10T02:57:35Z) - Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization [25.438298531555468]
Bilevel optimization is widely applied in many machine learning tasks such as hyper learning, meta learning and reinforcement learning.
We propose an efficient Hessian/BiO method with the optimal convergence $frac1TT) under some mild conditions.
We conduct some some experiments on the bilevel game hyper-stationary numerical convergence.
arXiv Detail & Related papers (2024-07-25T07:25:06Z) - Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions [26.543628010637036]
We introduce a novel adaptive reduction method that achieves an optimal convergence rate of $mathcalO(log T)$ for non- functions.
We also extend the proposed technique to obtain the same optimal rate of $mathcalO(log T)$ for compositional optimization.
arXiv Detail & Related papers (2024-06-04T04:39:51Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image
Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation
Constrained Optimization [88.0031283949404]
Many real-world problems have complicated non functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z) - A Variance-Reduced Stochastic Gradient Tracking Algorithm for
Decentralized Optimization with Orthogonality Constraints [7.028225540638832]
We propose a novel algorithm for decentralized optimization with orthogonality constraints.
VRSGT is the first algorithm for decentralized optimization with orthogonality constraints that reduces both sampling and communication complexities simultaneously.
In the numerical experiments, VRGTS has a promising performance in a real-world autonomous sample.
arXiv Detail & Related papers (2022-08-29T14:46:44Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of constrained decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the lower bound of $mathcal O (1/epsilon)$ for all complexity optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration of $O (1/epsilon4)$ for finding an $epsilon$stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O (1/epsilon4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Efficient Methods for Structured Nonconvex-Nonconcave Min-Max
Optimization [98.0595480384208]
We propose a generalization extraient spaces which converges to a stationary point.
The algorithm applies not only to general $p$-normed spaces, but also to general $p$-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z) - A Feasible Level Proximal Point Method for Nonconvex Sparse Constrained
Optimization [25.73397307080647]
We present a new model of a general convex or non objective machine machine objectives.
We propose an algorithm that solves a constraint with gradually relaxed point levels of each subproblem.
We demonstrate the effectiveness of our new numerical scale problems.
arXiv Detail & Related papers (2020-10-23T05:24:05Z) - Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave
Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-concave min-max problems, which have received increasing attention due to deep learning (e.g., deep AUC)
arXiv Detail & Related papers (2020-06-12T00:32:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.