Optimization-Derived Learning with Essential Convergence Analysis of
Training and Hyper-training
- URL: http://arxiv.org/abs/2206.07875v1
- Date: Thu, 16 Jun 2022 01:50:25 GMT
- Title: Optimization-Derived Learning with Essential Convergence Analysis of
Training and Hyper-training
- Authors: Risheng Liu, Xuan Liu, Shangzhi Zeng, Jin Zhang and Yixuan Zhang
- Abstract summary: We design a Generalized Krasnoselskii-Mann (GKM) scheme based on fixed-point iterations as our fundamental ODL module.
Under the GKM scheme, a Bilevel Meta Optimization (BMO) algorithmic framework is constructed to solve for the optimal training and hyper-training variables together.
- Score: 52.39882976848064
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recently, Optimization-Derived Learning (ODL) has attracted attention from
learning and vision areas, which designs learning models from the perspective
of optimization. However, previous ODL approaches regard the training and
hyper-training procedures as two separate stages, meaning that the
hyper-training variables have to be fixed during the training process; it is
thus impossible to simultaneously obtain the convergence of training and
hyper-training variables. In this work, we design a Generalized
Krasnoselskii-Mann (GKM) scheme based on fixed-point iterations as our
fundamental ODL module, which unifies existing ODL methods as special cases.
Under the GKM scheme, a Bilevel Meta Optimization (BMO) algorithmic framework
is constructed to solve for the optimal training and hyper-training variables
together. We rigorously prove the essential joint convergence of the
fixed-point iteration for training and the process of optimizing
hyper-parameters for hyper-training, in terms of both approximation quality
and stationarity analysis. Experiments demonstrate the efficiency of BMO with
competitive performance on sparse coding and real-world applications such as
image deconvolution and rain streak removal.
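For intuition, the classical Krasnoselskii-Mann (KM) scheme that GKM generalizes iterates x_{k+1} = (1 - a_k) x_k + a_k T(x_k) for a nonexpansive operator T. Below is a minimal sketch of this vanilla scheme applied to a toy sparse-coding (LASSO) instance; the operator T, problem data, and stopping rule are illustrative assumptions, not the paper's exact GKM construction or the BMO algorithm.

```python
import numpy as np

def km_iteration(T, x0, alphas, tol=1e-8, max_iter=1000):
    """Vanilla Krasnoselskii-Mann iteration: x_{k+1} = (1 - a_k) x_k + a_k T(x_k).

    Converges to a fixed point when T is nonexpansive and the relaxation
    parameters a_k stay in (0, 1) with sum a_k (1 - a_k) divergent.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        x_next = (1 - alphas[k]) * x + alphas[k] * T(x)
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Toy sparse coding: T is a proximal-gradient (ISTA) step for
# min 0.5 * ||A x - b||^2 + lam * ||x||_1, nonexpansive for step 1/L.
A = np.array([[1.0, 0.5], [0.2, 1.0]])
b = np.array([1.0, 0.5])
lam = 0.1
L = np.linalg.norm(A.T @ A, 2)  # Lipschitz constant of the smooth part

def T(x):
    z = x - (A.T @ (A @ x - b)) / L  # gradient step on the smooth part
    return np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold

x_star = km_iteration(T, np.zeros(2), alphas=[0.5] * 1000)
```
In ODL terms, the hyper-training variables would parameterize T itself (e.g., the step size or threshold), and BMO optimizes them jointly with the fixed-point iterates rather than in a separate, frozen stage.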
Related papers
- A Multi-Level Framework for Accelerating Training Transformer Models [5.268960238774481]
Training large-scale deep learning models poses an unprecedented demand for computing power.
We propose a multi-level framework for training acceleration based on Coalescing, De-coalescing and Interpolation.
We show that the proposed framework reduces the computational cost by about 20% when training BERT/GPT-Base models and by up to 51.6% when training the BERT-Large model.
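As a rough illustration of the model-growing idea (de-coalescing maps a cheaply trained small model onto a larger one), the sketch below warm-starts a deeper stack by layer duplication. This duplication map is a generic stand-in, not the paper's actual coalescing/de-coalescing/interpolation operators, which also act on widths.

```python
import copy

def decoalesce(small_layers, depth_factor=2):
    """Warm-start a deeper model by duplicating each trained shallow layer.

    Generic depth-growing map in the spirit of de-coalescing; the real
    framework additionally merges/expands widths and interpolates models.
    """
    return [copy.deepcopy(layer) for layer in small_layers
            for _ in range(depth_factor)]

# Usage: train a 6-layer stack cheaply, then initialize a 12-layer stack.
small_model = [{"name": f"layer_{i}"} for i in range(6)]  # placeholder layers
large_model = decoalesce(small_model)                     # 12 warm-started layers
```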
arXiv Detail & Related papers (2024-04-07T03:04:34Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization arises in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes algorithms for conditional stochastic optimization in the federated learning setting.
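For context, conditional stochastic optimization in this literature refers to nested objectives with an inner conditional expectation, typically of the form below (a standard formulation, paraphrased rather than quoted from the paper):

```latex
\min_{x \in \mathbb{R}^d} \; F(x)
  = \mathbb{E}_{\xi}\Big[ f_{\xi}\big( \mathbb{E}_{\eta \mid \xi}\,[\, g_{\eta}(x, \xi) \,] \big) \Big]
```

The inner expectation sits inside the nonlinear map f_xi, which is what makes unbiased gradient estimation, and hence the federated extension, nontrivial.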
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- BiERL: A Meta Evolutionary Reinforcement Learning Framework via Bilevel Optimization [34.24884427152513]
We propose a general meta evolutionary reinforcement learning (ERL) framework via bilevel optimization (BiERL).
We design an elegant meta-level architecture that embeds the inner-level's evolving experience into an informative population representation.
We perform extensive experiments on MuJoCo and Box2D tasks to verify that, as a general framework, BiERL outperforms various baselines and consistently improves learning performance across a diverse set of ERL algorithms.
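A minimal sketch of the bilevel pattern: the inner level runs a short evolutionary search, and the outer (meta) level selects hyperparameters by inner-level performance. The toy fitness function and grid-style meta-update are illustrative assumptions, not BiERL's learned population representation or RL returns.

```python
import numpy as np

rng = np.random.default_rng(0)
fitness = lambda w: -np.sum((w - 1.0) ** 2)  # toy stand-in for an RL return

def inner_es(sigma, steps=50, dim=5, pop=16):
    """Inner level: simple (1+pop) evolution strategy with mutation scale sigma."""
    w = np.zeros(dim)
    for _ in range(steps):
        cand = w + sigma * rng.standard_normal((pop, dim))
        best = cand[np.argmax([fitness(c) for c in cand])]
        if fitness(best) > fitness(w):
            w = best
    return fitness(w)

# Outer level: choose the mutation scale that maximizes inner-loop performance.
best_sigma = max([0.01, 0.1, 0.5, 1.0], key=inner_es)
```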
arXiv Detail & Related papers (2023-08-01T09:31:51Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Hierarchical Optimization-Derived Learning [58.69200830655009]
We establish a new framework, named Hierarchical ODL (HODL), to simultaneously investigate the intrinsic behaviors of optimization-derived model construction and its corresponding learning process.
This is the first theoretical guarantee for these two coupled ODL components: optimization and learning.
arXiv Detail & Related papers (2023-02-11T03:35:13Z)
- Self-Supervised Primal-Dual Learning for Constrained Optimization [19.965556179096385]
This paper studies how to train machine-learning models that directly approximate the optimal solutions of constrained optimization problems.
It proposes the idea of Primal-Dual Learning (PDL), a self-supervised training method that does not require a set of pre-solved instances or an optimization solver for training and inference.
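A minimal sketch of the primal-dual idea on a tiny equality-constrained instance, min ||x||^2 s.t. a^T x = 1: the "primal learning" step descends an augmented-Lagrangian loss, and the "dual learning" step applies an ALM-style multiplier update. The direct parameterization and step sizes are illustrative; PDL itself trains neural networks to predict primal and dual solutions across problem instances.

```python
import numpy as np

# Toy instance: min ||x||^2  s.t.  a^T x - 1 = 0 (solution: a / ||a||^2).
a = np.array([1.0, 2.0])
x, lam, rho = np.zeros(2), 0.0, 10.0

for _ in range(20):                      # outer ALM rounds
    for _ in range(200):                 # "primal training": descend the AL loss
        h = a @ x - 1.0
        grad = 2 * x + (lam + rho * h) * a
        x -= 0.01 * grad
    lam += rho * (a @ x - 1.0)           # "dual training": multiplier update

# x approaches the KKT point [0.2, 0.4] without any pre-solved labels.
```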
arXiv Detail & Related papers (2022-08-18T20:07:10Z)
- Accelerated Federated Learning with Decoupled Adaptive Optimization [53.230515878096426]
The federated learning (FL) framework enables clients to collaboratively learn a shared model while keeping their training data private.
Recently, many efforts have been made to generalize centralized adaptive optimization methods, such as SGDM, Adam, and AdaGrad, to federated settings.
This work develops novel adaptive optimization methods for FL from the perspective of the dynamics of ordinary differential equations (ODEs).
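The paper derives its methods from the ODE view of optimizer dynamics; as generic background only, the sketch below shows the common FedOpt-style pattern of applying a server-side adaptive step to the averaged client delta treated as a pseudo-gradient. It is an illustrative pattern, not this paper's specific algorithm.

```python
import numpy as np

def server_adam_step(w, client_deltas, state, lr=0.1, b1=0.9, b2=0.99, eps=1e-8):
    """One server round: treat the mean client update as a pseudo-gradient
    and apply an Adam-style step (bias correction omitted for brevity)."""
    g = -np.mean(client_deltas, axis=0)                  # pseudo-gradient
    state["m"] = b1 * state["m"] + (1 - b1) * g
    state["v"] = b2 * state["v"] + (1 - b2) * g ** 2
    return w - lr * state["m"] / (np.sqrt(state["v"]) + eps), state

# Usage with toy deltas from three clients on a 4-parameter model.
w, state = np.zeros(4), {"m": np.zeros(4), "v": np.zeros(4)}
deltas = [0.1 * np.ones(4), 0.2 * np.ones(4), 0.05 * np.ones(4)]
w, state = server_adam_step(w, deltas, state)
```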
arXiv Detail & Related papers (2022-07-14T22:46:43Z)
- Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a powerful tool for many machine learning problems.
We propose a novel algorithm named stocBiO, which features a sample-efficient hypergradient estimator.
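For reference, stocBiO-type estimators approximate the implicit hypergradient of the bilevel objective Phi(x) = f(x, y*(x)), where y*(x) = argmin_y g(x, y) is the inner solution; under standard smoothness and strong-convexity assumptions on g it takes the classical form:

```latex
\nabla \Phi(x) = \nabla_x f(x, y^*)
  - \nabla^2_{xy} g(x, y^*) \big[ \nabla^2_{yy} g(x, y^*) \big]^{-1} \nabla_y f(x, y^*)
```

Estimators of this kind gain sample efficiency by approximating the Jacobian- and Hessian-vector products in this expression from mini-batches rather than forming the matrices explicitly.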
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
- Multi-level Training and Bayesian Optimization for Economical Hyperparameter Optimization [12.92634461859467]
In this paper, we develop an effective approach to reducing the total training time required for hyperparameter optimization.
We propose a truncated additive Gaussian process model to calibrate approximate performance measurements generated by light training.
Based on the model, a sequential model-based algorithm is developed to generate the performance profile of the configuration space as well as find optimal ones.
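A minimal sketch of such a sequential model-based loop: cheap "light training" scores feed a surrogate, and an upper-confidence-bound rule picks the next configuration. A plain Gaussian process stands in for the paper's truncated additive GP, and the toy objective and UCB acquisition are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
light_eval = lambda c: -(c - 0.3) ** 2 + 0.05 * rng.standard_normal()  # cheap, noisy score

configs = rng.uniform(0, 1, (5, 1))                    # initial design (1-D config)
scores = np.array([light_eval(c[0]) for c in configs])

for _ in range(20):
    surrogate = GaussianProcessRegressor().fit(configs, scores)
    cand = rng.uniform(0, 1, (100, 1))
    mu, sd = surrogate.predict(cand, return_std=True)
    nxt = cand[np.argmax(mu + sd)]                     # UCB acquisition
    configs = np.vstack([configs, nxt])
    scores = np.append(scores, light_eval(nxt[0]))

best_config = configs[np.argmax(scores)]               # best profiled configuration
```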
arXiv Detail & Related papers (2020-07-20T09:03:02Z)