Fugu-MT Paper Translation (Abstract): Learning to Generalize Provably in Learning to Optimize

Paper overview: Learning to Generalize Provably in Learning to Optimize

arxiv url: http://arxiv.org/abs/2302.11085v1
Date: Wed, 22 Feb 2023 01:17:31 GMT
Status: Translation complete
Last updated in system: 2023-02-23 16:36:40.190231
Title: Learning to Generalize Provably in Learning to Optimize
Title (reference translation): Learning to Optimize Learning
Authors: Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, Dacheng Tao, Yingbin Liang, Zhangyang Wang
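Abstract summary: Learning to optimize (L2O) has gained popularity by automating the design of optimizers through data-driven approaches. Current L2O methods often suffer from poor generalization performance in at least two respects. The authors propose incorporating two metrics, local entropy and the Hessian, as flatness-aware regularizers into the L2O framework.
Reference score (independently computed attention score): 185.71326306329678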
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning to optimize (L2O), which automates the design of optimizers by data-driven approaches, has gained increasing popularity. However, current L2O methods often suffer from poor generalization performance in at least two respects: (i) applying the L2O-learned optimizer to unseen optimizees, in terms of lowering their loss function values (optimizer generalization, or "generalizable learning of optimizers"); and (ii) the test performance of an optimizee (itself as a machine learning model), trained by the optimizer, in terms of the accuracy over unseen data (optimizee generalization, or "learning to generalize"). While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper. We first theoretically establish an implicit connection between the local entropy and the Hessian, and hence unify their roles in the handcrafted design of generalizable optimizers as equivalent metrics of the landscape flatness of loss functions. We then propose to incorporate these two metrics as flatness-aware regularizers into the L2O framework in order to meta-train optimizers to learn to generalize, and theoretically show that such generalization ability can be learned during the L2O meta-training process and then transformed to the optimizee loss function. Extensive experiments consistently validate the effectiveness of our proposals with substantially improved generalization on multiple sophisticated L2O models and diverse optimizees. Our code is available at: https://github.com/VITA-Group/Open-L2O/tree/main/Model_Free_L2O/L2O-Entropy.
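Abstract (reference translation): Learning to optimize (L2O) has gained popularity, automating the design of optimizers through data-driven approaches. However, current L2O methods often suffer from poor generalization performance in at least two respects: (i) applying the L2O-learned optimizer to unseen optimizees, in terms of lowering their loss function values (optimizer generalization, or "generalizable learning of optimizers"); and (ii) the test performance of an optimizee (itself a machine learning model) trained by the optimizer, in terms of accuracy on unseen data (optimizee generalization, or "learning to generalize"). Optimizer generalization has been studied recently, but optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context. The paper first theoretically establishes an implicit connection between local entropy and the Hessian, and unifies their roles in the handcrafted design of generalizable optimizers as equivalent metrics of the landscape flatness of loss functions. It then incorporates these two metrics as flatness-aware regularizers into the L2O framework to meta-train optimizers that learn to generalize, and theoretically shows that such generalization ability can be learned during the L2O meta-training process and then transformed to the optimizee loss function. Extensive experiments consistently validate the effectiveness of the proposals, with improved generalization across multiple sophisticated L2O models and diverse optimizees. The code is available at: https://github.com/VITA-Group/Open-L2O/tree/main/Model_Free_L2O/L2O-Entropy.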
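
Illustrative note: the abstract names two flatness metrics, the Hessian and local entropy, used as regularizers. The sketch below is not the authors' implementation (see the linked repository for that). It only illustrates, on two toy quadratic losses, how each metric separates a sharp minimum from a flat one. All function names, the penalty weight `lam`, and the Monte Carlo form of local entropy (the log of the expected value of exp(-loss) under Gaussian perturbations with precision `gamma`) are assumptions chosen for illustration.

```python
import math
import random


def sharp_loss(w):
    # Toy quadratic with large curvature (Hessian = 10 * I). Hypothetical example.
    return 0.5 * sum(10.0 * wi * wi for wi in w)


def flat_loss(w):
    # Toy quadratic with small curvature (Hessian = 0.1 * I). Hypothetical example.
    return 0.5 * sum(0.1 * wi * wi for wi in w)


def hessian_trace(loss, w, h=1e-3):
    # Estimate the trace of the Hessian with central finite differences,
    # one coordinate at a time. A larger trace indicates a sharper landscape.
    base = loss(w)
    trace = 0.0
    for i in range(len(w)):
        plus = list(w)
        minus = list(w)
        plus[i] += h
        minus[i] -= h
        trace += (loss(plus) - 2.0 * base + loss(minus)) / (h * h)
    return trace


def local_entropy(loss, w, gamma=1.0, n_samples=2000, seed=0):
    # Monte Carlo estimate of log E[exp(-loss(w + e))], e ~ N(0, I / gamma).
    # This matches the usual local-entropy integral up to an additive constant
    # (an assumption of this sketch). A larger value indicates a flatter region.
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(gamma)
    total = 0.0
    for _ in range(n_samples):
        perturbed = [wi + rng.gauss(0.0, sigma) for wi in w]
        total += math.exp(-loss(perturbed))
    return math.log(total / n_samples)


def flatness_penalized_loss(loss, w, lam=0.1):
    # Loss plus a Hessian-trace penalty; `lam` is a hypothetical weight.
    return loss(w) + lam * hessian_trace(loss, w)


if __name__ == "__main__":
    w0 = [0.0, 0.0]
    for name, fn in (("sharp", sharp_loss), ("flat", flat_loss)):
        print(name, hessian_trace(fn, w0), local_entropy(fn, w0))
```

At the shared minimum `[0.0, 0.0]`, the sharp loss has the larger Hessian trace and the smaller local entropy, so a penalty built from either metric favors the flat loss. In this toy setting that is the sense in which the two metrics agree as measures of flatness.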

Related papers