Fugu-MT 論文翻訳(概要): Alternating Differentiation for Optimization Layers

論文の概要: Alternating Differentiation for Optimization Layers

arxiv url: http://arxiv.org/abs/2210.01802v2
Date: Mon, 24 Apr 2023 06:33:10 GMT
ステータス: 翻訳完了
システム内更新日: 2023-04-25 23:36:32.371384
Title: Alternating Differentiation for Optimization Layers
Title（参考訳）: 最適化層における交互微分
Authors: Haixiang Sun, Ye Shi, Jingya Wang, Hoang Duong Tuan, H. Vincent Poor, and Dacheng Tao
Abstract要約: そこで我々は,最適化問題を識別するAlternating Differentiation (Alt-Diff) という新しいフレームワークを開発した。 Alt-Diff はヤコビ行列の次元を特に大規模制約のある最適化のために著しく減少させることを示す。また,Alt-Diffを切断して計算速度をさらに高速化することを提案する。
参考スコア（独自算出の注目度）: 133.2668019610731
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The idea of embedding optimization problems into deep neural networks as optimization layers to encode constraints and inductive priors has taken hold in recent years. Most existing methods focus on implicitly differentiating Karush-Kuhn-Tucker (KKT) conditions in a way that requires expensive computations on the Jacobian matrix, which can be slow and memory-intensive. In this paper, we developed a new framework, named Alternating Differentiation (Alt-Diff), that differentiates optimization problems (here, specifically in the form of convex optimization problems with polyhedral constraints) in a fast and recursive way. Alt-Diff decouples the differentiation procedure into a primal update and a dual update in an alternating way. Accordingly, Alt-Diff substantially decreases the dimensions of the Jacobian matrix especially for optimization with large-scale constraints and thus increases the computational speed of implicit differentiation. We show that the gradients obtained by Alt-Diff are consistent with those obtained by differentiating KKT conditions. In addition, we propose to truncate Alt-Diff to further accelerate the computational speed. Under some standard assumptions, we show that the truncation error of gradients is upper bounded by the same order of variables' estimation error. Therefore, Alt-Diff can be truncated to further increase computational speed without sacrificing much accuracy. A series of comprehensive experiments validate the superiority of Alt-Diff.
Abstract（参考訳）: 近年では、制約や帰納的優先順位を符号化する最適化層として、ディープニューラルネットワークに最適化問題を組み込むアイデアが定着している。既存の手法のほとんどは、遅くてメモリ集約的なヤコビ行列上の高価な計算を必要とする方法でKKT条件を暗黙的に微分することに焦点を当てている。本稿では,最適化問題(特に多面体制約を伴う凸最適化問題)を高速かつ再帰的に区別する,交互微分(alt-diff)という新しい枠組みを開発した。 Alt-Diffは、分化手順を原始更新と二重更新に交互に分離する。したがって、Alt-Diffは特に大規模制約を伴う最適化のためにヤコビ行列の次元を著しく減少させ、暗黙の微分の計算速度を増大させる。 alt-diffにより得られた勾配はkkt条件の微分によって得られた勾配と一致することを示す。さらに,Alt-Diffを切断して計算速度をさらに高速化することを提案する。いくつかの標準的な仮定の下では、勾配のトランケーション誤差は変数の推定誤差の同じ順序で上界であることが示される。したがって、Alt-Diffは精度を犠牲にすることなく計算速度をさらに向上させることができる。一連の総合的な実験は、Alt-Diffの優越性を検証した。

関連論文リスト

An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness [15.656614304616006]
本稿では,上層関数が非非有界な滑らかさであり,下層関数が強く凸であるような二層最適化問題のクラスについて検討する。これらの問題は、ニューラルネットワークを用いたテキスト分類など、データ学習に大きな応用がある。
論文参考訳（メタデータ） (2024-09-28T02:30:44Z)
Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
我々は、(強に)凸 $(L0)$-smooth 関数のクラスに焦点を当て、いくつかの既存のメソッドに対する新しい収束保証を導出する。特に,スムーズなグラディエント・クリッピングを有するグラディエント・ディフレッシュと,ポリアク・ステップサイズを有するグラディエント・ディフレッシュのコンバージェンス・レートの改善を導出した。
論文参考訳（メタデータ） (2024-09-23T13:11:37Z)
Directional Smoothness and Gradient Methods: Convergence and Adaptivity [16.779513676120096]
我々は、最適化の経路に沿った目的の条件付けに依存する勾配降下に対する新しい準最適境界を開発する。我々の証明の鍵となるのは方向の滑らかさであり、これは、目的の上のバウンドを開発するために使用する勾配変動の尺度である。我々は,方向の滑らかさの知識を使わずとも,ポリアクのステップサイズと正規化GDが高速で経路依存の速度を得ることを示した。
論文参考訳（メタデータ） (2024-03-06T22:24:05Z)
ELRA: Exponential learning rate adaption gradient descent optimization method [83.88591755871734]
我々は, 高速(指数率), ab initio(超自由)勾配に基づく適応法を提案する。本手法の主な考え方は,状況認識による$alphaの適応である。これは任意の次元 n の問題に適用でき、線型にしかスケールできない。
論文参考訳（メタデータ） (2023-09-12T14:36:13Z)
Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
我々は、制約付き最適化のための一階アルゴリズムと非滑らかなシステムの間で、新しい一階アルゴリズムのクラスを設計する。これらのアルゴリズムの重要な性質は、制約がスパース変数の代わりに速度で表されることである。
論文参考訳（メタデータ） (2023-02-01T08:50:48Z)
Proximal Subgradient Norm Minimization of ISTA and FISTA [8.261388753972234]
反復収縮保持アルゴリズムのクラスに対する2乗近位次数ノルムは逆2乗率で収束することを示す。また、高速反復収縮保持アルゴリズム (FISTA) のクラスに対する2乗次次数次ノルムが、逆立方レートで収束するように加速されることも示している。
論文参考訳（メタデータ） (2022-11-03T06:50:19Z)
Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization [50.83356836818667]
勾配ランゲヴィン・ダイナミクスは非エプス最適化問題を解くための最も基本的なアルゴリズムの1つである。本稿では、このタイプの2つの変種、すなわち、分散還元ランジュバンダイナミクスと再帰勾配ランジュバンダイナミクスを示す。
論文参考訳（メタデータ） (2022-03-30T11:39:00Z)
Implicit differentiation for fast hyperparameter selection in non-smooth convex learning [87.60600646105696]
内部最適化問題が凸であるが非滑らかである場合の一階法を研究する。本研究では, ヤコビアンの近位勾配降下と近位座標降下収率列の前方モード微分が, 正確なヤコビアンに向かって収束していることを示す。
論文参考訳（メタデータ） (2021-05-04T17:31:28Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。