Fugu-MT Paper Translation (Abstract): Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization

Paper abstract: Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization

arxiv url: http://arxiv.org/abs/2312.03218v1
Date: Wed, 6 Dec 2023 01:16:10 GMT
Status: Translation complete
Last updated in system: 2023-12-07 16:15:28.757907
Title: Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization
Title (reference translation): Accelerating Instance-Faster Optimization with Adaptive Subspace Search
Authors: Yuanshi Liu, Hanzhen Zhao, Yang Xu, Pengyun Yue, Cong Fang
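Abstract summary: Gradient-based minimax optimal algorithms have promoted the development of continuous optimization and machine learning. This paper presents a new approach to designing and analyzing gradient-based algorithms, with direct applications to machine learning.
Reference score (independently computed attention score): 6.896308995955336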
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Gradient-based minimax optimal algorithms have greatly promoted the development of continuous optimization and machine learning. One seminal work due to Yurii Nesterov [Nes83a] established $\tilde{\mathcal{O}}(\sqrt{L/\mu})$ gradient complexity for minimizing an $L$-smooth $\mu$-strongly convex objective. However, an ideal algorithm would adapt to the explicit complexity of a particular objective function and incur faster rates for simpler problems, triggering our reconsideration of two defects of existing optimization modeling and analysis. (i) Worst-case optimality is neither instance optimality nor optimality in practice. (ii) The traditional $L$-smoothness condition may not be the primary abstraction/characterization for modern practical problems. In this paper, we open up a new way to design and analyze gradient-based algorithms with direct applications in machine learning, including linear regression and beyond. We introduce two factors $(\alpha, \tau_{\alpha})$ to refine the description of the degenerate condition of the optimization problems, based on the observation that the singular values of the Hessian often drop sharply. We design adaptive algorithms that solve simpler problems, without prior knowledge, with reduced gradient or analogous oracle accesses. The algorithms also improve the state-of-the-art complexities for several problems in machine learning, thereby solving the open problem of how to design faster algorithms in light of the known complexity lower bounds. Specifically, with the nuclear norm bounded by $\mathcal{O}(1)$, we achieve an optimal $\tilde{\mathcal{O}}(\mu^{-1/3})$ (vs. $\tilde{\mathcal{O}}(\mu^{-1/2})$) gradient complexity for linear regression. We hope this work prompts a rethinking of how to understand the difficulty of modern problems in optimization.
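Abstract (reference translation): Gradient-based minimax optimal algorithms have greatly advanced the development of continuous optimization and machine learning. A seminal work by Yurii Nesterov [Nes83a] established a gradient complexity of $\tilde{\mathcal{O}}(\sqrt{L/\mu})$ for minimizing an $L$-smooth, $\mu$-strongly convex objective. An ideal algorithm, however, would adapt to the explicit complexity of the particular objective function and achieve faster rates on simpler problems, which prompts a reconsideration of two shortcomings of existing optimization modeling and analysis. (i) Worst-case optimality is neither instance optimality nor optimality in practice. (ii) The traditional $L$-smoothness condition may not be the primary abstraction/characterization of modern practical problems. This paper opens up a new way to design and analyze gradient-based algorithms with direct applications to machine learning, including linear regression and beyond. We introduce two factors $(\alpha, \tau_{\alpha})$ to refine the description of the degenerate condition of optimization problems, based on the observation that the singular values of the Hessian often drop sharply. We design adaptive algorithms that solve simpler problems, without prior knowledge, using fewer gradient or analogous oracle accesses. The algorithms also improve the state-of-the-art complexities for several problems in machine learning, thereby solving the open problem of how to design faster algorithms in light of the known complexity lower bounds. In particular, with the nuclear norm bounded by $\mathcal{O}(1)$, we obtain an optimal $\tilde{\mathcal{O}}(\mu^{-1/3})$ (vs. $\tilde{\mathcal{O}}(\mu^{-1/2})$) gradient complexity for linear regression. We hope this work prompts a rethinking of how to understand the difficulty of modern problems in optimization.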
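
For orientation, the $\tilde{\mathcal{O}}(\sqrt{L/\mu})$ baseline cited in the abstract can be illustrated with a small sketch. This is the classical constant-momentum accelerated gradient method, not the adaptive subspace algorithm proposed in the paper. The diagonal quadratic objective, the function name `iterations_to_tol`, the example spectrum `h`, the starting point, and the tolerance are all assumptions chosen for illustration; the spectrum is picked so that a few eigenvalues are large and the rest drop sharply, loosely mirroring the observation about Hessian singular values.

```python
import math


def iterations_to_tol(h, tol=1e-8, accelerated=False, max_iters=200_000):
    """Count gradient steps until f(x) = 0.5 * sum(h_i * x_i^2) <= tol.

    h holds the positive Hessian eigenvalues of a diagonal quadratic
    (an illustrative stand-in for a linear regression objective).
    """
    L, mu = max(h), min(h)  # smoothness and strong-convexity constants
    kappa = L / mu          # condition number
    # Constant momentum for the strongly convex case; 0 gives plain gradient descent.
    beta = (math.sqrt(kappa) - 1) / (math.sqrt(kappa) + 1) if accelerated else 0.0

    x_prev = [1.0] * len(h)  # hypothetical starting point
    x = list(x_prev)
    for k in range(1, max_iters + 1):
        # Extrapolate, then take a gradient step of size 1/L from the extrapolated point.
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev = x
        x = [yi - (hi * yi) / L for hi, yi in zip(h, y)]
        if 0.5 * sum(hi * xi * xi for hi, xi in zip(h, x)) <= tol:
            return k
    return max_iters  # tolerance not reached within the iteration budget


if __name__ == "__main__":
    # Assumed spectrum: a few large eigenvalues, then a sharp drop (kappa = 1000).
    h = [1.0, 0.3, 0.1] + [1e-3] * 7
    print("gradient descent steps:", iterations_to_tol(h))
    print("accelerated steps:     ", iterations_to_tol(h, accelerated=True))
```

On this example, plain gradient descent needs on the order of $L/\mu$ steps while the accelerated variant needs on the order of $\sqrt{L/\mu}$; the abstract's claim is that exploiting the sharp drop in the spectrum can improve further on the latter.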

Related papers