Fugu-MT 論文翻訳(概要): Adaptive Lipschitz-Free Conditional Gradient Methods for Stochastic Composite Nonconvex Optimization

論文の概要: Adaptive Lipschitz-Free Conditional Gradient Methods for Stochastic Composite Nonconvex Optimization

arxiv url: http://arxiv.org/abs/2603.06369v1
Date: Fri, 06 Mar 2026 15:20:52 GMT
ステータス: 翻訳完了
システム内更新日: 2026-03-09 13:17:46.025193
Title: Adaptive Lipschitz-Free Conditional Gradient Methods for Stochastic Composite Nonconvex Optimization
Title（参考訳）: 確率的複合非凸最適化のための適応リプシッツ自由条件勾配法
Authors: Ganzhao Yuan,
Abstract要約: ALFCG, ALFCGは, 合成非追従変種に対する最初のテクスタダプティブなプロジェクションフリーフレームワークである。オープンループ減少ステップを使用する従来の方法とは異なり、ALFCGは自己正規化アキュムレータを維持している。 ALFCG-FSが$tildemathcalO(2+-2$と$tildemathcalO(-3+-2)$to 0$を獲得し、ALFCG-MVR2が$tildemathcalO(2+を達成する
参考スコア（独自算出の注目度）: 23.28384210732827
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose ALFCG (Adaptive Lipschitz-Free Conditional Gradient), the first \textit{adaptive} projection-free framework for stochastic composite nonconvex minimization that \textit{requires neither global smoothness constants nor line search}. Unlike prior conditional gradient methods that use openloop diminishing stepsizes, conservative Lipschitz constants, or costly backtracking, ALFCG maintains a self-normalized accumulator of historical iterate differences to estimate local smoothness and minimize a quadratic surrogate model at each step. This retains the simplicity of Frank-Wolfe while adapting to unknown geometry. We study three variants. ALFCG-FS addresses finite-sum problems with a SPIDER estimator. ALFCG-MVR1 and ALFCG-MVR2 handle stochastic expectation problems by using momentum-based variance reduction with single-batch and two-batch updates, and operate under average and individual smoothness, respectively. To reach an $ε$-stationary point, ALFCG-FS attains $\mathcal{O}(N+\sqrt{N}ε^{-2})$ iteration complexity, while ALFCG-MVR1 and ALFCG-MVR2 achieve $\tilde{\mathcal{O}}(σ^2ε^{-4}+ε^{-2})$ and $\tilde{\mathcal{O}}(σε^{-3}+ε^{-2})$, where $N$ is the number of components and $σ$ is the noise level. In contrast to typical $\mathcal{O}(ε^{-4})$ or $\mathcal{O}(ε^{-3})$ rates, our bounds reduce to the optimal rate up to logarithmic factors $\tilde{\mathcal{O}}(ε^{-2})$ as the noise level $σ\to 0$. Extensive experiments on multiclass classification over nuclear norm balls and $\ell_p$ balls show that ALFCG generally outperforms state-of-the-art conditional gradient baselines.
Abstract（参考訳）: 本稿では, ALFCG (Adaptive Lipschitz-Free Conditional Gradient) を提案する。オープンループ減少段数、保守的なリプシッツ定数、あるいはコストのかかるバックトラックを使用する以前の条件勾配法とは異なり、ALFCGは、局所的な滑らかさを推定し、各ステップにおける二次代理モデルを最小化するために、歴史的反復差の自己正規化アキュレータを維持している。これは未知の幾何学に適応しながら、フランク=ウルフの単純さを保っている。我々は3つの変種を研究する。 ALFCG-FSはSPIDER推定器を用いて有限サム問題に対処する。 ALFCG-MVR1とALFCG-MVR2は、シングルバッチと2バッチの更新によるモーメントベースの分散還元を用いて確率予測問題に対処し、それぞれ平均と個別の滑らかさの下で動作する。一方、ALFCG-MVR1とALFCG-MVR2は$\tilde{\mathcal{O}}(σ^2ε^{-4}+ε^{-2})$と$\tilde{\mathcal{O}}(σε^{-3}+ε^{-2})$である。典型的な $\mathcal{O}(ε^{-4})$ または $\mathcal{O}(ε^{-3})$ とは対照的に、雑音レベル $σ\to 0$ として対数係数 $\tilde{\mathcal{O}}(ε^{-2})$ までの最適速度に還元される。核ノルムボールと$\ell_p$ボールの多クラス分類に関する大規模な実験は、ALFCGが一般に最先端の条件勾配ベースラインより優れていることを示している。

関連論文リスト

Progressive Power Homotopy for Non-convex Optimization [5.737648067191245]
我々は$max_bmwinmathbbdmathbbdmathbbdmathR bmxsimmathcalDという形の非最適化のための新しい一階法を提案する。我々は,Prog-PowerHPが大域的最適化の小さな近傍に収束し,およそ$O(d2varepsilon-2)$であることを示した。
論文参考訳（メタデータ） (2026-01-22T12:44:25Z)
Decentralized Stochastic Nonconvex Optimization under the Relaxed Smoothness [21.090579632247707]
分散正規化勾配勾配(DNS)という新しいアルゴリズムを提案する。 DNSは各ローカルエージェントで$bold$ilonポイントを達成することができる。以上より, $mathcal O(m-1(L_fsigma2Delta_fepsilon-4 + sigma2epsilon-2 + L_f-1)$ per agent。
論文参考訳（メタデータ） (2025-09-10T16:17:19Z)
Adaptive Extrapolated Proximal Gradient Methods with Variance Reduction for Composite Nonconvex Finite-Sum Minimization [7.9047096855236125]
本稿では, 適応外挿 Prox Gradient (AEPG) 法である sf-SPIDER を提案する。固定点を見つけるために$jascalO(N + sqrtN epsilon)$を統合する。従来の手法と比較して, sf AEPG-SPIDERの優れた性能を示す実験を行った。
論文参考訳（メタデータ） (2025-02-28T14:37:56Z)
Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes [29.466981306355066]
固定学習率$eta$の勾配降下はスムーズな関数を表す局所最小値しか見つからないことを示す。また、$n$のデータポイントのサポートの厳密な内部で、$widetildeO(n-4/5)$のほぼ最適MSE境界を証明します。
論文参考訳（メタデータ） (2024-06-10T22:57:27Z)
Enhanced Adaptive Gradient Algorithms for Nonconvex-PL Minimax Optimization [41.28002701420715]
ミニマックス最適化は多くの機械学習タスクに広く応用されている。我々は,本手法が特定の型に依存することなく,最もよく知られたサンプル複雑性を有することを示す。
論文参考訳（メタデータ） (2023-03-07T15:33:12Z)
ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
本稿では,BFG凸最適化(SCO: Reweighted Query (ReSQue) 推定ツールを提案する。我々はSCOの並列およびプライベート設定における最先端の複雑さを実現するアルゴリズムを開発した。
論文参考訳（メタデータ） (2023-01-01T18:51:29Z)
Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization [52.25843977506935]
有限サム構造をもつ$L$-smooth, non-deuction関数に対して, AdaSpider と呼ばれる適応分散法を提案する。そうすることで、$tildeOleft + st/epsilonコールで$epsilon-stationaryポイントを計算することができます。
論文参考訳（メタデータ） (2022-11-03T14:41:46Z)
Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning [77.22019100456595]
通信周波数の異なる分散計算作業者のトレーニングアルゴリズムを示す。本研究では,より厳密な収束率を$mathcalO!!(sigma2-2_avg!)とする。また,不均一性の項は,作業者の平均遅延によっても影響されることを示した。
論文参考訳（メタデータ） (2022-06-16T17:10:57Z)
Perseus: A Simple and Optimal High-Order Method for Variational Inequalities [81.32967242727152]
VI は、$langle F(x), x - xstarrangle geq 0$ for all $x in MathcalX$ であるように、mathcalX$ で $xstar を見つける。そこで本稿では,テキストitが行探索を必要とせず,$O(epsilon-2/(p+1))$で弱解に確実に収束する$pth$-order法を提案する。
論文参考訳（メタデータ） (2022-05-06T13:29:14Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
学習者が生成モデル$Y = langle X,w* rangle + epsilon$から$n$のサンプルにアクセスできるような高次元頑健な線形回帰問題について検討する。 i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance, (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
論文参考訳（メタデータ） (2020-07-16T06:44:44Z)
Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
期待される正方形損失から、最も適合した単一ニューロンを学習することの問題点を考察する。 ReLUアクティベーションでは、我々の人口リスク保証は$O(mathsfOPT1/2)+epsilon$である。 ReLUアクティベーションでは、我々の人口リスク保証は$O(mathsfOPT1/2)+epsilon$である。
論文参考訳（メタデータ） (2020-05-29T07:20:35Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。