Fugu-MT Paper Translation (Abstract): Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model

Paper summary: Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model

arxiv url: http://arxiv.org/abs/2606.15219v1
Date: Sat, 13 Jun 2026 09:34:39 GMT
Status: Translation complete
Last updated in system: 2026-06-16 16:21:33.061358
Title: Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model
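Title (reference translation): Can Neural Networks Achieve the Optimal Computational-Statistical Tradeoff? An Analysis via the Single-Index Model
Authors: Siyu Chen, Beining Wu, Miao Lu, Zhuoran Yang, Tianhao Wang
Abstract summary: This paper proposes a gradient-based algorithm for training a two-layer neural network in polynomial time. The algorithm is shown to learn a feature representation that strongly aligns with the unknown signal $θ^\star$. The approach is extended to the setting where $θ^\star$ is $k$-sparse for $k = o(\sqrt{d})$.
Reference score (independently computed attention score): 53.6316818897326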
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this work, we tackle the following question: Can neural networks trained with gradient-based methods achieve the optimal computational-statistical tradeoff in learning Gaussian single-index models? Prior research has shown that any polynomial-time algorithm under the statistical query (SQ) framework requires $Ω(d^{s^\star/2}\lor d)$ samples, where $s^\star$ is the generative exponent representing the intrinsic difficulty of learning the underlying model. However, it remains unknown whether neural networks can achieve this sample complexity. Inspired by prior techniques such as label transformation and landscape smoothing for learning single-index models, we propose a unified gradient-based algorithm for training a two-layer neural network in polynomial time. Our method is adaptable to a variety of loss and activation functions, covering a broad class of existing approaches. We show that our algorithm learns a feature representation that strongly aligns with the unknown signal $θ^\star$, with sample complexity $\widetilde{O} (d^{s^\star/2} \lor d)$, matching the SQ lower bound up to a polylogarithmic factor for all generative exponents $s^\star\geq 1$. Furthermore, we extend our approach to the setting where $θ^\star$ is $k$-sparse for $k = o(\sqrt{d})$ by introducing a novel weight perturbation technique that leverages the sparsity structure. We derive a corresponding SQ lower bound of order $\widetilde{Ω}(k^{s^\star})$, matched by our method up to a polylogarithmic factor. Our framework, especially the weight perturbation technique, is of independent interest, and suggests potential gradient-based solutions to other problems such as sparse tensor PCA.
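Abstract (reference translation): Can neural networks trained with gradient-based methods achieve the optimal computational-statistical tradeoff when learning Gaussian single-index models? Prior work has shown that any polynomial-time algorithm in the statistical query (SQ) framework requires $Ω(d^{s^\star/2}\lor d)$ samples. However, it is not known whether neural networks can attain this sample complexity. Inspired by prior techniques for learning single-index models, such as label transformation and landscape smoothing, we propose a unified gradient-based algorithm for training a two-layer neural network in polynomial time. The proposed method adapts to a variety of loss and activation functions and covers a broad class of existing approaches. We show that our algorithm learns a feature representation that strongly aligns with the unknown signal $θ^\star$ with sample complexity $\widetilde{O} (d^{s^\star/2} \lor d)$, matching the SQ lower bound up to a polylogarithmic factor for all generative exponents $s^\star\geq 1$. Furthermore, we extend this approach to the setting where $θ^\star$ is $k$-sparse for $k = o(\sqrt{d})$ by introducing a novel weight perturbation technique that exploits the sparsity structure. We derive a corresponding SQ lower bound of order $\widetilde{Ω}(k^{s^\star})$. Our framework, especially the weight perturbation technique, is of independent interest and suggests potential gradient-based solutions to other problems such as sparse tensor PCA.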
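To make the sample-complexity rates in the abstract concrete, here is a minimal Python sketch that evaluates $d^{s^\star/2} \lor d$ (where $\lor$ denotes the maximum) and $k^{s^\star}$ for a few generative exponents. The function names `dense_sample_bound` and `sparse_sample_bound` and the values of `d` and `k` are illustrative assumptions, not taken from the paper. Constants and polylogarithmic factors are dropped, so the numbers indicate orders of growth only.

```python
def dense_sample_bound(d: int, s_star: float) -> float:
    """Order of d^(s*/2) v d = max(d^(s*/2), d); constants and polylog factors ignored."""
    return max(d ** (s_star / 2), d)


def sparse_sample_bound(k: int, s_star: float) -> float:
    """Order of k^(s*) for the k-sparse setting; constants and polylog factors ignored."""
    return k ** s_star


if __name__ == "__main__":
    d = 10_000  # ambient dimension (hypothetical value)
    k = 10      # sparsity level, chosen well below sqrt(d) = 100 (hypothetical value)
    for s_star in (1, 2, 3, 4):
        dense = dense_sample_bound(d, s_star)
        sparse = sparse_sample_bound(k, s_star)
        print(f"s*={s_star}: dense ~ {dense:.3g}, sparse ~ {sparse:.3g}")
```

For $s^\star \le 2$ the maximum is attained by the linear term $d$, and for $s^\star > 2$ by $d^{s^\star/2}$. Because $k = o(\sqrt{d})$ implies $k^{s^\star} = o(d^{s^\star/2})$, the sparse rate is always of smaller order than the dense one.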

Related papers