Fugu-MT Paper Translation (Abstract): Power-Law Spectrum of the Random Feature Model

Paper abstract: Power-Law Spectrum of the Random Feature Model

arxiv url: http://arxiv.org/abs/2603.14578v1
Date: Sun, 15 Mar 2026 19:54:34 GMT
Status: Translation complete
Last updated in system: 2026-03-17 16:19:35.89572
Title: Power-Law Spectrum of the Random Feature Model
Title (reference translation): Power-Law Spectrum of the Random Feature Model
Authors: Elliot Paquette, Ke Liang Xiao, Yizhe Zhu
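Abstract summary: We characterize the eigenvalues of the population random-feature covariance $\mathbb{E}_{x}[\frac{1}{d}f(W^\top x)^{\otimes 2}]$. For all $1 \leq j \leq c_1 d \log^{-(p+1)}(d)$, the $j$-th eigenvalue is of order $\left(\log^{p-1}(j+1)/j\right)^α$.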
Reference score (independently computed attention measure): 11.318593165494724
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Scaling laws for neural networks, in which the loss decays as a power-law in the number of parameters, data, and compute, depend fundamentally on the spectral structure of the data covariance, with power-law eigenvalue decay appearing ubiquitously in vision and language tasks. A central question is whether this spectral structure is preserved or destroyed when data passes through the basic building block of a neural network: a random linear projection followed by a nonlinear activation. We study this question for the random feature model: given data $x \sim N(0,H)\in \mathbb{R}^v$ where $H$ has $α$-power-law spectrum ($λ_j(H ) \asymp j^{-α}$, $α> 1$), a Gaussian sketch matrix $W \in \mathbb{R}^{v\times d}$, and an entrywise monomial $f(y) = y^{p}$, we characterize the eigenvalues of the population random-feature covariance $\mathbb{E}_{x }[\frac{1}{d}f(W^\top x )^{\otimes 2}]$. We prove matching upper and lower bounds: for all $1 \leq j \leq c_1 d \log^{-(p+1)}(d)$, the $j$-th eigenvalue is of order $\left(\log^{p-1}(j+1)/j\right)^α$. For $ c_1 d \log^{-(p+1)}(d)\leq j\leq d$, the $j$-th eigenvalue is of order $j^{-α}$ up to a polylog factor. That is, the power-law exponent $α$ is inherited exactly from the input covariance, modified only by a logarithmic correction that depends on the monomial degree $p$. The proof combines a dyadic head-tail decomposition with Wick chaos expansions for higher-order monomials and random matrix concentration inequalities.
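Abstract (reference translation): Scaling laws for neural networks, in which the loss decays as a power law in the number of parameters, data, and compute, depend fundamentally on the spectral structure of the data covariance, and power-law eigenvalue decay appears ubiquitously in vision and language tasks. The central question is whether this spectral structure is preserved or destroyed when data passes through the basic building block of a neural network: a random linear projection followed by a nonlinear activation. We study this question for the random feature model: given data $x \sim N(0,H)\in \mathbb{R}^v$ where $H$ has an $α$-power-law spectrum ($λ_j(H) \asymp j^{-α}$, $α> 1$), a Gaussian sketch matrix $W \in \mathbb{R}^{v\times d}$, and an entrywise monomial $f(y) = y^{p}$, we characterize the eigenvalues of the population random-feature covariance $\mathbb{E}_{x}[\frac{1}{d}f(W^\top x)^{\otimes 2}]$. We prove matching upper and lower bounds: for all $1 \leq j \leq c_1 d \log^{-(p+1)}(d)$, the $j$-th eigenvalue is of order $\left(\log^{p-1}(j+1)/j\right)^α$. For $c_1 d \log^{-(p+1)}(d)\leq j\leq d$, the $j$-th eigenvalue is of order $j^{-α}$ up to a polylog factor. That is, the power-law exponent $α$ is inherited exactly from the input covariance and is modified only by a logarithmic correction that depends on the monomial degree $p$. The proof combines a dyadic head-tail decomposition with Wick chaos expansions for higher-order monomials and random matrix concentration inequalities.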
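The population covariance described in the abstract can be computed in closed form for small $p$, which makes the claimed eigenvalue decay easy to inspect numerically. The following is a minimal sketch and not the authors' code. It assumes a diagonal $H$ with $λ_j(H) = j^{-α}$, a sketch matrix $W$ with i.i.d. $N(0,1)$ entries (the abstract does not state the normalization), and $p \in \{1, 2, 3\}$. The function names `random_feature_covariance`, `sorted_spectrum`, and `loglog_slope` are hypothetical. Since $y = W^\top x \sim N(0, S)$ with $S = W^\top H W$, Isserlis' (Wick's) formula gives $\mathbb{E}[y_i y_k] = S_{ik}$, $\mathbb{E}[y_i^2 y_k^2] = S_{ii}S_{kk} + 2S_{ik}^2$, and $\mathbb{E}[y_i^3 y_k^3] = 9 S_{ii}S_{kk}S_{ik} + 6S_{ik}^3$.

```python
import numpy as np


def random_feature_covariance(v, d, alpha, p, rng):
    """Exact E_x[(1/d) f(W^T x) f(W^T x)^T] for f(y) = y**p, p in {1, 2, 3}."""
    if p not in (1, 2, 3):
        raise ValueError("this sketch only implements p in {1, 2, 3}")
    # Assumption: H is diagonal with eigenvalues j^{-alpha}, j = 1..v.
    h = np.arange(1, v + 1, dtype=float) ** (-alpha)
    # Assumption: Gaussian sketch with i.i.d. N(0, 1) entries.
    W = rng.standard_normal((v, d))
    # y = W^T x is centered Gaussian with covariance S = W^T H W.
    S = W.T @ (h[:, None] * W)
    S = 0.5 * (S + S.T)  # remove floating-point asymmetry
    s = np.diag(S)
    # Isserlis/Wick formulas for E[y_i^p y_k^p]; ** and * are entrywise.
    if p == 1:
        K = S
    elif p == 2:
        K = np.outer(s, s) + 2.0 * S**2
    else:
        K = 9.0 * np.outer(s, s) * S + 6.0 * S**3
    return K / d


def sorted_spectrum(K):
    """Eigenvalues of a symmetric matrix, largest first."""
    return np.linalg.eigvalsh(K)[::-1]


def loglog_slope(eigs, j_max):
    """Least-squares slope of log(eigenvalue_j) against log(j) for j = 1..j_max."""
    j = np.arange(1, j_max + 1)
    return np.polyfit(np.log(j), np.log(eigs[:j_max]), 1)[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for p in (1, 2, 3):
        K = random_feature_covariance(v=2000, d=200, alpha=1.5, p=p, rng=rng)
        eigs = sorted_spectrum(K)
        print(f"p={p}: fitted log-log slope over j <= 50: {loglog_slope(eigs, 50):.3f}")
```

For $p = 1$ the covariance reduces to $\frac{1}{d} W^\top H W$ and the logarithmic factor $\log^{p-1}(j+1)$ equals 1, so the stated rate is simply $j^{-α}$. At the finite sizes used here the fitted slope is only indicative, since the stated bounds hold up to constants and the first regime covers $j \leq c_1 d \log^{-(p+1)}(d)$.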
