Fugu-MT paper translation (abstract): Random matrices in service of ML footprint: ternary random features with no performance loss

Paper overview: Random matrices in service of ML footprint: ternary random features with no performance loss

arxiv url: http://arxiv.org/abs/2110.01899v1
Date: Tue, 5 Oct 2021 09:33:49 GMT
Status: translation complete
Last updated in system: 2021-10-06 21:02:48.415842
Title: Random matrices in service of ML footprint: ternary random features with no performance loss
Authors: Hafiz Tiomoko Ali, Zhenyu Liao, Romain Couillet
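Abstract summary: We show that the eigenspectrum of ${\bf K}$ is independent of the distribution of the i.i.d. entries of ${\bf w}$. We propose a novel random features technique called Ternary Random Feature (TRF). Computing the proposed random features requires no multiplication, and storing them takes a factor of $b$ fewer bits than classical random features.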
Reference score (attention level computed by the site): 55.30329197651178
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this article, we investigate the spectral behavior of random features kernel matrices of the type ${\bf K} = \mathbb{E}_{{\bf w}} \left[\sigma\left({\bf w}^{\sf T}{\bf x}_i\right)\sigma\left({\bf w}^{\sf T}{\bf x}_j\right)\right]_{i,j=1}^n$, with nonlinear function $\sigma(\cdot)$, data ${\bf x}_1, \ldots, {\bf x}_n \in \mathbb{R}^p$, and random projection vector ${\bf w} \in \mathbb{R}^p$ having i.i.d. entries. In a high-dimensional setting where the number of data $n$ and their dimension $p$ are both large and comparable, we show, under a Gaussian mixture model for the data, that the eigenspectrum of ${\bf K}$ is independent of the distribution of the i.i.d. (zero-mean and unit-variance) entries of ${\bf w}$, and only depends on $\sigma(\cdot)$ via its (generalized) Gaussian moments $\mathbb{E}_{z\sim \mathcal N(0,1)}[\sigma'(z)]$ and $\mathbb{E}_{z\sim \mathcal N(0,1)}[\sigma''(z)]$. As a result, for any kernel matrix ${\bf K}$ of the form above, we propose a novel random features technique, called Ternary Random Feature (TRF), that (i) asymptotically yields the same limiting kernel as the original ${\bf K}$ in a spectral sense and (ii) can be computed and stored much more efficiently, by wisely tuning (in a data-dependent manner) the function $\sigma$ and the random vector ${\bf w}$, both taking values in $\{-1,0,1\}$. The computation of the proposed random features requires no multiplication, and a factor of $b$ fewer bits for storage compared to classical random features such as random Fourier features, with $b$ the number of bits to store full precision values. Besides, it appears in our experiments on real data that the substantial gains in computation and storage are accompanied by somewhat improved performance compared to state-of-the-art random features compression/quantization methods.
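The abstract describes features whose projection vector and activation both take values in $\{-1,0,1\}$, so that each projection ${\bf w}^{\sf T}{\bf x}$ reduces to additions and subtractions. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the `sparsity` parameter, the thresholds `s_minus` and `s_plus`, and the toy data are all assumptions made for illustration. The paper's data-dependent tuning of $\sigma$ and ${\bf w}$, and any rescaling needed to meet the unit-variance assumption, are not reproduced here.

```python
import numpy as np


def ternary_weights(m, p, sparsity, rng):
    """Draw an (m, p) projection matrix with entries in {-1, 0, 1}.

    `sparsity` is the probability of a zero entry (a hypothetical knob,
    not the paper's tuning rule); +1 and -1 share the remaining mass.
    """
    probs = [(1.0 - sparsity) / 2.0, sparsity, (1.0 - sparsity) / 2.0]
    values = np.array([-1, 0, 1], dtype=np.int8)
    return rng.choice(values, size=(m, p), p=probs)


def project_without_multiplication(W, X):
    """Compute W @ X.T using only additions and subtractions.

    For a ternary row w, w^T x is the sum of the coordinates of x where
    w == +1 minus the sum of the coordinates where w == -1.
    """
    out = np.empty((W.shape[0], X.shape[0]), dtype=X.dtype)
    for k, w in enumerate(W):
        out[k] = X[:, w == 1].sum(axis=1) - X[:, w == -1].sum(axis=1)
    return out


def ternary_activation(t, s_minus, s_plus):
    """Return -1 where t < s_minus, +1 where t > s_plus, and 0 in between."""
    return (t > s_plus).astype(np.int8) - (t < s_minus).astype(np.int8)


def trf_features(X, W, s_minus, s_plus):
    """Ternary features sigma(w_k^T x_i); result has shape (m, n)."""
    projections = project_without_multiplication(W, X)
    return ternary_activation(projections, s_minus, s_plus)


def empirical_kernel(Z):
    """Monte Carlo estimate of K: mean over rows w of sigma(w^T x_i) sigma(w^T x_j)."""
    Z = Z.astype(np.int32)  # widen so integer sums cannot overflow int8
    return (Z.T @ Z) / Z.shape[0]


rng = np.random.default_rng(0)
n, p, m = 50, 20, 2000                        # toy sizes, chosen arbitrarily
X = rng.standard_normal((n, p)) / np.sqrt(p)  # synthetic data, not from the paper
W = ternary_weights(m, p, sparsity=0.5, rng=rng)
Z = trf_features(X, W, s_minus=-0.3, s_plus=0.3)
K_hat = empirical_kernel(Z)
```

Because `Z` holds only values in $\{-1,0,1\}$, it can be stored in a narrow integer type (here `int8`) rather than in full precision, which is the kind of storage saving the abstract refers to. The explicit loop in `project_without_multiplication` is written for clarity rather than speed.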

Related papers

Allocating Variance to Maximize Expectation [2.25491649634702]
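We design efficient approximation algorithms for maximizing the expected supremum of a family of Gaussian random variables. Such expectation problems arise in a variety of applications, from utility auctions to learning mixture models in quantitative genetics.
Paper reference translation (metadata) (2025-02-25T18:59:46Z)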
Sample and Computationally Efficient Robust Learning of Gaussian Single-Index Models [37.42736399673992]
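A single-index model (SIM) is a function of the form $\sigma(\mathbf{w}^{\ast} \cdot \mathbf{x})$, where $\sigma: \mathbb{R} \to \mathbb{R}$ is a known link function and $\mathbf{w}^{\ast}$ is a hidden unit vector. A proper learner attains $L_2$-error of $O(\mathrm{OPT})+\epsilon$.
Paper reference translation (metadata) (2024-11-08T17:10:38Z)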
Provably learning a multi-head attention layer [55.2904547651831]
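The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models. In this work, we initiate the study of provably learning a multi-head attention layer from random examples. We show that, in the worst case, an exponential dependence on $m$ is unavoidable.
Paper reference translation (metadata) (2024-02-06T15:39:09Z)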
A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
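With nonlinear measurements, most prior results are non-uniform: they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously. Our framework covers GCS with 1-bit/uniformly quantized observations and single-index models as canonical examples. We also develop a concentration inequality that yields tighter bounds for product processes whose index sets have low metric entropy.
Paper reference translation (metadata) (2023-09-25T17:54:19Z)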
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
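We study functions of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations. The learner's goal is to output, with high probability, a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w})=C, \epsilon$.
Paper reference translation (metadata) (2022-06-17T17:55:43Z)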
Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns [50.053491972003656]
The functionals $\text{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A\in \mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_* / \sqrt{n})$. Here, we show that $\|\mathbb{E}[R(z)] - \tilde{R}(z)\|_F$ …
Paper reference translation (metadata) (2021-09-06T14:21:43Z)
Kernel Thinning [26.25415159542831]
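Kernel thinning is a new procedure for compressing $\mathbb{P}$ more efficiently than sampling or standard thinning. We derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels.
Paper reference translation (metadata) (2021-05-12T17:56:42Z)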
Two-way kernel matrix puncturing: towards resource-efficient PCA and spectral clustering [43.50783459690612]
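The method randomly "punctures" both the data matrix $X\in\mathbb{C}^{p\times n}$ and its corresponding kernel (Gram) matrix $K$ through Bernoulli masks. We confirm empirically, on GAN-generated image databases, that the data can be drastically punctured, providing huge computational and storage gains.
Paper reference translation (metadata) (2021-02-24T14:01:58Z)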
Convergence of Sparse Variational Inference in Gaussian Processes Regression [29.636483122130027]
We show that a method with computational cost $\mathcal{O}((\log N)^{2D}(\log N)^{2})$ can be used for inference.
Paper reference translation (metadata) (2020-08-01T19:23:34Z)
Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
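Solving optimal transport with entropic regularization requires computing an $n\times n$ kernel matrix that is repeatedly applied to a vector. Instead, we use ground costs of the form $c(x,y)=-\log\langle\varphi(x),\varphi(y)\rangle$, where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$, with $r\ll n$.
Paper reference translation (metadata) (2020-06-12T10:21:40Z)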

The related papers list is generated automatically from the titles and abstracts of papers on this site.
