Fugu-MT 論文翻訳(概要): $O(k)$-Equivariant Dimensionality Reduction on Stiefel Manifolds

論文の概要: $O(k)$-Equivariant Dimensionality Reduction on Stiefel Manifolds

arxiv url: http://arxiv.org/abs/2309.10775v3
Date: Thu, 20 Feb 2025 14:40:53 GMT
ステータス: 翻訳完了
システム内更新日: 2025-02-21 15:38:29.364789
Title: $O(k)$-Equivariant Dimensionality Reduction on Stiefel Manifolds
Title（参考訳）: スティーフェル多様体上の$O(k)$-等変次元性還元
Authors: Andrew Lee, Harlin Lee, Jose A. Perea, Nikolas Schonsheck, Madeleine Weinstein,
Abstract要約: 多くの実世界のデータセットは、高次元のスティーフェル多様体とグラスマン多様体に、それぞれ$V_k(mathbbRN)$と$Gr(k, mathbbRN)$で存在する。我々はtextitPrincipal Stiefel Coordinates (PSC) というアルゴリズムを提案し、データ次元を$V_k(mathbbRN)$から$V_k(mathbbRn)$に減らした。
参考スコア（独自算出の注目度）: 2.0818404738530525
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Many real-world datasets live on high-dimensional Stiefel and Grassmannian manifolds, $V_k(\mathbb{R}^N)$ and $Gr(k, \mathbb{R}^N)$ respectively, and benefit from projection onto lower-dimensional Stiefel and Grassmannian manifolds. In this work, we propose an algorithm called \textit{Principal Stiefel Coordinates (PSC)} to reduce data dimensionality from $ V_k(\mathbb{R}^N)$ to $V_k(\mathbb{R}^n)$ in an \textit{$O(k)$-equivariant} manner ($k \leq n \ll N$). We begin by observing that each element $\alpha \in V_n(\mathbb{R}^N)$ defines an isometric embedding of $V_k(\mathbb{R}^n)$ into $V_k(\mathbb{R}^N)$. Next, we describe two ways of finding a suitable embedding map $\alpha$: one via an extension of principal component analysis ($\alpha_{PCA}$), and one that further minimizes data fit error using gradient descent ($\alpha_{GD}$). Then, we define a continuous and $O(k)$-equivariant map $\pi_\alpha$ that acts as a "closest point operator" to project the data onto the image of $V_k(\mathbb{R}^n)$ in $V_k(\mathbb{R}^N)$ under the embedding determined by $\alpha$, while minimizing distortion. Because this dimensionality reduction is $O(k)$-equivariant, these results extend to Grassmannian manifolds as well. Lastly, we show that $\pi_{\alpha_{PCA}}$ globally minimizes projection error in a noiseless setting, while $\pi_{\alpha_{GD}}$ achieves a meaningfully different and improved outcome when the data does not lie exactly on the image of a linearly embedded lower-dimensional Stiefel manifold as above. Multiple numerical experiments using synthetic and real-world data are performed.
Abstract（参考訳）: 多くの実世界のデータセットは、高次元のスティーフェル多様体とグラスマン多様体、それぞれ$V_k(\mathbb{R}^N)$と$Gr(k, \mathbb{R}^N)$の上にあり、低次元のスティーフェル多様体とグラスマン多様体への射影の恩恵を受ける。本研究では、データ次元を$V_k(\mathbb{R}^N)$から$V_k(\mathbb{R}^n)$へ還元するアルゴリズムを、 \textit{$O(k)$-equivariant} manner(k \leq n \ll N$)で提案する。まず、各元 $\alpha \in V_n(\mathbb{R}^N)$ が $V_k(\mathbb{R}^n)$ の等尺埋め込みを $V_k(\mathbb{R}^N)$ に定義することから始める。次に、適切な埋め込みマップを$\alpha$: 主成分分析の拡張($\alpha_{PCA}$)と、勾配降下($\alpha_{GD}$)を用いてデータ適合誤差をさらに最小化する2つの方法を説明する。次に、連続かつ$O(k)$-equivariant map $\pi_\alpha$ を「最も近い点演算子」として作用させ、そのデータを $V_k(\mathbb{R}^n)$ in $V_k(\mathbb{R}^N)$ の像に投影する。この次元還元は$O(k)$-同変であるため、これらの結果はグラスマン多様体にも拡張される。最後に、$\pi_{\alpha_{PCA}}$はノイズのない環境での射影誤差を大域的に最小化するのに対し、$\pi_{\alpha_{GD}}$は、データが直線的に埋め込まれた低次元スティーフェル多様体の像に正確には属さない場合に有意に異なる改善された結果が得られることを示す。合成および実世界のデータを用いた複数の数値実験を行う。

関連論文リスト

Conditional regression for the Nonlinear Single-Variable Model [4.565636963872865]
F(X):=f(Pi_gamma):mathbbRdto[0,rmlen_gamma]$ ここで$Pi_gamma: [0,rmlen_gamma]tomathbbRd$と$f:[0,rmlen_gamma]tomathbbR1$を考える。条件回帰に基づく非パラメトリック推定器を提案し、$one$-dimensionalOptimical min-maxレートを実現できることを示す。
論文参考訳（メタデータ） (2024-11-14T18:53:51Z)
Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
線形スケッチを用いた行列とベクトルノルムの残差誤差推定問題について検討する。これは、前作とほぼ同じスケッチサイズと精度で、経験的にかなり有利であることを示す。また、スパースリカバリ問題に対して$Omega(k2/pn1-2/p)$低いバウンダリを示し、これは$mathrmpoly(log n)$ factorまで厳密である。
論文参考訳（メタデータ） (2024-08-16T02:33:07Z)
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
単一インデックス対象関数 $f_*(boldsymbolx) = textstylesigma_*left(langleboldsymbolx,boldsymbolthetarangleright)$ の等方的ガウスデータの下で勾配降下学習の問題を考察する。 SGDアルゴリズムで最適化された2層ニューラルネットワークは、サンプル付き任意のリンク関数の$f_*$を学習し、実行時の複雑さは$n asymp T asymp C(q) cdot dであることを示す。
論文参考訳（メタデータ） (2024-06-03T17:56:58Z)
Provably learning a multi-head attention layer [55.2904547651831]
マルチヘッドアテンション層は、従来のフィードフォワードモデルとは分離したトランスフォーマーアーキテクチャの重要な構成要素の1つである。本研究では,ランダムな例から多面的注意層を実証的に学習する研究を開始する。最悪の場合、$m$に対する指数的依存は避けられないことを示す。
論文参考訳（メタデータ） (2024-02-06T15:39:09Z)
Geometric structure of shallow neural networks and constructive ${\mathcal L}^2$ cost minimization [1.189367612437469]
隠れた1つの層を持つ浅層ニューラルネットワーク、ReLUアクティベーション関数、$mathcal L2$ Schattenクラス(Hilbert-Schmidt)のコスト関数を考える。我々は、$O(delta_P)$のコスト関数の最小値に対して、$delta_P$の信号とトレーニング入力のノイズ比を測る上限を証明した。特別の場合、$M=Q$ において、コスト関数の正確な退化局所極小を明示的に決定し、そのシャープ値が a の$Qleq M$ に対して得られる上限値と異なることを示す。
論文参考訳（メタデータ） (2023-09-19T07:12:41Z)
An Over-parameterized Exponential Regression [18.57735939471469]
LLM(Large Language Models)の分野での最近の発展は、指数的アクティベーション関数の使用への関心を喚起している。ニューラル関数 $F: mathbbRd times m times mathbbRd times mathbbRd times mathbbRd times mathbbRd times mathbbRd times mathbbRd times mathbbRd times mathbbRdd
論文参考訳（メタデータ） (2023-03-29T07:29:07Z)
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
モノトン活性化に対する $mathbfxmapstosigma(mathbfwcdotmathbfx)$ の関数について検討する。学習者の目標は仮説ベクトル $mathbfw$ that $F(mathbbw)=C, epsilon$ を高い確率で出力することである。
論文参考訳（メタデータ） (2022-06-17T17:55:43Z)
Deep Learning in High Dimension: Neural Network Approximation of Analytic Functions in $L^2(\mathbb{R}^d,\gamma_d)$ [0.0]
解析関数 $f:mathbbRdtomathbbR$ の式率を $L2(mathbbRd,gamma_d)$ のノルムで証明する。特に、整数 $kgeq 2$ に対する ReLU と ReLU$k$ のアクティベーションを考える。対数ガウス確率場入力による楕円型PDEの応答面に対する深いReLU-NNの表現速度境界を証明した。
論文参考訳（メタデータ） (2021-11-13T09:54:32Z)
Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
我々は、$bf K$ の固有スペクトルが$bf w$ の i.d. 成分の分布とは独立であることを示す。 3次ランダム特徴(TRF)と呼ばれる新しいランダム手法を提案する。提案したランダムな特徴の計算には乗算が不要であり、古典的なランダムな特徴に比べてストレージに$b$のコストがかかる。
論文参考訳（メタデータ） (2021-10-05T09:33:49Z)
Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models [56.98280399449707]
我々は、s$ of cardinality $m = (k/epsilon)o_d(k1/d)$ に対して $epsilon$-cover が存在することを示す。構造的結果に基づいて,いくつかの基本的高次元確率モデル隠れ変数の学習アルゴリズムを改良した。
論文参考訳（メタデータ） (2020-12-14T18:14:08Z)
A Canonical Transform for Strengthening the Local $L^p$-Type Universal Approximation Property [4.18804572788063]
任意の機械学習モデルクラス $mathscrFsubseteq C(mathbbRd,mathbbRD)$ が $Lp_mu(mathbbRd,mathbbRD)$ で密であることを保証する。本稿では、「$mathscrF$'s approximation property」という正準変換を導入することにより、この近似理論問題に対する一般的な解を提案する。
論文参考訳（メタデータ） (2020-06-24T17:46:35Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。