Fugu-MT Paper Translation (Abstract): Universality of max-margin classifiers

Paper summary: Universality of max-margin classifiers

arxiv url: http://arxiv.org/abs/2310.00176v1
Date: Fri, 29 Sep 2023 22:45:56 GMT
Status: Translation complete
Last updated in system: 2023-10-05 05:59:46.299539
Title: Universality of max-margin classifiers
Title (reference translation): Universality of max-margin classifiers
Authors: Andrea Montanari, Feng Ruan, Basil Saeed, Youngtak Sohn
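Abstract summary: We study the high-dimensional universality of the misclassification error for non-Gaussian features and the role of featurization maps. In particular, the overparametrization threshold and generalization error can be computed within a simpler model.
Reference score (independently computed attention score): 10.797131009370219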
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Maximum margin binary classification is one of the most fundamental algorithms in machine learning, yet the role of featurization maps and the high-dimensional asymptotics of the misclassification error for non-Gaussian features are still poorly understood. We consider settings in which we observe binary labels $y_i$ and either $d$-dimensional covariates ${\boldsymbol z}_i$ that are mapped to a $p$-dimension space via a randomized featurization map ${\boldsymbol \phi}:\mathbb{R}^d \to\mathbb{R}^p$, or $p$-dimensional features of non-Gaussian independent entries. In this context, we study two fundamental questions: $(i)$ At what overparametrization ratio $p/n$ do the data become linearly separable? $(ii)$ What is the generalization error of the max-margin classifier? Working in the high-dimensional regime in which the number of features $p$, the number of samples $n$ and the input dimension $d$ (in the nonlinear featurization setting) diverge, with ratios of order one, we prove a universality result establishing that the asymptotic behavior is completely determined by the expected covariance of feature vectors and by the covariance between features and labels. In particular, the overparametrization threshold and generalization error can be computed within a simpler Gaussian model. The main technical challenge lies in the fact that max-margin is not the maximizer (or minimizer) of an empirical average, but the maximizer of a minimum over the samples. We address this by representing the classifier as an average over support vectors. Crucially, we find that in high dimensions, the support vector count is proportional to the number of samples, which ultimately yields universality.
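Abstract (reference translation): Maximum margin binary classification is one of the most fundamental algorithms in machine learning, yet the high-dimensional asymptotics of the misclassification error for non-Gaussian features are still poorly understood. We consider settings in which we observe binary labels $y_i$ and either $d$-dimensional covariates ${\boldsymbol z}_i$ that are mapped to a $p$-dimensional space via a randomized featurization map ${\boldsymbol \phi}:\mathbb{R}^d \to\mathbb{R}^p$, or $p$-dimensional features with non-Gaussian independent entries. In this context, we study two fundamental questions: $(i)$ At what overparametrization ratio $p/n$ do the data become linearly separable? $(ii)$ What is the generalization error of the max-margin classifier? In the high-dimensional regime in which the number of features $p$, the number of samples $n$ and the input dimension $d$ (in the nonlinear featurization setting) diverge with ratios of order one, a universality result is proved showing that the asymptotic behavior is completely determined by the expected covariance of the feature vectors and by the covariance between features and labels. In particular, the overparametrization threshold and generalization error can be computed within a simpler Gaussian model. The main technical challenge lies in the fact that max-margin is not the maximizer (or minimizer) of an empirical average, but the maximizer of a minimum over the samples. We address this by representing the classifier as an average over support vectors. Crucially, in high dimensions the number of support vectors is proportional to the number of samples, which ultimately yields universality.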
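As a rough illustration of question $(i)$ and of the universality claim, the sketch below draws features with i.i.d. Gaussian or i.i.d. Rademacher entries, attaches random labels, and checks linear separability at $p/n = 2$. It is a minimal sketch under assumptions that are not taken from the paper: the labels are pure noise, the sizes and seed are arbitrary, and a perceptron is swapped in for a max-margin solver, so the reported margin is only a lower bound on the max-margin value. All function names are hypothetical.

```python
import math
import random


def sample_features(n, p, kind, rng):
    """Draw an n x p matrix with i.i.d. unit-variance entries (assumed model)."""
    if kind == "gaussian":
        return [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    if kind == "rademacher":
        return [[rng.choice((-1.0, 1.0)) for _ in range(p)] for _ in range(n)]
    raise ValueError(f"unknown feature kind: {kind}")


def perceptron(X, y, max_epochs=10_000):
    """Return a separating direction theta, or None if none is found in time.

    The perceptron is NOT the max-margin classifier; it only certifies
    separability when it converges.
    """
    theta = [0.0] * len(X[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            score = sum(t * v for t, v in zip(theta, x_i))
            if y_i * score <= 0.0:
                # Standard perceptron update on a misclassified sample.
                theta = [t + y_i * v for t, v in zip(theta, x_i)]
                mistakes += 1
        if mistakes == 0:
            return theta
    return None


def min_margin(theta, X, y):
    """Compute min_i y_i <theta, x_i> / ||theta||_2, the quantity max-margin maximizes."""
    norm = math.sqrt(sum(t * t for t in theta))
    scores = (y_i * sum(t * v for t, v in zip(theta, x_i)) for x_i, y_i in zip(X, y))
    return min(scores) / norm


def run_demo(n=20, p=40, seed=0):
    """Check separability for both feature distributions with the same labels."""
    rng = random.Random(seed)
    y = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # pure-noise labels (assumption)
    margins = {}
    for kind in ("gaussian", "rademacher"):
        X = sample_features(n, p, kind, rng)
        theta = perceptron(X, y)
        margins[kind] = None if theta is None else min_margin(theta, X, y)
    return margins


if __name__ == "__main__":
    for kind, margin in run_demo().items():
        print(kind, margin)
```

With $p \ge n$ and linearly independent feature vectors, any labeling is separable, so both runs are expected to find a separating direction. Locating the separability threshold itself would require sweeping $p/n$ through smaller values with an exact separability test (for example a linear program) rather than the perceptron used here.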

Related papers

Entangled Mean Estimation in High-Dimensions [36.97113089188035]
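We study the task of high-dimensional entangled mean estimation in the subset-of-signals model. The optimal error (up to polylogarithmic factors) is $f(\alpha,N) + \sqrt{D/(\alpha N)}$, where $f(\alpha,N)$ is the error of the one-dimensional problem and the second term is the sub-Gaussian error rate.
Paper reference translation (metadata) (2025-01-09T18:31:35Z)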
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
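We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \textstyle\sigma_*\left(\langle\boldsymbol{x},\boldsymbol{\theta}\rangle\right)$. A two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by the information exponent.
Paper reference translation (metadata) (2024-06-03T17:56:58Z)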
Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
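Single-Index Models are high-dimensional regression problems with planted structure. We show that, in both the Statistical Query (SQ) and Low-Degree Polynomial (LDP) frameworks, computationally efficient algorithms necessarily require $\Omega(d^{k^\star/2})$ samples.
Paper reference translation (metadata) (2024-03-08T18:50:19Z)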
Repeated Observations for Classification [0.2676349883103404]
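We study the problem of nonparametric classification with repeated observations. The analysis covers models such as robust detection by nominal densities, prototype classification, linear transformation, linear classification, and scaling.
Paper reference translation (metadata) (2023-07-19T10:50:36Z)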
Dimension free ridge regression [10.434481202633458]
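We revisit ridge regression on i.i.d. data and characterize its bias and variance in terms of the bias and variance of an equivalent sequence model. As a new application, we obtain a completely explicit and sharp characterization of ridge regression for Hilbert covariates with regularly varying spectrum.
Paper reference translation (metadata) (2022-10-16T16:01:05Z)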
Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
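We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$. We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate and, with probability at least $1-\delta$, returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
Paper reference translation (metadata) (2022-03-18T18:50:52Z)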
Universality of empirical risk minimization [12.764655736673749]
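Consider supervised learning from i.i.d. samples, where $\boldsymbol{x}_i \in \mathbb{R}^p$ are feature vectors and $y \in \mathbb{R}$ are labels. We study the universality of empirical risk minimization over a class of functions parameterized by $\mathsf{k}$.
Paper reference translation (metadata) (2022-02-17T18:53:45Z)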
Classification of high-dimensional data with spiked covariance matrix structure [0.2741266294612775]
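We study the classification problem for high-dimensional data with $n$ observations on $p$ features. We first propose an adaptive classifier that performs dimension reduction on the feature vectors prior to classification in the dimensionally reduced space. The resulting classifier is shown to be Bayes optimal when $n \rightarrow \infty$ and $s \sqrt{n^{-1} \ln p} \rightarrow 0$.
Paper reference translation (metadata) (2021-10-05T11:26:53Z)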
Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
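We show that the eigenspectrum of ${\bf K}$ is independent of the distribution of the i.i.d. entries of ${\bf w}$. We propose a new random features technique called Ternary Random Features (TRF). Computing the proposed random features requires no multiplication and a factor of $b$ less storage than classical random features.
Paper reference translation (metadata) (2021-10-05T09:33:49Z)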
Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns [50.053491972003656]
The functionals $\text{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A \in \mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_* / \sqrt{n})$. Here, a result on $\|\mathbb{E}[R(z)] - \tilde{R}(z)\|_F$ is shown.
Paper reference translation (metadata) (2021-09-06T14:21:43Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$. (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
Paper reference translation (metadata) (2020-07-16T06:44:44Z)
Consistent Structured Prediction with Max-Min Margin Markov Networks [84.60515484036239]
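Max-margin methods for binary classification have been extended to the structured prediction setting under the name of max-margin Markov networks ($M^3N$). We overcome such limitations by defining the learning problem in terms of a "max-min" margin formulation, naming the resulting method max-min margin Markov networks ($M^4N$). Experiments on multi-class classification, ordinal regression, sequence prediction and ranking demonstrate the effectiveness of the proposed method.
Paper reference translation (metadata) (2020-07-02T10:48:42Z)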
The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime [11.252856459394854]
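Modern machine learning classifiers often exhibit vanishing classification error on the training set. Motivated by these phenomena, we revisit high-dimensional maximum margin classification for linearly separable data.
Paper reference translation (metadata) (2019-11-05T00:15:27Z)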

The related papers list is generated automatically from the titles and abstracts of papers on this site.
