Fugu-MT Paper Translation (Abstract): The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime

Paper overview: The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime

arxiv url: http://arxiv.org/abs/1911.01544v3
Date: Wed, 22 Mar 2023 16:53:25 GMT
Status: Translation complete
Last updated in system: 2023-03-24 08:50:00.748729
Title: The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime
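Title (reference translation): The generalization error of max-margin linear classifiers: benign overfitting and high-dimensional asymptotics in the overparametrized regime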
Authors: Andrea Montanari, Feng Ruan, Youngtak Sohn, Jun Yan
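Abstract summary: Modern machine learning classifiers often exhibit vanishing classification error on the training set. Motivated by these phenomena, we revisit high-dimensional max-margin classification for linearly separable data.
Reference score (independently computed attention score): 11.252856459394854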
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modern machine learning classifiers often exhibit vanishing classification error on the training set. They achieve this by learning nonlinear representations of the inputs that map the data into linearly separable classes. Motivated by these phenomena, we revisit high-dimensional maximum margin classification for linearly separable data. We consider a stylized setting in which data $(y_i,{\boldsymbol x}_i)$, $i\le n$ are i.i.d. with ${\boldsymbol x}_i\sim\mathsf{N}({\boldsymbol 0},{\boldsymbol \Sigma})$ a $p$-dimensional Gaussian feature vector, and $y_i \in\{+1,-1\}$ a label whose distribution depends on a linear combination of the covariates $\langle {\boldsymbol \theta}_*,{\boldsymbol x}_i \rangle$. While the Gaussian model might appear extremely simplistic, universality arguments can be used to show that the results derived in this setting also apply to the output of certain nonlinear featurization maps. We consider the proportional asymptotics $n,p\to\infty$ with $p/n\to \psi$, and derive exact expressions for the limiting generalization error. We use this theory to derive two results of independent interest: $(i)$ Sufficient conditions on $({\boldsymbol \Sigma},{\boldsymbol \theta}_*)$ for `benign overfitting' that parallel previously derived conditions in the case of linear regression; $(ii)$ An asymptotically exact expression for the generalization error when max-margin classification is used in conjunction with feature vectors produced by random one-layer neural networks.
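Abstract (reference translation): Modern machine learning classifiers often exhibit vanishing classification error on the training set. They achieve this by learning nonlinear representations of the inputs that map the data into linearly separable classes. Motivated by these phenomena, we revisit high-dimensional max-margin classification for linearly separable data. We consider a stylized setting in which the data $(y_i,{\boldsymbol x}_i)$, $i\le n$ are i.i.d., with ${\boldsymbol x}_i\sim\mathsf{N}({\boldsymbol 0},{\boldsymbol \Sigma})$ a $p$-dimensional Gaussian feature vector and $y_i \in\{+1,-1\}$ a label whose distribution depends on $\langle {\boldsymbol \theta}_*,{\boldsymbol x}_i \rangle$. Although the Gaussian model may appear extremely simple, universality arguments can be used to show that results derived in this setting also apply to the output of certain nonlinear featurization maps. We consider the proportional asymptotics $n,p\to\infty$ with $p/n\to \psi$ and derive exact expressions for the limiting generalization error. We use this theory to derive two results of independent interest: $(i)$ conditions on $({\boldsymbol \Sigma},{\boldsymbol \theta}_*)$ for 'benign overfitting' that parallel conditions previously derived for linear regression; $(ii)$ an asymptotically exact expression for the generalization error when max-margin classification is used with feature vectors produced by random one-layer neural networks.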
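
The stylized setting described in the abstract can be explored numerically at finite size. The following is a minimal sketch, not the authors' method. It assumes $\boldsymbol\Sigma = I_p$, a logistic link for the labels, and a hypothetical $\boldsymbol\theta_*$, none of which the abstract specifies. It also approximates the max-margin direction by gradient descent on the logistic loss (whose normalized iterates converge to the max-margin direction on separable data) instead of solving the hard-margin problem exactly. The function names `sample_data`, `approx_max_margin_direction`, and `error_rate` are illustrative.

```python
import numpy as np


def sample_data(n, theta_star, rng):
    """Draw n i.i.d. pairs (y_i, x_i) with x_i ~ N(0, I_p).

    Assumptions (not from the abstract): Sigma = I_p and a logistic link,
    P(y = +1 | x) = 1 / (1 + exp(-<theta_star, x>)).
    """
    p = theta_star.shape[0]
    X = rng.standard_normal((n, p))
    prob_pos = 1.0 / (1.0 + np.exp(-X @ theta_star))
    y = np.where(rng.random(n) < prob_pos, 1.0, -1.0)
    return X, y


def approx_max_margin_direction(X, y, lr=0.2, steps=20000):
    """Approximate the max-margin direction via gradient descent on logistic loss.

    This is a stand-in for an exact hard-margin solver: on linearly separable
    data the normalized iterate approaches the max-margin direction, but slowly.
    """
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(steps):
        margins = y * (X @ w)
        # Derivative of log(1 + exp(-m)) is -1 / (1 + exp(m)); clip for stability.
        weights = 1.0 / (1.0 + np.exp(np.clip(margins, -50.0, 50.0)))
        grad = -(X.T @ (weights * y)) / n
        w -= lr * grad
    return w / np.linalg.norm(w)


def error_rate(w, X, y):
    """Fraction of points where sign(<w, x>) disagrees with the label."""
    return float(np.mean(np.sign(X @ w) != y))


rng = np.random.default_rng(0)
n, p = 50, 200                      # psi = p / n = 4, the overparametrized side
theta_star = np.zeros(p)
theta_star[0] = 3.0                 # hypothetical signal direction and strength

X_train, y_train = sample_data(n, theta_star, rng)
X_test, y_test = sample_data(5000, theta_star, rng)

w_hat = approx_max_margin_direction(X_train, y_train)
train_error = error_rate(w_hat, X_train, y_train)
test_error = error_rate(w_hat, X_test, y_test)
min_margin = float(np.min(y_train * (X_train @ w_hat)))

print(f"psi={p / n:.1f} train_error={train_error:.3f} "
      f"test_error={test_error:.3f} min_margin={min_margin:.3f}")
```

With $p > n$ and Gaussian features the training set is linearly separable, so the fitted direction classifies every training point correctly even though the labels are noisy. Rerunning with different values of `p / n` gives a rough finite-sample view of the regime that the paper characterizes exactly in the proportional limit.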

Related papers

Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
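We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*\left(\langle \boldsymbol{x}, \boldsymbol{\theta} \rangle\right)$. A two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by the information exponent.
Reference translation of the paper (metadata) (2024-06-03T17:56:58Z)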
Near-Interpolators: Rapid Norm Growth and the Trade-Off between Interpolation and Generalization [28.02367842438021]
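We study the generalization capability of nearly interpolating linear regressors. For $\tau$ fixed, $\boldsymbol{\beta}$ has squared $\ell_2$-norm $\mathbb{E}[\|\boldsymbol{\beta}\|_2^2]$ ... We demonstrate empirically that a similar phenomenon appears in nearly interpolating shallow neural networks.
Reference translation of the paper (metadata) (2024-03-12T02:47:00Z)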
Universality of max-margin classifiers [10.797131009370219]
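We study the high-dimensional universality of the misclassification error for non-Gaussian features and the role of featurization maps. In particular, the overparametrization threshold and the generalization error can be computed within a simpler model.
Reference translation of the paper (metadata) (2023-09-29T22:45:56Z)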
A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
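For nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously. Our framework accommodates GCS with 1-bit/uniformly quantized observations and single-index models as canonical examples. We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
Reference translation of the paper (metadata) (2023-09-25T17:54:19Z)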
Repeated Observations for Classification [0.2676349883103404]
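We study the problem of nonparametric classification with repeated observations. Our analysis covers models such as robust detection by nominal densities, prototype classification, linear transformation, linear classification, and scaling.
Reference translation of the paper (metadata) (2023-07-19T10:50:36Z)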
Dimension free ridge regression [10.434481202633458]
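We revisit ridge regression on data, characterizing its bias and variance in terms of the bias and variance of an equivalent sequence model. As a new application, we obtain a completely explicit and sharp characterization of ridge regression for Hilbert covariates with regularly varying spectrum.
Reference translation of the paper (metadata) (2022-10-16T16:01:05Z)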
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation [89.21686761957383]
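We study a gradient descent step on the first-layer parameters $\boldsymbol{W}$ of a two-layer network. Our results show that even a single step can yield a considerable advantage over random features.
Reference translation of the paper (metadata) (2022-05-03T12:09:59Z)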
$p$-Generalized Probit Regression and Scalable Maximum Likelihood Estimation via Sketching and Coresets [74.37849422071206]
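We study the $p$-generalized probit regression model, a generalized linear model for binary responses. We show that the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data.
Reference translation of the paper (metadata) (2022-03-25T10:54:41Z)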
Universality of empirical risk minimization [12.764655736673749]
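Consider, for example, supervised learning from i.i.d. samples in which $\boldsymbol{x}_i \in \mathbb{R}^p$ are feature vectors and $y \in \mathbb{R}$ are labels. We study the universality of empirical risk minimization over a class of functions parameterized by $\mathsf{k}$ ...
Reference translation of the paper (metadata) (2022-02-17T18:53:45Z)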
Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
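We show that the eigenspectrum of ${\bf K}$ is independent of the distribution of the i.i.d. entries of ${\bf w}$. We propose a new random features technique called Ternary Random Features (TRF). Computing the proposed random features requires no multiplication and a factor of $b$ fewer bits of storage than classical random features.
Reference translation of the paper (metadata) (2021-10-05T09:33:49Z)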
Self-training Converts Weak Learners to Strong Learners in Mixture Models [86.7137362125503]
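We show that a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ can achieve classification error at most $C_{\mathrm{err}}$. Furthermore, we show that running gradient descent on the logistic loss yields a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ with classification error $C_{\mathrm{err}}$ using only labeled examples.
Reference translation of the paper (metadata) (2021-06-25T17:59:16Z)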
Optimal Combination of Linear and Spectral Estimators for Generalized Linear Models [59.015960528781115]
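We show how to optimally combine $\hat{\boldsymbol x}^{\rm L}$ and $\hat{\boldsymbol x}^{\rm s}$. To establish the limiting distribution of $(\boldsymbol x, \hat{\boldsymbol x}^{\rm L}, \hat{\boldsymbol x}^{\rm s})$, we design and analyze an Approximate Message Passing (AMP) algorithm.
Reference translation of the paper (metadata) (2020-08-07T18:20:05Z)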

The related-papers list is generated automatically from the titles and abstracts of papers on this site.

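This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any outcome arising from its use.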