Fugu-MT paper translation (abstract): Asymptotic Characterisation of Robust Empirical Risk Minimisation Performance in the Presence of Outliers

Paper abstract: Asymptotic Characterisation of Robust Empirical Risk Minimisation Performance in the Presence of Outliers

arxiv url: http://arxiv.org/abs/2305.18974v2
Date: Wed, 27 Sep 2023 09:50:48 GMT
Status: translation complete
Last updated in system: 2023-09-28 19:21:35.765335
Title: Asymptotic Characterisation of Robust Empirical Risk Minimisation Performance in the Presence of Outliers
Title (reference translation): Asymptotic characterisation of robust empirical risk minimisation performance in the presence of outliers
Authors: Matteo Vilucchio, Emanuele Troiani, Vittorio Erba, Florent Krzakala
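Abstract summary: We study linear regression in high dimensions, where the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $\alpha=n/d$, and consider a data model that includes outliers. We provide exact asymptotics for the performance of empirical risk minimisation (ERM) using $\ell_2$-regularised $\ell_2$, $\ell_1$, and Huber losses.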
Reference score (independently computed attention score): 18.455890316339595
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study robust linear regression in high dimensions, when both the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $\alpha=n/d$, and study a data model that includes outliers. We provide exact asymptotics for the performance of empirical risk minimisation (ERM) using $\ell_2$-regularised $\ell_2$, $\ell_1$, and Huber losses, which are the standard approach to such problems. We focus on two metrics for the performance: the generalisation error to similar datasets with outliers, and the estimation error of the original, unpolluted function. Our results are compared with the information-theoretic Bayes-optimal estimation bound. For the generalisation error, we find that optimally-regularised ERM is asymptotically consistent in the large sample complexity limit if one performs a simple calibration, and compute the rates of convergence. For the estimation error, however, we show that due to a norm calibration mismatch, the consistency of the estimator requires an oracle estimate of the optimal norm, or the presence of a cross-validation set not corrupted by the outliers. We examine in detail how performance depends on the loss function and on the degree of outlier corruption in the training set, and identify a region of parameters where the optimal performance of the Huber loss is identical to that of the $\ell_2$ loss, offering insights into the use cases of different loss functions.
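Abstract (reference translation): We study robust linear regression in high dimensions when both the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $\alpha=n/d$, and examine a data model that includes outliers. Using $\ell_2$-regularised $\ell_2$, $\ell_1$, and Huber losses, the standard approach to such problems, we provide exact asymptotics for the performance of empirical risk minimisation (ERM). We focus on two performance metrics: the generalisation error on similar datasets with outliers, and the estimation error of the original, unpolluted function. The results are compared with the information-theoretic Bayes-optimal estimation bound. For the generalisation error, optimally regularised ERM turns out to be asymptotically consistent in the large sample complexity limit once a simple calibration is performed, and the rates of convergence are computed. For the estimation error, however, a norm calibration mismatch means that consistency of the estimator requires an oracle estimate of the optimal norm, or a cross-validation set not corrupted by outliers. We examine in detail how performance depends on the loss function and on the degree of outlier corruption in the training set, and identify a region of parameters where the optimal performance of the Huber loss is identical to that of the $\ell_2$ loss, offering insight into the use cases of different loss functions.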
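The comparison described in the abstract can be illustrated with a small finite-size experiment. The sketch below is not the paper's asymptotic analysis or its exact data model: it assumes a simplified Gaussian setting in which a fraction `eps` of the labels simply receives much larger noise, and every name in it (`make_data`, `fit_ridge`, `fit_huber`, `estimation_error`, and all parameter values) is hypothetical. It fits $\ell_2$-regularised ERM with the square loss and with the Huber loss, then reports the estimation error of each against the clean teacher vector.

```python
import numpy as np

def make_data(n, d, eps, sigma_in, sigma_out, rng):
    """Linear teacher with heteroscedastic noise (an assumed toy outlier model)."""
    w_star = rng.standard_normal(d)
    X = rng.standard_normal((n, d)) / np.sqrt(d)
    is_outlier = rng.random(n) < eps
    sigma = np.where(is_outlier, sigma_out, sigma_in)
    y = X @ w_star + sigma * rng.standard_normal(n)
    return X, y, w_star

def fit_ridge(X, y, lam):
    """Closed-form minimiser of 0.5*||y - Xw||^2 + 0.5*lam*||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def fit_huber(X, y, lam, delta=1.0, n_iter=2000):
    """Gradient descent on sum_i huber_delta(y_i - x_i.w) + 0.5*lam*||w||^2."""
    d = X.shape[1]
    # The Huber loss has curvature at most 1, so the gradient of the objective
    # is Lipschitz with constant at most ||X||_2^2 + lam.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + lam)
    w = np.zeros(d)
    for _ in range(n_iter):
        residual = y - X @ w
        # Derivative of the Huber loss w.r.t. the residual is a clipped residual.
        psi = np.clip(residual, -delta, delta)
        w -= step * (-X.T @ psi + lam * w)
    return w

def estimation_error(w_hat, w_star):
    """Mean squared distance to the clean teacher vector."""
    return float(np.mean((w_hat - w_star) ** 2))

rng = np.random.default_rng(0)
n, d = 500, 50  # sample complexity alpha = n/d = 10
X, y, w_star = make_data(n, d, eps=0.1, sigma_in=0.1, sigma_out=5.0, rng=rng)
lam = 0.1  # fixed by hand here, not optimally tuned as in the paper

err_ridge = estimation_error(fit_ridge(X, y, lam), w_star)
err_huber = estimation_error(fit_huber(X, y, lam), w_star)
print(f"estimation error, square loss: {err_ridge:.4f}")
print(f"estimation error, Huber loss:  {err_huber:.4f}")
```

In this toy setting the outliers differ from the inliers only in noise variance, so the norm calibration issue highlighted in the abstract does not show up; reproducing it would require an outlier model that also rescales the signal.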

Related papers

Distributionally Robust Optimization with Adversarial Data Contamination [49.89480853499918]
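We focus on optimizing the Wasserstein-1 DRO objective for generalized linear models with convex Lipschitz loss functions. Our main contribution is a new modeling framework that unifies robustness to training data contamination with robustness to distribution shift. This work establishes the first rigorous guarantees, supported by efficient computation, for learning under the two challenges of data contamination and distribution shift.
Paper reference translation (metadata) (2025-07-14T18:34:10Z)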
Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget [55.938644481736446]
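We propose a new algorithm for best arm identification that guarantees exponential decay of the error probability. We validate the algorithm's effectiveness through a comprehensive empirical evaluation on problem instances of varying complexity.
Paper reference translation (metadata) (2025-06-03T02:56:26Z)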
Semiparametric conformal prediction [79.6147286161434]
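We construct conformal prediction sets that account for the joint correlation structure of vector-valued non-conformity scores. We flexibly estimate the cumulative distribution function (CDF) of the scores. The proposed method yields the desired coverage and competitive efficiency on real-world regression problems.
Paper reference translation (metadata) (2024-11-04T14:29:02Z)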
Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
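We provide the first proof of convergence for normalized error feedback algorithms across a broad range of machine learning problems. Because the proposed method allows larger step sizes, the new normalized error feedback algorithms outperform their non-normalized counterparts on a variety of tasks.
Paper reference translation (metadata) (2024-10-22T10:19:27Z)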
A Statistical Theory of Regularization-Based Continual Learning [10.899175512941053]
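We provide a statistical analysis of regularization-based continual learning on a sequence of linear regression tasks. We first derive the convergence rate of the oracle estimator obtained as if all data were available simultaneously. A byproduct of our theoretical analysis is the equivalence between early stopping and generalized $\ell$-regularization.
Paper reference translation (metadata) (2024-06-10T12:25:13Z)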
Orthogonal Causal Calibration [55.28164682911196]
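We prove generic upper bounds on the calibration error of any causal parameter $\theta$ with respect to any loss $\ell$. We use these bounds to analyze the convergence of two sample-splitting algorithms for causal calibration.
Paper reference translation (metadata) (2024-06-04T03:35:25Z)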
Optimal convex $M$-estimation via score matching [6.115859302936817]
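We construct a data-driven convex loss function for which empirical risk minimization yields the optimal variance in the downstream estimation of the regression coefficients. Our semiparametric approach targets the best decreasing approximation of the derivative of the log-density of the noise distribution.
Paper reference translation (metadata) (2024-03-25T12:23:19Z)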
On the Performance of Empirical Risk Minimization with Smoothed Data [59.3428024282545]
Empirical Risk Minimization (ERM) can achieve sublinear error whenever a class is learnable with iid data.
Paper reference translation (metadata) (2024-02-22T21:55:41Z)
The Adaptive $τ$-Lasso: Robustness and Oracle Properties [12.06248959194646]
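This paper introduces a regularized version of the robust $\tau$-regression estimator for analyzing high-dimensional datasets. The resulting estimator, called the adaptive $\tau$-Lasso, is robust to outliers and high-leverage points. In the presence of outliers and high-leverage points, the adaptive $\tau$-Lasso and $\tau$-Lasso estimators achieve the best or close-to-best performance.
Paper reference translation (metadata) (2023-04-18T21:34:14Z)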
A Huber loss-based super learner with applications to healthcare expenditures [0.0]
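We propose a super learner based on the Huber loss, a "robust" loss function that combines squared-error loss with absolute loss. We show that the proposed method can be used both to optimize the Huber risk directly and in finite-sample settings.
Paper reference translation (metadata) (2022-05-13T19:57:50Z)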
On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
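We study the properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD). We derive precise non-asymptotic error bounds for RF regression under both constant and adaptive step-size SGD settings. We observe the double descent phenomenon both theoretically and empirically.
Paper reference translation (metadata) (2021-10-13T17:47:39Z)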
Robust Algorithms for GMM Estimation: A Finite Sample Viewpoint [30.839245814393724]
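For the generalized method of moments (GMM), we develop a GMM estimator with an $\ell$ recovery guarantee of $O(\sqrt{\epsilon})$. Our algorithm and assumptions apply to instrumental variables linear regression and logistic regression.
Paper reference translation (metadata) (2021-10-06T21:06:22Z)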
SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
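We develop improved methods for debiasing predictions and estimating frequentist uncertainty for practical datasets. Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
Paper reference translation (metadata) (2021-03-23T17:48:56Z)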
Evaluating representations by the complexity of learning low-loss predictors [55.94170724668857]
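We consider the problem of evaluating representations of data for use in solving a downstream task. We propose to measure the quality of a representation by the complexity of learning a predictor on top of the representation that achieves low loss on the task of interest.
Paper reference translation (metadata) (2020-09-15T22:06:58Z)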

The related papers list is generated automatically from the titles and abstracts of papers on this site.

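This page shows information about the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).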