Fugu-MT 論文翻訳(概要): Consistent regression when oblivious outliers overwhelm

論文の概要: Consistent regression when oblivious outliers overwhelm

arxiv url: http://arxiv.org/abs/2009.14774v2
Date: Tue, 25 May 2021 17:20:11 GMT
ステータス: 翻訳完了
システム内更新日: 2022-10-12 22:53:42.361656
Title: Consistent regression when oblivious outliers overwhelm
Title（参考訳）: 閉塞性外乱が圧倒した場合の連続回帰
Authors: Tommaso d'Orsi, Gleb Novikov, David Steurer
Abstract要約: 我々の研究に先立ち、ガウスの$X$でさえ、$beta*$ の見積子は、このモデルでは一貫性がないことが知られていた。ほぼ線形なサンプルサイズと逆ポリノミアル不整分率で一貫した推定が可能であることを示す。ここで研究したモデルは、最初の瞬間さえも持たない重い尾の雑音の分布も捉えている。
参考スコア（独自算出の注目度）: 8.873449722727026
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider a robust linear regression model $y=X\beta^* + \eta$, where an adversary oblivious to the design $X\in \mathbb{R}^{n\times d}$ may choose $\eta$ to corrupt all but an $\alpha$ fraction of the observations $y$ in an arbitrary way. Prior to our work, even for Gaussian $X$, no estimator for $\beta^*$ was known to be consistent in this model except for quadratic sample size $n \gtrsim (d/\alpha)^2$ or for logarithmic inlier fraction $\alpha\ge 1/\log n$. We show that consistent estimation is possible with nearly linear sample size and inverse-polynomial inlier fraction. Concretely, we show that the Huber loss estimator is consistent for every sample size $n= \omega(d/\alpha^2)$ and achieves an error rate of $O(d/\alpha^2n)^{1/2}$. Both bounds are optimal (up to constant factors). Our results extend to designs far beyond the Gaussian case and only require the column span of $X$ to not contain approximately sparse vectors). (similar to the kind of assumption commonly made about the kernel space for compressed sensing). We provide two technically similar proofs. One proof is phrased in terms of strong convexity, extending work of [Tsakonas et al.'14], and particularly short. The other proof highlights a connection between the Huber loss estimator and high-dimensional median computations. In the special case of Gaussian designs, this connection leads us to a strikingly simple algorithm based on computing coordinate-wise medians that achieves optimal guarantees in nearly-linear time, and that can exploit sparsity of $\beta^*$. The model studied here also captures heavy-tailed noise distributions that may not even have a first moment.
Abstract（参考訳）: 頑健な線形回帰モデル $y=X\beta^* + \eta$ を考えると、設計に不利な逆数 $X\in \mathbb{R}^{n\times d}$ は、全てを汚すために$\eta$ を選ぶが、観測の$y$ の分数$\alpha$ は任意の方法で選ぶことができる。我々の研究に先立ち、ガウスの$X$であっても、$\beta^*$ に対する推定子は2次サンプルサイズ $n \gtrsim (d/\alpha)^2$ や対数帰納率 $\alpha\ge 1/\log n$ を除いては、このモデルでは一貫性がないことが知られている。ほぼ線形なサンプルサイズと逆多項不連続分数で一貫した推定が可能であることを示す。具体的には、ハマー損失推定器は、すべてのサンプルサイズ$n= \omega(d/\alpha^2)$に対して整合性を示し、誤り率$O(d/\alpha^2n)^{1/2}$を達成する。どちらの境界も最適である(定数因子まで)。我々の結果はガウスのケースをはるかに超える設計に拡張され、約スパースベクトルを含まないために$X$の列スパンしか必要としない。 (圧縮センシングのカーネル空間に関する一般的な仮定と似ている)。技術的に類似した2つの証明を提供する。 1つの証明は、強い凸性、 [tsakonas et al.'14] の拡張、特に短いという観点で表現される。もう1つの証明は、ハマー損失推定器と高次元中央値計算との接続を強調している。ガウス設計の特別な場合、この接続は、ほぼ線形な時間に最適な保証を達成し、$\beta^*$のスパーシティを活用できる座標的中央値計算に基づく非常に単純なアルゴリズムへと導かれる。ここで研究したモデルは、最初の瞬間さえ持たない重い尾のノイズ分布も捉えている。

関連論文リスト

Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
以前の$mathRd上の分布に関する民間推定者は、次元性の呪いに苦しむ。本稿では,サンプルの複雑さが次元依存性を改善したアルゴリズムを提案する。
論文参考訳（メタデータ） (2024-11-01T17:59:53Z)
Better and Simpler Lower Bounds for Differentially Private Statistical Estimation [7.693388437377614]
任意の$alpha le O(1)$に対して、ガウスの共分散をスペクトル誤差まで推定するには$tildeOmegaleft(fracd3/2alpha varepsilon + fracdalpha2right)$サンプルが必要である。次に、有界な$k$thモーメントで重み付き分布の平均を推定するには$tildeOmegaleft(fracdalphak/(k-1) varepsilon +
論文参考訳（メタデータ） (2023-10-10T04:02:43Z)
A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
非線形測定では、ほとんどの先行結果は一様ではない、すなわち、すべての$mathbfx*$に対してではなく、固定された$mathbfx*$に対して高い確率で保持される。本フレームワークはGCSに1ビット/一様量子化観測と単一インデックスモデルを標準例として適用する。また、指標集合が計量エントロピーが低い製品プロセスに対して、より厳密な境界を生み出す濃度不等式も開発する。
論文参考訳（メタデータ） (2023-09-25T17:54:19Z)
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
モノトン活性化に対する $mathbfxmapstosigma(mathbfwcdotmathbfx)$ の関数について検討する。学習者の目標は仮説ベクトル $mathbfw$ that $F(mathbbw)=C, epsilon$ を高い確率で出力することである。
論文参考訳（メタデータ） (2022-06-17T17:55:43Z)
Mean Estimation in High-Dimensional Binary Markov Gaussian Mixture Models [12.746888269949407]
2進隠れマルコフモデルに対する高次元平均推定問題を考える。ほぼ最小限の誤差率(対数係数まで)を $|theta_*|,delta,d,n$ の関数として確立する。
論文参考訳（メタデータ） (2022-06-06T09:34:04Z)
Robust Testing in High-Dimensional Sparse Models [0.0]
2つの異なる観測モデルの下で高次元スパース信号ベクトルのノルムを頑健にテストする問題を考察する。回帰係数のノルムを確実に検定するアルゴリズムは、少なくとも$n=Omegaleft(min(slog d,1/gamma4)right)サンプルを必要とする。
論文参考訳（メタデータ） (2022-05-16T07:47:22Z)
Structure Learning in Graphical Models from Indirect Observations [17.521712510832558]
本稿では、パラメータ法と非パラメトリック法の両方を用いて、Rp$における$p$次元ランダムベクトル$Xのグラフィカル構造を学習する。温和な条件下では、グラフ構造推定器が正しい構造を得ることができることを示す。
論文参考訳（メタデータ） (2022-05-06T19:24:44Z)
Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers [13.244654316770815]
我々は効率よく計算可能で一貫した推定器を設計する機械を開発する。スパース回帰では、最適なサンプルサイズ$ngsim (klog d)/alpha2$の整合性を達成する。 PCAの文脈では、パラメータ行列上の広いスパイキネス仮定の下で最適な誤差を保証する。
論文参考訳（メタデータ） (2021-11-04T15:59:44Z)
Optimal Mean Estimation without a Variance [103.26777953032537]
本研究では,データ生成分布の分散が存在しない環境での重み付き平均推定問題について検討する。最小の信頼区間を$n,d,delta$の関数として得る推定器を設計する。
論文参考訳（メタデータ） (2020-11-24T22:39:21Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
学習者が生成モデル$Y = langle X,w* rangle + epsilon$から$n$のサンプルにアクセスできるような高次元頑健な線形回帰問題について検討する。 i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance, (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
論文参考訳（メタデータ） (2020-07-16T06:44:44Z)
Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness [151.67113334248464]
我々は、他の攻撃モデルに対してスムースな手法を拡張することは困難であることを示す。我々はCIFARに関する実験結果を示し,その理論を検証した。
論文参考訳（メタデータ） (2020-02-08T22:02:14Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。