Fugu-MT 論文翻訳(概要): Subspace approximation with outliers

論文の概要: Subspace approximation with outliers

arxiv url: http://arxiv.org/abs/2006.16573v1
Date: Tue, 30 Jun 2020 07:22:33 GMT
ステータス: 翻訳完了
システム内更新日: 2022-11-15 06:25:31.825575
Title: Subspace approximation with outliers
Title（参考訳）: 外れ値を持つ部分空間近似
Authors: Amit Deshpande and Rameshwar Pratap
Abstract要約: 本稿では, オフリヤを用いた部分空間近似問題に対するサンプリングに基づいて, 次元削減手法と双基準近似を拡張する方法を示す。我々の結果は、0 delta leq 1 - α$ が満たされる条件が満たされる限り、alpha$ が大きければ成り立つ。
参考スコア（独自算出の注目度）: 6.186553186139257
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The subspace approximation problem with outliers, for given $n$ points in $d$ dimensions $x_{1},\ldots, x_{n} \in R^{d}$, an integer $1 \leq k \leq d$, and an outlier parameter $0 \leq \alpha \leq 1$, is to find a $k$-dimensional linear subspace of $R^{d}$ that minimizes the sum of squared distances to its nearest $(1-\alpha)n$ points. More generally, the $\ell_{p}$ subspace approximation problem with outliers minimizes the sum of $p$-th powers of distances instead of the sum of squared distances. Even the case of robust PCA is non-trivial, and previous work requires additional assumptions on the input. Any multiplicative approximation algorithm for the subspace approximation problem with outliers must solve the robust subspace recovery problem, a special case in which the $(1-\alpha)n$ inliers in the optimal solution are promised to lie exactly on a $k$-dimensional linear subspace. However, robust subspace recovery is Small Set Expansion (SSE)-hard. We show how to extend dimension reduction techniques and bi-criteria approximations based on sampling to the problem of subspace approximation with outliers. To get around the SSE-hardness of robust subspace recovery, we assume that the squared distance error of the optimal $k$-dimensional subspace summed over the optimal $(1-\alpha)n$ inliers is at least $\delta$ times its squared-error summed over all $n$ points, for some $0 < \delta \leq 1 - \alpha$. With this assumption, we give an efficient algorithm to find a subset of $poly(k/\epsilon) \log(1/\delta) \log\log(1/\delta)$ points whose span contains a $k$-dimensional subspace that gives a multiplicative $(1+\epsilon)$-approximation to the optimal solution. The running time of our algorithm is linear in $n$ and $d$. Interestingly, our results hold even when the fraction of outliers $\alpha$ is large, as long as the obvious condition $0 < \delta \leq 1 - \alpha$ is satisfied.
Abstract（参考訳）: 例えば、$d$次元の$x_{1},\ldots, x_{n} \in R^{d}$, an integer $1 \leq k \leq d$, and a outlier parameter $0 \leq \alpha \leq 1$, for given $n$ points in $d$ dimensions $x_{1},\ldots, x_{n} \in R^{d}$, an outlier parameter $0 \leq \alpha \leq 1$, to find $k$-dimensional linear subspace of $R^{d}$は、最も近い$(1-\alpha)n$点への平方距離の和を最小化する。より一般に、外接点を持つ$\ell_{p}$ 部分空間近似問題は、平方距離の和ではなく、距離の$p$-次元の和を最小化する。堅牢なPCAの場合でさえ非自明であり、以前の作業は入力に対して追加の仮定を必要とする。外れ値を持つ部分空間近似問題に対する乗法近似アルゴリズムは、ロバストな部分空間回復問題を解かなければならないが、最適解の 1-\alpha)n$ inlier がちょうど$k$次元の線型部分空間上にあることを約束する特別な場合である。しかし、ロバストなサブスペースリカバリはSmall Set Expansion (SSE)-hardである。そこで本研究では,部分空間近似問題に対するサンプリングに基づく次元縮小法と双クリップ近似法について述べる。ロバストな部分空間復元のsse-ハードネスを回避するために、最適な$(1-\alpha)n$イリアー上で和される最適な$k$-次元部分空間の2乗距離誤差は、いくつかの$0 < \delta \leq 1\alpha$ に対して、少なくとも$\delta$倍の2乗誤差であると仮定する。この仮定により、最適解に対する乗法 $(1+\epsilon)$近似を与える$k$-次元部分空間を含むスパンを持つ $poly(k/\epsilon) \log(1/\delta) \log\log(1/\delta)$point の部分集合を見つける効率的なアルゴリズムを与える。我々のアルゴリズムの実行時間は$n$と$d$で線形である。興味深いことに、この結果は、明らかな条件 $0 < \delta \leq 1 - \alpha$ が満たされる限り、外れ値 $\alpha$ の比率が大きい場合にも成り立つ。

関連論文リスト

Guessing Efficiently for Constrained Subspace Approximation [49.83981776254246]
制約付き部分空間近似のための一般的なフレームワークを導入する。分割制約付き部分空間近似のための新しいアルゴリズムを$k$-meansクラスタリングに適用し、非負行列分解を投影する。
論文参考訳（メタデータ） (2025-04-29T15:56:48Z)
Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
線形スケッチを用いた行列とベクトルノルムの残差誤差推定問題について検討する。これは、前作とほぼ同じスケッチサイズと精度で、経験的にかなり有利であることを示す。また、スパースリカバリ問題に対して$Omega(k2/pn1-2/p)$低いバウンダリを示し、これは$mathrmpoly(log n)$ factorまで厳密である。
論文参考訳（メタデータ） (2024-08-16T02:33:07Z)
Nearly Linear Sparsification of $\ell_p$ Subspace Approximation [47.790126028106734]
$ell_p$ 部分空間近似問題のNP硬度に対処する一般的なアプローチは、強いコアセットを計算することである。我々は、$ell_p$部分空間近似に対して、ランクパラメータ$k$にほぼ最適に依存する強いコアセットを構築するための最初のアルゴリズムを得る。我々の手法は、オフライン設定と同様のバウンダリを持つ$ell_p$サブスペース近似のための、ほぼ最適に近いオンライン強力なコアセットにも繋がる。
論文参考訳（メタデータ） (2024-07-03T16:49:28Z)
Optimal Embedding Dimension for Sparse Subspace Embeddings [4.042707434058959]
ランダム$mtimes n$ matrix $S$は、忘れられない部分空間埋め込み(OSE)である。 mtimes n$ random matrix $S$ with $mgeq (1+theta)d$ is an oblivious subspace embedding with $epsilon = O_theta(1)$。これを使用すれば、現在の行列乗算時間よりも早く適用できる$O(d)$埋め込み次元で、最初の難解な部分空間埋め込みを構築することができる。
論文参考訳（メタデータ） (2023-11-17T18:01:58Z)
Sketching Algorithms and Lower Bounds for Ridge Regression [65.0720777731368]
リッジ回帰問題に対する1+varepsilon$近似解を計算するスケッチベース反復アルゴリズムを提案する。また,このアルゴリズムがカーネルリッジ回帰の高速化に有効であることを示す。
論文参考訳（メタデータ） (2022-04-13T22:18:47Z)
Low-Rank Approximation with $1/\epsilon^{1/3}$ Matrix-Vector Products [58.05771390012827]
我々は、任意のSchatten-$p$ノルムの下で、低ランク近似のためのクリロフ部分空間に基づく反復法について研究する。我々の主な成果は、$tildeO(k/sqrtepsilon)$ matrix-vector productのみを使用するアルゴリズムである。
論文参考訳（メタデータ） (2022-02-10T16:10:41Z)
Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
対象ベクトルの少数のエントリのみを問合せすることを目的とした線形回帰のためのアクティブサンプリングアルゴリズムについて検討する。我々はこの$d$への依存が対数的要因まで最適であることを示す。また、損失関数に対して最初の全感度上界$O(dmax1,p/2log2 n)$を提供し、最大で$p$成長する。
論文参考訳（メタデータ） (2021-11-09T00:20:01Z)
Nonconvex-Nonconcave Min-Max Optimization with a Small Maximization Domain [11.562923882714093]
Y f(x,y) における max_y の $min_x 形式の最適化問題における近似一階定常点の探索問題について検討する。我々のアプローチは、関数 $f(x,cdot)$ を $k 次テイラー近似($y$ で)に置き換え、ほぼ定常点を $Y$ で見つけることに依存する。
論文参考訳（メタデータ） (2021-10-08T07:46:18Z)
Approximate Maximum Halfspace Discrepancy [6.35821487778241]
範囲空間 $(X, MathcalH_d)$ ここで、$X の部分集合 mathbbRd$ と $mathcalH_d$ は、$d$ハーフスペースで定義される範囲の集合である。数学 H_d$ における各半空間 $h に対して、$Phi(h)$ は、赤の分数と青の点の分数の差を測る関数 $Phi(h)$ を定義する。
論文参考訳（メタデータ） (2021-06-25T19:14:45Z)
Locally Private $k$-Means Clustering with Constant Multiplicative Approximation and Near-Optimal Additive Error [10.632986841188]
2つの新しいアルゴリズムで加算誤差の上と下の境界における$n$の指数のギャップを埋める。局所的にプライベートな$k$-meansの問題を、定数係数乗算近似を持つ一定数のラウンドで解くことができる。
論文参考訳（メタデータ） (2021-05-31T14:41:40Z)
Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss [76.02734481158458]
最悪の場合、行列に対する良いランク-$k$近似を得るには、任意に大きい$nOmega(1)$列数が必要であることが知られている。最小かつ現実的な分布設定では、ほぼ線形な実行時間を持つ$(k/epsilon)$-approximationとpoly$(k/epsilon)+O(klog n)$ columnsが得られる。これは、エントリワイズで$(k/epsilon)$-approximationを達成するための任意の種類の最初のアルゴリズムである
論文参考訳（メタデータ） (2020-04-16T22:57:06Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。