Fugu-MT Paper Translation (Abstract): Beyond Worst-Case Dimensionality Reduction for Sparse Vectors

Paper summary: Beyond Worst-Case Dimensionality Reduction for Sparse Vectors

arxiv url: http://arxiv.org/abs/2502.19865v1
Date: Thu, 27 Feb 2025 08:17:47 GMT
Status: Translation complete
Last updated in system: 2025-02-28 15:15:46.842206
Title: Beyond Worst-Case Dimensionality Reduction for Sparse Vectors
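Title (reference translation): Beyond Worst-Case Dimensionality Reduction for Sparse Vectors
Authors: Sandeep Silwal, David P. Woodruff, Qiuyi Zhang
Abstract summary: We study beyond worst-case dimensionality reduction for $s$-sparse vectors. For any collection $X$ of $s$-sparse vectors in $\mathbb{R}^d$, there exists a linear map to $\mathbb{R}^{O(s^2)}$ that exactly preserves the norm of $99\%$ of the vectors in $X$ in any $\ell_p$ norm. We show that both the non-linearity of $f$ and the non-negativity of $X$ are necessary.
Reference score (independently computed attention score): 47.927989749887864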
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We study beyond worst-case dimensionality reduction for $s$-sparse vectors. Our work is divided into two parts, each focusing on a different facet of beyond worst-case analysis: We first consider average-case guarantees. A folklore upper bound based on the birthday-paradox states: For any collection $X$ of $s$-sparse vectors in $\mathbb{R}^d$, there exists a linear map to $\mathbb{R}^{O(s^2)}$ which \emph{exactly} preserves the norm of $99\%$ of the vectors in $X$ in any $\ell_p$ norm (as opposed to the usual setting where guarantees hold for all vectors). We give lower bounds showing that this is indeed optimal in many settings: any oblivious linear map satisfying similar average-case guarantees must map to $\Omega(s^2)$ dimensions. The same lower bound also holds for a wide class of smooth maps, including `encoder-decoder schemes', where we compare the norm of the original vector to that of a smooth function of the embedding. These lower bounds reveal a separation result, as an upper bound of $O(s \log(d))$ is possible if we instead use arbitrary (possibly non-smooth) functions, e.g., via compressed sensing algorithms. Given these lower bounds, we specialize to sparse \emph{non-negative} vectors. For a dataset $X$ of non-negative $s$-sparse vectors and any $p \ge 1$, we can non-linearly embed $X$ to $O(s\log(|X|s)/\epsilon^2)$ dimensions while preserving all pairwise distances in $\ell_p$ norm up to $1\pm \epsilon$, with no dependence on $p$. Surprisingly, the non-negativity assumption enables much smaller embeddings than arbitrary sparse vectors, where the best known bounds suffer exponential dependence. Our map also guarantees \emph{exact} dimensionality reduction for $\ell_{\infty}$ by embedding into $O(s\log |X|)$ dimensions, which is tight. We show that both the non-linearity of $f$ and the non-negativity of $X$ are necessary, and provide downstream algorithmic improvements.
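Abstract (reference translation): We study beyond worst-case dimensionality reduction for $s$-sparse vectors. Our work is divided into two parts, each focusing on a different facet of beyond worst-case analysis. For any collection $X$ of $s$-sparse vectors in $\mathbb{R}^d$, there exists a linear map to $\mathbb{R}^{O(s^2)}$ that exactly preserves the norm of $99\%$ of the vectors in $X$ in any $\ell_p$ norm. Any oblivious linear map satisfying similar average-case guarantees must map to $\Omega(s^2)$ dimensions. The same lower bound also holds for a wide class of smooth maps, including 'encoder-decoder schemes', where we compare the norm of the original vector to that of a smooth function of the embedding. These lower bounds reveal a separation result, as an upper bound of $O(s \log(d))$ is possible if we instead use arbitrary (possibly non-smooth) functions, e.g., via compressed sensing algorithms. Given these lower bounds, we specialize to sparse \emph{non-negative} vectors. For a dataset $X$ of non-negative $s$-sparse vectors and any $p \ge 1$, we can non-linearly embed $X$ into $O(s\log(|X|s)/\epsilon^2)$ dimensions while preserving all pairwise distances in $\ell_p$ norm up to $1\pm \epsilon$, with no dependence on $p$. Surprisingly, the non-negativity assumption enables much smaller embeddings than for arbitrary sparse vectors. Our map also guarantees \emph{exact} dimensionality reduction for $\ell_{\infty}$ by embedding into $O(s\log |X|)$ dimensions, which is tight. We show that both the non-linearity of $f$ and the non-negativity of $X$ are necessary, and provide downstream algorithmic improvements.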
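
The folklore birthday-paradox bound mentioned in the abstract can be illustrated with a small sketch. This is not code from the paper; it assumes the standard argument in which each of the $d$ coordinates is hashed uniformly at random into one of $m$ buckets and colliding entries are summed. For a fixed $s$-sparse vector, a union bound over the $\binom{s}{2}$ pairs of support coordinates gives collision probability at most $\binom{s}{2}/m$, so $m = 50 s^2$ buckets suffice for a 1% failure rate. When no two support coordinates collide, the image is a rearrangement of the nonzero entries, so every $\ell_p$ norm is unchanged. All function names and parameters below (`num_buckets`, `make_bucket_map`, `embed`, `inv_delta`, and so on) are hypothetical and chosen only for illustration.

```python
import random


def num_buckets(s, inv_delta=100):
    """Smallest integer m with s^2 / (2m) <= 1/inv_delta, i.e. ceil(s^2 * inv_delta / 2)."""
    return (s * s * inv_delta + 1) // 2


def make_bucket_map(d, m, seed=0):
    """Assign each of the d input coordinates to one of m buckets uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(m) for _ in range(d)]


def embed(sparse_vec, bucket_of, m):
    """Linear map R^d -> R^m: add each nonzero entry into its coordinate's bucket.

    sparse_vec is a dict {coordinate index: value} holding the nonzero entries.
    """
    y = [0.0] * m
    for i, v in sparse_vec.items():
        y[bucket_of[i]] += v
    return y


def is_collision_free(sparse_vec, bucket_of):
    """True if all support coordinates of sparse_vec land in distinct buckets."""
    buckets = [bucket_of[i] for i in sparse_vec]
    return len(set(buckets)) == len(buckets)


def lp_norm(values, p):
    """l_p norm of an iterable of numbers; p may be float('inf')."""
    if p == float("inf"):
        return max((abs(v) for v in values), default=0.0)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


if __name__ == "__main__":
    d, s = 10_000, 5
    m = num_buckets(s)  # 50 * s^2 buckets
    bucket_of = make_bucket_map(d, m, seed=0)

    rng = random.Random(1)
    data = [
        {i: rng.uniform(-1.0, 1.0) for i in rng.sample(range(d), s)}
        for _ in range(1000)
    ]
    ok = sum(is_collision_free(x, bucket_of) for x in data)
    print(f"m = {m}, collision-free fraction = {ok / len(data):.3f}")
```

The map is oblivious in the sense that `make_bucket_map` never looks at the data. The 99% guarantee holds in expectation over the random bucket assignment, which is why a suitable map exists for any fixed collection $X$; a single random draw on a small sample can land slightly below that fraction.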

Related papers