Fugu-MT Paper Translation (Abstract): Sub-quadratic Algorithms for Kernel Matrices via Kernel Density Estimation

Paper overview: Sub-quadratic Algorithms for Kernel Matrices via Kernel Density Estimation

arxiv url: http://arxiv.org/abs/2212.00642v1
Date: Thu, 1 Dec 2022 16:42:56 GMT
Status: translation complete
System update date: 2022-12-02 17:43:34.065056
Title: Sub-quadratic Algorithms for Kernel Matrices via Kernel Density Estimation
Title (reference translation): Sub-quadratic Algorithms for Kernel Matrices via Kernel Density Estimation
Authors: Ainesh Bakshi, Piotr Indyk, Praneeth Kacham, Sandeep Silwal and Samson Zhou
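Abstract summary: We give efficient reductions from $\textit{weighted edge sampling}$ on kernel graphs, $\textit{simulating random walks}$ on kernel graphs, and $\textit{importance sampling}$ on matrices to Kernel Density Estimation. Our reductions are the central ingredient in each of our applications, and we believe they may be of independent interest.
Reference score (independently computed attention score): 24.166833799353476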
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is efficiency -- given $n$ input points, most kernel-based algorithms need to materialize the full $n \times n$ kernel matrix before performing any subsequent computation, thus incurring $\Omega(n^2)$ runtime. Breaking this quadratic barrier for various problems has therefore been a subject of extensive research efforts. We break the quadratic barrier and obtain $\textit{subquadratic}$ time algorithms for several fundamental linear-algebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving linear systems, local clustering, low-rank approximation, arboricity estimation and counting weighted triangles. We build on the recent Kernel Density Estimation framework, which (after preprocessing in time subquadratic in $n$) can return estimates of row/column sums of the kernel matrix. In particular, we develop efficient reductions from $\textit{weighted vertex}$ and $\textit{weighted edge sampling}$ on kernel graphs, $\textit{simulating random walks}$ on kernel graphs, and $\textit{importance sampling}$ on matrices to Kernel Density Estimation and show that we can generate samples from these distributions in $\textit{sublinear}$ (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on low-rank approximation (LRA) and spectral sparsification, where we observe a $\textbf{9x}$ decrease in the number of kernel evaluations over baselines for LRA and a $\textbf{41x}$ reduction in the graph size for spectral sparsification.
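Abstract (reference translation): Kernel matrices, as well as the weighted graphs they represent, are ubiquitous objects in machine learning, statistics, and other related fields. The main drawback of kernel methods (learning and inference using kernel matrices) is efficiency: given $n$ input points, most kernel-based algorithms must materialize the full $n \times n$ kernel matrix before performing any subsequent computation. Breaking this quadratic barrier has therefore been the subject of extensive research. We break the quadratic barrier and obtain $\textit{subquadratic}$ time algorithms for several fundamental linear-algebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving linear systems, local clustering, low-rank approximation, arboricity estimation, and counting weighted triangles. We build on the recent Kernel Density Estimation framework, which (after preprocessing in time subquadratic in $n$) can return estimates of row/column sums of the kernel matrix. In particular, by reducing $\textit{weighted vertex}$ and $\textit{weighted edge sampling}$ on kernel graphs, $\textit{simulating random walks}$ on kernel graphs, and $\textit{importance sampling}$ on matrices to Kernel Density Estimation, we show that samples from these distributions can be generated in $\textit{sublinear}$ (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications, and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on low-rank approximation (LRA) and spectral sparsification, observing a $\textbf{9x}$ decrease in the number of kernel evaluations over baselines for LRA and a $\textbf{41x}$ reduction in the graph size for spectral sparsification.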
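
The reduction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian kernel, the `bandwidth` parameter, and all function names are assumptions made for illustration, and `kde_row_sum` computes each row sum exactly in O(n) time as a stand-in for the approximate, faster Kernel Density Estimation queries the paper relies on. The sketch only shows why row sums are the right primitive: a vertex drawn in proportion to its row sum, followed by a neighbor drawn in proportion to the kernel value, yields an edge drawn in proportion to its weight.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel value between two points (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kde_row_sum(points, i):
    """Hypothetical stand-in for a KDE query: the weighted degree of vertex i,
    i.e. the i-th row sum of the kernel matrix without the diagonal entry.
    Computed exactly here; a real KDE structure would return an estimate."""
    return sum(gaussian_kernel(points[i], p) for j, p in enumerate(points) if j != i)


def sample_weighted_vertex(points, rng):
    """Draw a vertex with probability proportional to its weighted degree."""
    degrees = [kde_row_sum(points, i) for i in range(len(points))]
    return rng.choices(range(len(points)), weights=degrees, k=1)[0]


def sample_weighted_edge(points, rng):
    """Draw an edge {i, j} with probability proportional to k(x_i, x_j).

    Vertex i is drawn with probability deg(i) / sum(deg), then neighbor j with
    probability k(x_i, x_j) / deg(i), so the ordered pair (i, j) has probability
    k(x_i, x_j) / sum(deg)."""
    i = sample_weighted_vertex(points, rng)
    weights = [0.0 if j == i else gaussian_kernel(points[i], p)
               for j, p in enumerate(points)]
    j = rng.choices(range(len(points)), weights=weights, k=1)[0]
    return i, j


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
    print(sample_weighted_edge(pts, rng))
```

Because every row sum is computed exactly, this sketch still takes quadratic time overall; the subquadratic running times claimed in the abstract depend on replacing `kde_row_sum` with approximate KDE queries and on a faster neighbor-sampling step, neither of which is shown here.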

Related papers