Fugu-MT 論文翻訳(概要): Linear Time Sinkhorn Divergences using Positive Features

論文の概要: Linear Time Sinkhorn Divergences using Positive Features

arxiv url: http://arxiv.org/abs/2006.07057v3
Date: Mon, 26 Oct 2020 07:55:17 GMT
ステータス: 翻訳完了
システム内更新日: 2022-11-22 02:39:05.320397
Title: Linear Time Sinkhorn Divergences using Positive Features
Title（参考訳）: 正の特徴を用いた線形時間シンクホーン分岐
Authors: Meyer Scetbon and Marco Cuturi
Abstract要約: エントロピー正則化で最適な輸送を解くには、ベクトルに繰り返し適用される$ntimes n$ kernel matrixを計算する必要がある。代わりに、$c(x,y)=-logdotpvarphi(x)varphi(y)$ ここで$varphi$は、地上空間から正のorthant $RRr_+$への写像であり、$rll n$である。
参考スコア（独自算出の注目度）: 51.50788603386766
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Although Sinkhorn divergences are now routinely used in data sciences to compare probability distributions, the computational effort required to compute them remains expensive, growing in general quadratically in the size $n$ of the support of these distributions. Indeed, solving optimal transport (OT) with an entropic regularization requires computing a $n\times n$ kernel matrix (the neg-exponential of a $n\times n$ pairwise ground cost matrix) that is repeatedly applied to a vector. We propose to use instead ground costs of the form $c(x,y)=-\log\dotp{\varphi(x)}{\varphi(y)}$ where $\varphi$ is a map from the ground space onto the positive orthant $\RR^r_+$, with $r\ll n$. This choice yields, equivalently, a kernel $k(x,y)=\dotp{\varphi(x)}{\varphi(y)}$, and ensures that the cost of Sinkhorn iterations scales as $O(nr)$. We show that usual cost functions can be approximated using this form. Additionaly, we take advantage of the fact that our approach yields approximation that remain fully differentiable with respect to input distributions, as opposed to previously proposed adaptive low-rank approximations of the kernel matrix, to train a faster variant of OT-GAN \cite{salimans2018improving}.
Abstract（参考訳）: シンクホーンのダイバージェンスは現在、確率分布を比較するためにデータサイエンスで日常的に使われているが、それらの計算に必要な計算労力は依然として高価であり、これらの分布のサポートの2倍の大きさで成長している。実際、エントロピー正規化による最適輸送(ot)の解くには、ベクトルに繰り返し適用される$n\times n$ kernel matrix($n\times n$ pairwise ground cost matrixの neg-exponential)を計算する必要がある。代わりに$c(x,y)=-\log\dotp{\varphi という形式の地上費用を使う。 (x)}{\varphi (y)}$ where $\varphi$ は、地上空間から正のorthant $\RR^r_+$ への写像で、$r\ll n$ である。この選択は、等価に、カーネル $k(x,y)=\dotp{\varphi を得る。 (x)}{\varphi (y)$、そして、シンクホーン反復のコストが$o(nr)$であることを保証する。この形式を用いて通常のコスト関数を近似できることを示す。さらに,従来提案されていたカーネル行列の適応型低ランク近似とは対照的に,入力分布に対して完全に微分可能な近似が得られ,OT-GAN \cite{salimans2018improving}のより高速な変種を訓練する。

関連論文リスト

Upper Bounds for Learning in Reproducing Kernel Hilbert Spaces for Orbits of an Iterated Function System [1.1510009152620668]
学習理論における重要な問題は、ある入力$x$と対応する出力$y$の関係を近似した関数$f$を計算することである。この近似はサンプル点 $(x_t,y_t)_t=1m$ に基づいている。
論文参考訳（メタデータ） (2024-10-10T20:34:22Z)
Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
線形スケッチを用いた行列とベクトルノルムの残差誤差推定問題について検討する。これは、前作とほぼ同じスケッチサイズと精度で、経験的にかなり有利であることを示す。また、スパースリカバリ問題に対して$Omega(k2/pn1-2/p)$低いバウンダリを示し、これは$mathrmpoly(log n)$ factorまで厳密である。
論文参考訳（メタデータ） (2024-08-16T02:33:07Z)
A General Theory for Kernel Packets: from state space model to compactly supported basis [16.235214685688227]
GP の $m$-dimensional SS モデルの定式化は、一般右 Kernel Packet (KP) として導入する概念と等価であることを示す。 KP は GP 予測時間を $mathcalO(log n)$ または $mathcalO(1)$ に改善し、GP の導関数やカーネル乗算を含むより広範なアプリケーションを可能にし、分散データに対して多次元加法および製品カーネルに一般化することができる。
論文参考訳（メタデータ） (2024-02-06T14:12:46Z)
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
モノトン活性化に対する $mathbfxmapstosigma(mathbfwcdotmathbfx)$ の関数について検討する。学習者の目標は仮説ベクトル $mathbfw$ that $F(mathbbw)=C, epsilon$ を高い確率で出力することである。
論文参考訳（メタデータ） (2022-06-17T17:55:43Z)
A Law of Robustness beyond Isoperimetry [84.33752026418045]
我々は、任意の分布上でニューラルネットワークパラメータを補間する頑健性の低い$Omega(sqrtn/p)$を証明した。次に、$n=mathrmpoly(d)$のとき、スムーズなデータに対する過度なパラメータ化の利点を示す。我々は、$n=exp(omega(d))$ のとき、$O(1)$-Lipschitz の頑健な補間関数の存在を否定する。
論文参考訳（メタデータ） (2022-02-23T16:10:23Z)
Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
我々は、$bf K$ の固有スペクトルが$bf w$ の i.d. 成分の分布とは独立であることを示す。 3次ランダム特徴(TRF)と呼ばれる新しいランダム手法を提案する。提案したランダムな特徴の計算には乗算が不要であり、古典的なランダムな特徴に比べてストレージに$b$のコストがかかる。
論文参考訳（メタデータ） (2021-10-05T09:33:49Z)
Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns [50.053491972003656]
関数 $texttr(AR(z))$, for $R(z) = (frac1nXXT- zI_p)-1$ and $Ain mathcal M_p$ deterministic, have a standard deviation of order $O(|A|_* / sqrt n)$. ここでは、$|mathbb E[R(z)] - tilde R(z)|_F を示す。
論文参考訳（メタデータ） (2021-09-06T14:21:43Z)
Kernel Thinning [26.25415159542831]
カーネルの薄型化は、サンプリングや標準的な薄型化よりも効率的に$mathbbP$を圧縮するための新しい手順である。我々は、ガウス、マタン、およびB-スプライン核に対する明示的な非漸近的な最大誤差境界を導出する。
論文参考訳（メタデータ） (2021-05-12T17:56:42Z)
Two-way kernel matrix puncturing: towards resource-efficient PCA and spectral clustering [43.50783459690612]
この方法は、データマトリックス$XinmathbbCptimes n$と対応するカーネル(Gram)マトリックス$K$の両方をBernoulliマスクを介してランダムに「切断」する。我々は、GAN生成した画像データベースを実証的に確認し、データを劇的にパンクし、巨大な計算とストレージのゲインを提供することができることを確認した。
論文参考訳（メタデータ） (2021-02-24T14:01:58Z)
Convergence of Sparse Variational Inference in Gaussian Processes Regression [29.636483122130027]
計算コストが$mathcalO(log N)2D(log N)2)$の手法を推論に利用できることを示す。
論文参考訳（メタデータ） (2020-08-01T19:23:34Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。