Fugu-MT 論文翻訳(概要): Implicit bias of any algorithm: bounding bias via margin

論文の概要: Implicit bias of any algorithm: bounding bias via margin

arxiv url: http://arxiv.org/abs/2011.06550v4
Date: Mon, 23 Nov 2020 17:12:06 GMT
ステータス: 翻訳完了
システム内更新日: 2022-09-26 06:42:09.765598
Title: Implicit bias of any algorithm: bounding bias via margin
Title（参考訳）: 任意のアルゴリズムの帰属バイアス:マージンによる有界バイアス
Authors: Elvis Dohmatob
Abstract要約: マージン関数 $gamma$ は非滑らかなクルディカ・ロジャシエヴィチ不等式指数が 1/2$ を満たすことを証明している。私たちの研究は、そのマージンの観点からアルゴリズムの暗黙のバイアスを分析する一般的なツールを提供します。
参考スコア（独自算出の注目度）: 13.247278149124757
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Consider $n$ points $x_1,\ldots,x_n$ in finite-dimensional euclidean space, each having one of two colors. Suppose there exists a separating hyperplane (identified with its unit normal vector $w)$ for the points, i.e a hyperplane such that points of same color lie on the same side of the hyperplane. We measure the quality of such a hyperplane by its margin $\gamma(w)$, defined as minimum distance between any of the points $x_i$ and the hyperplane. In this paper, we prove that the margin function $\gamma$ satisfies a nonsmooth Kurdyka-Lojasiewicz inequality with exponent $1/2$. This result has far-reaching consequences. For example, let $\gamma^{opt}$ be the maximum possible margin for the problem and let $w^{opt}$ be the parameter for the hyperplane which attains this value. Given any other separating hyperplane with parameter $w$, let $d(w):=\|w-w^{opt}\|$ be the euclidean distance between $w$ and $w^{opt}$, also called the bias of $w$. From the previous KL-inequality, we deduce that $(\gamma^{opt}-\gamma(w)) / R \le d(w) \le 2\sqrt{(\gamma^{opt}-\gamma(w))/\gamma^{opt}}$, where $R:=\max_i \|x_i\|$ is the maximum distance of the points $x_i$ from the origin. Consequently, for any optimization algorithm (gradient-descent or not), the bias of the iterates converges at least as fast as the square-root of the rate of their convergence of the margin. Thus, our work provides a generic tool for analyzing the implicit bias of any algorithm in terms of its margin, in situations where a specialized analysis might not be available: it is sufficient to establish a good rate for converge of the margin, a task which is usually much easier.
Abstract（参考訳）: 有限次元ユークリッド空間において、$n$の点 $x_1,\ldots,x_n$ を考える。点に対して分離超平面(単位正規ベクトル $w)$ が存在し、すなわち同じ色の点が超平面の同じ側にあるような超平面が存在すると仮定する。そのような超平面の質をマージン $\gamma(w)$ で測定し、任意の点 $x_i$ と超平面の間の最小距離として定義する。本稿では, マージン関数 $\gamma$ が非滑らかなkurdyka-lojasiewicz不等式を満たすことを証明した。この結果は遥かに大きな結果をもたらす。例えば、$\gamma^{opt}$ を問題の最大マージンとし、$w^{opt}$ をこの値を達成する超平面のパラメータとする。パラメータ $w$ を持つ他の分離超平面が与えられたとき、$d(w):=\|w-w^{opt}\|$ を $w$ と $w^{opt}$ の間のユークリッド距離とする。以前の KL-不等式から、$(\gamma^{opt}-\gamma(w)) / R \le d(w) \le 2\sqrt{(\gamma^{opt}-\gamma(w))/\gamma^{opt}}$, ここで、$R:=\max_i \|x_i\|$ は原点からの点 $x_i$ の最大距離である。したがって、任意の最適化アルゴリズム(漸近的か否かにかかわらず)において、繰り返しの偏りは、マージンの収束率の平方根の少なくとも1倍の速さで収束する。したがって、本研究は、特別な解析が利用できない状況において、マージンの観点から任意のアルゴリズムの暗黙のバイアスを分析するための汎用的なツールを提供する:マージンの収束率を確立するのに十分である。

関連論文リスト

Sharp Gap-Dependent Variance-Aware Regret Bounds for Tabular MDPs [54.28273395444243]
我々は,モノトニック値 Omega (MVP) アルゴリズムが,差分を考慮した差分依存残差境界を$tildeOleft(left(sum_Delta_h(s,a)>0 fracH2 log K land MathttVar_maxtextc$。
論文参考訳（メタデータ） (2025-06-06T20:33:57Z)
The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
この問題は通信複雑性のランダム化を$Omega(frac1kcdot n2log|mathbbF|)$とする。アプリケーションとして、$k$パスを持つ任意のストリーミングアルゴリズムに対して、$Omega(frac1kcdot n2log|mathbbF|)$スペースローバウンドを得る。
論文参考訳（メタデータ） (2024-10-26T06:21:42Z)
Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
線形スケッチを用いた行列とベクトルノルムの残差誤差推定問題について検討する。これは、前作とほぼ同じスケッチサイズと精度で、経験的にかなり有利であることを示す。また、スパースリカバリ問題に対して$Omega(k2/pn1-2/p)$低いバウンダリを示し、これは$mathrmpoly(log n)$ factorまで厳密である。
論文参考訳（メタデータ） (2024-08-16T02:33:07Z)
Partially Unitary Learning [0.0]
ヒルベルト空間の最適写像 $IN$ of $left|psirightrangle$ と $OUT$ of $left|phirightrangle$ が提示される。この最適化問題の大域的な最大値を求める反復アルゴリズムを開発した。
論文参考訳（メタデータ） (2024-05-16T17:13:55Z)
Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces [2.687607197645453]
次元$Theta(log n)$ が $(sqrt3/2-o(1))$hard である場合でさえ、FPTアルゴリズムを近似する。また、次元 $Theta(log n)$ が $(sqrt3/2-o(1))$hard であるような特別な場合でさえ、FPTアルゴリズムを近似することを示す。
論文参考訳（メタデータ） (2023-05-12T08:43:28Z)
On the Complexity of Decentralized Smooth Nonconvex Finite-Sum Optimization [21.334985032433778]
分散最適化問題 $min_bf xinmathbb Rd f(bf x)triq frac1msum_i=1m f_i(bf x)triq frac1nsum_j=1n。
論文参考訳（メタデータ） (2022-10-25T11:37:11Z)
Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity [121.83085611327654]
積 $r:=p + q$, ここで$r$は$mu$-strong convex類似性である。エージェントの通信やローカルコールにマスターされた問題を解決する方法を提案する。提案手法は$mathcalO(sqrtL_q/mu)$法よりもはるかにシャープである。
論文参考訳（メタデータ） (2022-05-30T14:28:02Z)
On the Self-Penalization Phenomenon in Feature Selection [69.16452769334367]
カーネル群に基づく暗黙の空間性誘導機構について述べる。アプリケーションとしては、この疎結合誘導機構を使用して、特徴選択に一貫性のあるアルゴリズムを構築します。
論文参考訳（メタデータ） (2021-10-12T09:36:41Z)
Nonconvex-Nonconcave Min-Max Optimization with a Small Maximization Domain [11.562923882714093]
Y f(x,y) における max_y の $min_x 形式の最適化問題における近似一階定常点の探索問題について検討する。我々のアプローチは、関数 $f(x,cdot)$ を $k 次テイラー近似($y$ で)に置き換え、ほぼ定常点を $Y$ で見つけることに依存する。
論文参考訳（メタデータ） (2021-10-08T07:46:18Z)
Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns [50.053491972003656]
関数 $texttr(AR(z))$, for $R(z) = (frac1nXXT- zI_p)-1$ and $Ain mathcal M_p$ deterministic, have a standard deviation of order $O(|A|_* / sqrt n)$. ここでは、$|mathbb E[R(z)] - tilde R(z)|_F を示す。
論文参考訳（メタデータ） (2021-09-06T14:21:43Z)
Minimax Optimal Regression over Sobolev Spaces via Laplacian Regularization on Neighborhood Graphs [25.597646488273558]
非パラメトリック回帰に対するグラフに基づくアプローチであるラプラシア平滑化の統計的性質について検討する。ラプラシアン滑らか化が多様体適応であることを証明する。
論文参考訳（メタデータ） (2021-06-03T01:20:41Z)
Convergence of Graph Laplacian with kNN Self-tuned Kernels [14.645468999921961]
自己チューニングされたカーネルは、各点に$sigma_i$ を $k$-nearest neighbor (kNN) 距離で適応的に設定する。本稿では、グラフラプラシアン作用素$L_N$を、kNN自己チューニングカーネルの新しい族に対する多様体(重み付き)ラプラシアンに収束することを証明する。
論文参考訳（メタデータ） (2020-11-03T04:55:33Z)
Maximizing Determinants under Matroid Constraints [69.25768526213689]
我々は、$det(sum_i in Sv_i v_i v_itop)$が最大になるような基底を$S$$$$M$とする問題を研究する。この問題は、実験的なデザイン、商品の公平な割り当て、ネットワーク設計、機械学習など、さまざまな分野に現れている。
論文参考訳（メタデータ） (2020-04-16T19:16:38Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。