Fugu-MT 論文翻訳(概要): Self-Normalized Martingales and Uniform Regret Bounds for Linear Regression

論文の概要: Self-Normalized Martingales and Uniform Regret Bounds for Linear Regression

arxiv url: http://arxiv.org/abs/2605.01628v1
Date: Sat, 02 May 2026 22:39:00 GMT
ステータス: 翻訳完了
システム内更新日: 2026-05-05 20:33:49.857481
Title: Self-Normalized Martingales and Uniform Regret Bounds for Linear Regression
Title（参考訳）: 線形回帰に対する自己Normalized Martingales と一様回帰境界
Authors: Fan Chen, Jian Qian, Alexander Rakhlin, Nikita Zhivotovskiy,
Abstract要約: 自己正規化マルティンガレのスケール不変上界が可能であることを示す。通常の正規化ペナルティを含まない自己正規化濃度不等式を導出する。
参考スコア（独自算出の注目度）: 65.82017723631897
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Self-normalized martingale inequalities lie at the heart of confidence ellipsoids for online least squares and, more broadly, many bandit and reinforcement-learning results. Yet existing vector and scalar results typically rely on bounded covariates and an explicit regularization matrix, producing bounds that are \emph{not scale-invariant}: although the self-normalized quantity is scale-invariant by definition, its standard upper bounds are not. We characterize when scale-invariant upper bounds on self-normalized martingales are possible. Without further assumptions, we prove that nontrivial scale-invariant bounds exist only in dimension $d=1$; moreover, in $d=1$ we obtain $O(\log T)$ scale-invariant self-normalized bounds without any assumptions on the covariates. In contrast, for $d>1$ we show that no nontrivial scale-invariant bound can hold in full generality. We then connect this dichotomy to \emph{doubly-uniform} regret in online linear regression (i.e., regret bounds that are simultaneously independent of the covariate scale and the comparator norm) and use it to resolve the open question of Gaillard, Gerchinovitz, Huard, and Stoltz, \emph{``Uniform regret bounds over $\mathbb{R}^d$ for the sequential linear regression problem with the square loss''} (ALT 2019): in $d=1$ we give an explicit algorithm with $O(\log T)$ doubly-uniform regret, whereas for $d>1$ sublinear doubly-uniform regret is impossible. Finally, under a natural \emph{smoothness} condition (bounded Radon--Nikodym derivatives of the conditional covariate laws with respect to a fixed base measure), we recover sublinear regret for $d>1$ without bounded covariates and derive a self-normalized concentration inequality free of the usual regularization penalties, yielding arguably a first natural scale-invariant bound for adaptive, non-i.i.d. vector martingales.
Abstract（参考訳）: 自己正規化マーチンゲーレの不等式は、オンラインの最小二乗の信頼楕円体の中心にあり、より広範に、多くのバンディットと強化学習の結果である。しかし、既存のベクトルとスカラーの結果は、典型的には有界な共変量と明示的な正則化行列に頼り、その境界は \emph{not scale-invariant} となる: 自己正規化量は定義によってスケール不変であるが、その標準上界はそうではない。自己正規化マルティンガレのスケール不変上界が可能である場合に特徴付ける。さらなる仮定がなければ、非自明なスケール不変境界は次元$d=1$でしか存在せず、さらに$d=1$では、余変数に関する仮定なしでスケール不変な自己正規化境界を$O(\log T)$で取得する。対照的に、$d>1$ の場合、非自明なスケール不変境界が完全な一般性で成り立たないことを示す。次に、この二分法をオンライン線形回帰における \emph{doubly-uniform} の後悔(すなわち、共変量スケールとコンパレータノルムから同時に独立な後悔境界)に結び付け、ガイヤール、ゲルキノヴィッツ、ヒューラード、ストルツの開問題を解決するためにそれを使う: $\mathbb{R}^d$ 平方損失を伴う逐次線形回帰問題に対する $\mathbb{R}^d$ に対して $d=1$ の明示的アルゴリズムを $O(\log T)$d(\log T)$dubly-uniform の後悔を与える。最後に、自然な 'emph{smoothness} 条件(固定基底測度に関する条件共変法則の有界ラドン-ニコディム微分)の下で、有界共変量のない$d>1$の部分線型後悔を回復し、通常の正則化法則を含まない自己正規化集中不等式を導出し、適応的、非非非等式ベクトルマーチンガレに対して有界な最初の自然スケール不変量を与える。

論文の概要: Self-Normalized Martingales and Uniform Regret Bounds for Linear Regression

関連論文リスト