Fugu-MT 論文翻訳(概要): Optimal Contextual Pricing and Extensions

論文の概要: Optimal Contextual Pricing and Extensions

arxiv url: http://arxiv.org/abs/2003.01703v3
Date: Tue, 23 Feb 2021 23:33:43 GMT
ステータス: 翻訳完了
システム内更新日: 2022-12-26 23:26:54.298470
Title: Optimal Contextual Pricing and Extensions
Title（参考訳）: 最適コンテキスト価格と拡張
Authors: Allen Liu, Renato Paes Leme, Jon Schneider
Abstract要約: このアルゴリズムは$Omega(d log T)$ lower bound up to the $d log d$ additive factor。これらのアルゴリズムは、凸領域のスタイナー固有値を様々なスケールで有界化する新しい手法に基づいている。
参考スコア（独自算出の注目度）: 32.152826900874075
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the contextual pricing problem a seller repeatedly obtains products described by an adversarially chosen feature vector in $\mathbb{R}^d$ and only observes the purchasing decisions of a buyer with a fixed but unknown linear valuation over the products. The regret measures the difference between the revenue the seller could have obtained knowing the buyer valuation and what can be obtained by the learning algorithm. We give a poly-time algorithm for contextual pricing with $O(d \log \log T + d \log d)$ regret which matches the $\Omega(d \log \log T)$ lower bound up to the $d \log d$ additive factor. If we replace pricing loss by the symmetric loss, we obtain an algorithm with nearly optimal regret of $O(d \log d)$ matching the $\Omega(d)$ lower bound up to $\log d$. These algorithms are based on a novel technique of bounding the value of the Steiner polynomial of a convex region at various scales. The Steiner polynomial is a degree $d$ polynomial with intrinsic volumes as the coefficients. We also study a generalized version of contextual search where the hidden linear function over the Euclidean space is replaced by a hidden function $f : \mathcal{X} \rightarrow \mathcal{Y}$ in a certain hypothesis class $\mathcal{H}$. We provide a generic algorithm with $O(d^2)$ regret where $d$ is the covering dimension of this class. This leads in particular to a $\tilde{O}(s^2)$ regret algorithm for linear contextual search if the linear function is guaranteed to be $s$-sparse. Finally we also extend our results to the noisy feedback model, where each round our feedback is flipped with a fixed probability $p < 1/2$.
Abstract（参考訳）: 文脈価格問題において、売り手は、逆選択された特徴ベクトルによって記述された商品を$\mathbb{R}^d$で繰り返し取得し、その商品に対して固定だが未知の線形評価を持つ買い手の購入決定を観察する。この後悔は、売り手が購入者の評価を知るために得た収入と、学習アルゴリズムによって得られるものとの差を測定する。 o(d \log \log t + d \log d)$ regret は$\omega(d \log \log t)$ で、$d \log d$ 加法係数まで下限する。価格損失を対称損失で置き換えるならば、$O(d \log d)$ のほぼ最適な後悔を伴うアルゴリズムが、$\Omega(d)$ の低い境界を $\log d$ に一致する。これらのアルゴリズムは、凸領域のシュタイナー多項式の値を様々なスケールで境界付ける新しい手法に基づいている。シュタイナー多項式は内在体積を係数とする次数$d$多項式である。また、ユークリッド空間上の隠れ線型函数は、ある仮説クラス $\mathcal{H}$ において隠れ関数 $f : \mathcal{X} \rightarrow \mathcal{Y}$ に置き換えられるような一般化された文脈探索についても検討する。我々は、このクラスの被覆次元が$d$である場合、$o(d^2)$ regret のジェネリックアルゴリズムを提供する。これにより、線形文脈探索のための$\tilde{o}(s^2)$ regretアルゴリズムは、線形関数が$s$-sparseであることが保証される。最後に、結果はノイズの多いフィードバックモデルにも拡張し、各ラウンドのフィードバックは固定確率$p < 1/2$で反転します。

関連論文リスト

Learning and Computation of $Φ$-Equilibria at the Frontier of Tractability [85.07238533644636]
$Phi$-equilibriaは、オンライン学習とゲーム理論の中心にある、強力で柔軟なフレームワークだ。効率的なオンラインアルゴリズムは、$textpoly(d, k)/epsilon2$ラウンドを使用して、平均$Phi$-regretを最大$epsilon$で生成することを示す。また、オンライン設定において、ほぼ一致した下限を示し、その結果、$Phi$-regretの学習可能性を取得する偏差の族が初めて得られる。
論文参考訳（メタデータ） (2025-02-25T19:08:26Z)
Batched Stochastic Bandit for Nondegenerate Functions [8.015503209312786]
本稿では,非退化関数に対するバッチ帯域学習問題について検討する。本稿では,非退化関数に対するバッチバンドイット問題をほぼ最適に解くアルゴリズムを提案する。
論文参考訳（メタデータ） (2024-05-09T12:50:16Z)
Algorithms for mean-field variational inference via polyhedral optimization in the Wasserstein space [10.292118864147097]
ワッサーシュタイン空間上の有限次元多面体部分集合の理論を開発し、一階法による函数の最適化を行う。我々の主な応用は平均場変動推論の問題であり、これは分布の$pi$ over $mathbbRd$を製品測度$pistar$で近似しようとするものである。
論文参考訳（メタデータ） (2023-12-05T16:02:04Z)
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
モノトン活性化に対する $mathbfxmapstosigma(mathbfwcdotmathbfx)$ の関数について検討する。学習者の目標は仮説ベクトル $mathbfw$ that $F(mathbbw)=C, epsilon$ を高い確率で出力することである。
論文参考訳（メタデータ） (2022-06-17T17:55:43Z)
Logarithmic Regret from Sublinear Hints [76.87432703516942]
自然クエリモデルにより,アルゴリズムが$O(log T)$ regretsを$O(sqrtT)$ hintsで得ることを示す。また、$o(sqrtT)$ hintsは$Omega(sqrtT)$ regretより保証できないことも示しています。
論文参考訳（メタデータ） (2021-11-09T16:50:18Z)
Contextual Recommendations and Low-Regret Cutting-Plane Algorithms [49.91214213074933]
本稿では、ナビゲーションエンジンやレコメンデーションシステムにおけるルーティングアプリケーションによって動機付けられた、コンテキスト線形帯域の次の変種について考察する。我々は、真の点$w*$と分離オラクルが返す超平面の間の全距離を、低い「回帰」を持つ新しい切断平面アルゴリズムを設計する。
論文参考訳（メタデータ） (2021-06-09T05:39:05Z)
Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
我々は,バンディットフィードバックを用いてオンライン学習を学習する。 learnerは、コスト/リワード関数が"pseudo-1d"構造を許可するゼロ次オラクルのみにアクセスできる。我々は、$T$がラウンドの数である任意のアルゴリズムの後悔のために$min(sqrtdT、T3/4)$の下限を示しています。ランダム化オンライングラデーション下降とカーネル化指数重み法を組み合わせた新しいアルゴリズムsbcalgを提案し,疑似-1d構造を効果的に活用する。
論文参考訳（メタデータ） (2021-02-15T08:16:51Z)
Thresholded Lasso Bandit [70.17389393497125]
Thresholded Lasso banditは、報酬関数を定義するベクトルとスパースサポートを推定するアルゴリズムである。一般には $mathcalO( log d + sqrtT )$ や $mathcalO( log d + sqrtT )$ としてスケールする非漸近的後悔の上界を確立する。
論文参考訳（メタデータ） (2020-10-22T19:14:37Z)
$Q$-learning with Logarithmic Regret [60.24952657636464]
楽観的な$Q$は$mathcalOleft(fracSAcdot mathrmpolyleft(Hright)Delta_minlogleft(SATright)right)$ cumulative regret bound, where $S$ is the number of state, $A$ is the number of action, $H$ is the planning horizon, $T$ is the total number of steps, $Delta_min$ is the least sub-Optitimality gap。
論文参考訳（メタデータ） (2020-06-16T13:01:33Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。