Fugu-MT Paper Translation (Abstract): Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning

Paper overview: Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning

arxiv url: http://arxiv.org/abs/2605.15411v1
Date: Thu, 14 May 2026 20:53:23 GMT
Status: Translation complete
Last updated in system: 2026-05-18 21:22:26.100189
Title: Harnessing Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning
Title (reference translation): Unimodality in Semiparametric Contextual Pricing via Oracle Price Map Learning
Authors: Yingying Fan, Yuxuan Han, Jinchi Lv, Xiaocong Xu, Zhengyuan Zhou
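Abstract summary: We study contextual dynamic pricing in a semiparametric scalar-index valuation model where the latent value is $v_t=μ_\ast(\mathsf c_t)+ξ_t$. The key decision object is the oracle price map $u\mapsto p^\ast(u)$ induced by the scalar index $u=μ_\ast(\mathsf c)$ and the noise tail. We exploit such structure through $\mathsf{ORBIT}$, a modular coarse-to-fine policy that takes a scalar index as input and localizes a benchmark.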
Reference score (independently computed attention score): 22.257005185551378
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study contextual dynamic pricing in a semiparametric scalar-index valuation model where the latent value is $v_t=μ_\ast(\mathsf c_t)+ξ_t$, with an unknown utility map $μ_\ast$ and an unknown additive noise distribution. The key decision object is the one-dimensional oracle price map $u\mapsto p^\ast(u)$ induced by the scalar index $u=μ_\ast(\mathsf c)$ and the noise tail. Under the $β$-Hölder smoothness of the tail function for $β\geq 2$ and a revenue-geometry condition that gives a unique, stable, interior maximizer, this oracle map is itself $(β-1)$-smooth. We exploit such structure through $\mathsf{ORBIT}$, a modular coarse-to-fine policy that takes a scalar pilot index as input, localizes a benchmark price in each active bin, and learns a local polynomial approximation of the oracle map inside a trust region via bandit convex optimization. For the baseline linear utility model $μ_\ast(\mathsf c)=\mathsf c^\topθ_\ast$, an adaptive elliptical exploration scheme constructs the required scalar pilot online without distributional assumptions on the contexts. The resulting policy achieves regret $\widetilde{O}\big(T^{\frac{2β-1}{4β-3}}+\sqrt{dT}\big)$. For fixed $d$, we establish a matching lower bound in the horizon dependence, unveiling that the nonparametric oracle-map learning term is minimax sharp. The same scalar-pilot interface also yields extensions to sparse high-dimensional linear utility and nonparametric Hölder utility.
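As a rough illustration of the oracle price map described in the abstract, the sketch below computes $p^\ast(u)$ numerically for one concrete case. It is not the $\mathsf{ORBIT}$ policy and is not taken from the paper. It assumes the standard posted-price revenue $p\cdot S(p-u)$, where $S$ is the noise tail, and it assumes standard logistic noise purely for illustration (the paper treats the noise distribution as unknown). The function names `tail`, `expected_revenue`, `oracle_price`, and `regret_exponent` are hypothetical. The last helper simply evaluates the horizon exponent $\frac{2β-1}{4β-3}$ from the stated regret bound.

```python
import math


def tail(x):
    """Assumed noise tail S(x) = P(xi > x) for standard logistic noise."""
    return 1.0 / (1.0 + math.exp(x))


def expected_revenue(p, u):
    """Expected revenue of posting price p when the scalar index equals u."""
    return p * tail(p - u)


def oracle_price(u, lo=0.0, hi=50.0, tol=1e-9):
    """Maximize p * S(p - u) over [lo, hi] by golden-section search.

    Relies on the revenue curve being unimodal in p, which holds for the
    logistic tail assumed here.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if expected_revenue(c, u) > expected_revenue(d, u):
            b = d  # the maximizer lies in [a, d]
        else:
            a = c  # the maximizer lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


def regret_exponent(beta):
    """Horizon exponent (2*beta - 1) / (4*beta - 3) from the regret bound."""
    return (2.0 * beta - 1.0) / (4.0 * beta - 3.0)


if __name__ == "__main__":
    for u in (0.0, 1.0, 2.0):
        print(u, oracle_price(u))
    for beta in (2, 3, 10):
        print(beta, regret_exponent(beta))
```

Under these assumptions the map $u\mapsto p^\ast(u)$ is a smooth one-dimensional curve, which is the object the abstract proposes to learn. The exponent equals $3/5$ at $β=2$ and decreases toward $1/2$ as $β$ grows.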
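Abstract (reference translation): We study contextual dynamic pricing in a semiparametric scalar-index valuation model in which the latent value is $v_t=μ_\ast(\mathsf c_t)+ξ_t$, with an unknown utility map $μ_\ast$ and an unknown additive noise distribution. The key decision object is the one-dimensional oracle price map $u\mapsto p^\ast(u)$, induced by the scalar index $u=μ_\ast(\mathsf c)$ and the noise tail. Under $β$-Hölder smoothness of the tail function and a revenue-geometry condition that gives a unique, stable, interior maximizer, this oracle map is itself $(β-1)$-smooth. We exploit this structure with $\mathsf{ORBIT}$, which takes a scalar pilot index as input, localizes a benchmark price in each active bin, and learns a local polynomial approximation of the oracle map within a trust region via bandit convex optimization. For the baseline linear utility model $μ_\ast(\mathsf c)=\mathsf c^\topθ_\ast$, an adaptive elliptical exploration scheme constructs the required scalar pilot online without distributional assumptions on the contexts. The resulting policy has regret $\widetilde{O}\big(T^{\frac{2β-1}{4β-3}}+\sqrt{dT}\big)$. For fixed $d$, we establish a matching lower bound in the horizon dependence, showing that the nonparametric oracle-map learning term is minimax sharp. The same scalar-pilot interface also yields extensions to sparse high-dimensional linear utility and nonparametric Hölder utility.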

Related papers