Fugu-MT 論文翻訳(概要): Learning Multinomial Logits in $O(n \log n)$ time

論文の概要: Learning Multinomial Logits in $O(n \log n)$ time

arxiv url: http://arxiv.org/abs/2601.04423v1
Date: Wed, 07 Jan 2026 22:07:44 GMT
ステータス: 翻訳完了
システム内更新日: 2026-01-09 17:01:52.939013
Title: Learning Multinomial Logits in $O(n \log n)$ time
Title（参考訳）: O(n \log n)$ timeにおける多項ロジットの学習
Authors: Flavio Chierichetti, Mirko Giacchini, Ravi Kumar, Silvio Lattanzi, Alessandro Panconesi, Erasmo Tani, Andrew Tomkins,
Abstract要約: MNLモデル(Multinomial Logit、MNL)は、アイテム$[n]=1, ..., n$の有限宇宙から成り、それぞれ正の重みを割り当てる。クエリはslateと呼ばれる許容可能なサブセットを指定し、モデルはそのslateからその重みに比例した確率で1つのアイテムを選択する。このクエリモデルは、文学におけるPockett-Luceモデルまたは条件付きサンプリングオラクルとしても知られている。
参考スコア（独自算出の注目度）: 56.23331174813387
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A Multinomial Logit (MNL) model is composed of a finite universe of items $[n]=\{1,..., n\}$, each assigned a positive weight. A query specifies an admissible subset -- called a slate -- and the model chooses one item from that slate with probability proportional to its weight. This query model is also known as the Plackett-Luce model or conditional sampling oracle in the literature. Although MNLs have been studied extensively, a basic computational question remains open: given query access to slates, how efficiently can we learn weights so that, for every slate, the induced choice distribution is within total variation distance $\varepsilon$ of the ground truth? This question is central to MNL learning and has direct implications for modern recommender system interfaces. We provide two algorithms for this task, one with adaptive queries and one with non-adaptive queries. Each algorithm outputs an MNL $M'$ that induces, for each slate $S$, a distribution $M'_S$ on $S$ that is within $\varepsilon$ total variation distance of the true distribution. Our adaptive algorithm makes $O\left(\frac{n}{\varepsilon^{3}}\log n\right)$ queries, while our non-adaptive algorithm makes $O\left(\frac{n^{2}}{\varepsilon^{3}}\log n \log\frac{n}{\varepsilon}\right)$ queries. Both algorithms query only slates of size two and run in time proportional to their query complexity. We complement these upper bounds with lower bounds of $Ω\left(\frac{n}{\varepsilon^{2}}\log n\right)$ for adaptive queries and $Ω\left(\frac{n^{2}}{\varepsilon^{2}}\log n\right)$ for non-adaptive queries, thus proving that our adaptive algorithm is optimal in its dependence on the support size $n$, while the non-adaptive one is tight within a $\log n$ factor.
Abstract（参考訳）: MNLモデル(Multinomial Logit)は、アイテム$[n]=\{1,..., n\}$の有限宇宙から成り、それぞれ正の重みが割り当てられる。クエリはslateと呼ばれる許容可能なサブセットを指定し、モデルはそのslateからその重みに比例した確率で1つのアイテムを選択する。このクエリモデルは、文学におけるPockett-Luceモデルまたは条件付きサンプリングオラクルとしても知られている。 MNLは広く研究されているが、基本的な計算問題はまだ未解決である: スレートへのクエリアクセスが与えられた場合、どのスレートに対して、誘導された選択分布が基底真実の総変分距離$\varepsilon$に収まるように、どの程度効率的にウェイトを学ぶことができるか? この問題はMNL学習の中心であり、現代のレコメンデータシステムインターフェースに直接的な意味を持つ。このタスクには,適応型クエリと非適応型クエリの2つのアルゴリズムを提供する。各アルゴリズムはMNL$M'$を出力し、各スレート$S$に対して、真分布の総変分距離$\varepsilon$内にある$S$上の分布$M'_S$を誘導する。適応アルゴリズムは$O\left(\frac{n}{\varepsilon^{3}}\log n\right)$クエリを作り、非適応アルゴリズムは$O\left(\frac{n^{2}}{\varepsilon^{3}}\log n \log{n}{\varepsilon}\right)$クエリを作ります。どちらのアルゴリズムもサイズ2のスレートのみをクエリし、クエリの複雑さに比例して実行します。我々はこれらの上限を、適応的クエリに対して$Ω\left(\frac{n}{\varepsilon^{2}}\log n\right)$と、非適応的クエリに対して$Ω\left(\frac{n^{2}}{\varepsilon^{2}}\log n\right)$で補う。

関連論文リスト

Learning Partitions with Optimal Query and Round Complexities [16.815943270621638]
未知の$n$要素を少なくとも$k$集合に分割することの基本的な問題を考える。非適応アルゴリズムには$Theta(n2)$クエリが必要ですが、適応アルゴリズムには$Theta(nk)$クエリが必要です。我々のアルゴリズムは、最適な$O(nk)$クエリ複雑性を達成するために、$O(log log n)$ラウンドしか必要としない。
論文参考訳（メタデータ） (2025-05-08T07:27:29Z)
On the query complexity of sampling from non-log-concave distributions [2.4253233571593547]
密度$p(x)propto e-f(x)$を持つ$d$次元分布からサンプリングする問題を、必ずしも良好な等尺条件を満たすとは限らない。広い範囲のパラメータに対して、サンプリングは$d$の超指数係数による最適化よりも厳密に容易であることを示す。
論文参考訳（メタデータ） (2025-02-10T06:54:16Z)
Active Learning of General Halfspaces: Label Queries vs Membership Queries [35.41111301965335]
アクティブな学習者は、$tildeOmega(d/(log(m)epsilon)$というラベルの複雑さを必要とする。クエリ複雑性が$tildeO(min1/p, 1/epsilon + dcdot polylog (1/epsilon))$$O(opt)+epsilon$のエラー保証を実現する。
論文参考訳（メタデータ） (2024-12-31T15:40:48Z)
Provable Scaling Laws for the Test-Time Compute of Large Language Models [84.00141420901038]
本研究では,大規模言語モデルのテスト時間計算において,証明可能なスケーリング法則を享受する2つのアルゴリズムを提案する。 1つは2段階ノックアウト方式のアルゴリズムで、各候補は複数の相手に対して平均勝利率で評価される。もう1つは2段階のリーグ方式のアルゴリズムで、各候補は複数の相手に対して平均勝利率で評価される。
論文参考訳（メタデータ） (2024-11-29T05:29:47Z)
Clustering with Non-adaptive Subset Queries [16.662507957069813]
クエリ $S の部分集合 U$, $|S|=2$ が与えられたとき、オラクルは、ポイントが同じクラスタにあり、そうでなければ、イエスを返す。ペアワイズクエリを用いた適応アルゴリズムでは、必要なクエリの数は$Theta(nk)$であることが知られている。非適応スキームは$Omega(n2)$クエリを必要とするが、これは全ての点を問合せすることで得られる自明な$O(n2)$上限と一致する。
論文参考訳（メタデータ） (2024-09-17T05:56:07Z)
Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise [50.64137465792738]
この問題に対する効率的なSQアルゴリズムは、少なくとも$Omega(d1/2/(maxp, epsilon)2)$. のサンプル複雑性を必要とする。我々の下限は、この1/epsilon$に対する二次的依存は、効率的なアルゴリズムに固有のものであることを示唆している。
論文参考訳（メタデータ） (2023-07-13T18:59:28Z)
Online Lewis Weight Sampling [62.38157566916501]
コーエンとペンはルイスの重量サンプリングを理論計算機科学コミュニティに導入した。この重要なプリミティブを、オンラインコアセット、スライディングウィンドウ、対向ストリーミングモデルなど、他の設定に拡張した作品もいくつかある。オンラインコアセット,スライディングウィンドウ,および逆ストリーミングモデルにおいて,すべての$pin(0,infty)$に対して,ほぼ最適に近い$ell_p$サブスペース埋め込みを設計する。
論文参考訳（メタデータ） (2022-07-17T19:40:51Z)
Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
対数要因まで $(varepsilon,delta)$-differently private の最適過剰人口損失は $sqrtlog(d)/n + sqrtd/varepsilon n.$ です。損失関数がさらなる滑らかさの仮定を満たすとき、余剰損失は$sqrtlog(d)/n + (log(d)/varepsilon n)2/3で上界(対数因子まで)であることが示される。
論文参考訳（メタデータ） (2021-03-02T06:53:44Z)
Streaming Complexity of SVMs [110.63976030971106]
本稿では,ストリーミングモデルにおけるバイアス正規化SVM問題を解く際の空間複雑性について検討する。両方の問題に対して、$frac1lambdaepsilon$の次元に対して、$frac1lambdaepsilon$よりも空間的に小さいストリーミングアルゴリズムを得ることができることを示す。
論文参考訳（メタデータ） (2020-07-07T17:10:00Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。