Fugu-MT Paper Translation (Abstract): On the Practice of Scaling Search Conversion Rate Prediction

Paper abstract: On the Practice of Scaling Search Conversion Rate Prediction

arxiv url: http://arxiv.org/abs/2605.29232v1
Date: Thu, 28 May 2026 01:48:19 GMT
Status: Translation complete
Last updated in system: 2026-05-30 00:00:30.938314
Title: On the Practice of Scaling Search Conversion Rate Prediction
Authors: James Pak, Jyun-Yu Jiang, Fan Zhang, Sen Wang, Taekmin Kim, Henry Tsai, Vijay Rajaram, Juexin Lin, Mohitdeep Singh, Alessandro Magnani, Johnny Chen, Qian Zhao, Rao Fu, Zhirong Liang, Jordan Gilliland, Winter Jiao
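Abstract summary: Scaling search conversion rate (CVR) prediction models is challenging, particularly in high-traffic environments. This paper describes an effective approach for scaling modern search CVR prediction models.
Reference score (independently computed attention score): 45.350775275846125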
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Scaling a Search Conversion Rate (CVR) prediction model, especially in high-traffic environments, presents a challenge: superior model quality needs to be balanced with strict constraints on training cost and serving latency. This paper details an effective approach for scaling modern search CVR prediction models. We begin with an empirical study to understand the scaling performance of search CVR models, analyzing how quality improves as we scale three key factors of model backbone computation, the size of embedding parameters, and the volume of training data. We use a large-scale production dataset, comprising over a year of customer interaction logs from a high-traffic e-commerce platform, to evaluate the scalability of several state-of-the-art architectures and their ensembles. Our key findings are: (1) selecting the right backbone and scaling factors is crucial; (2) the impact of scaling backbone, embedding, and data is largely independent and additive, which has implications for more efficient scaling exploration; (3) a streamlined warmstart strategy can accelerate training iterations while simplifying new updates; (4) inference optimization strategies such as decoupled graph execution and dynamic batching can enable low-latency GPU serving even for high-capacity models. Compared to a baseline of a pre-scaling production model, we ultimately deployed a model trained on 2.5x larger training data with 8x more inference compute while having minimal latency impact. Online A/B tests also demonstrate that our launches achieved a combined +2.6% gain in a key metric of search conversion rate.

