Fugu-MT Paper Translation (Abstract): Dual-Rerank: Fusing Causality and Utility for Industrial Generative Reranking

Paper overview: Dual-Rerank: Fusing Causality and Utility for Industrial Generative Reranking

arxiv url: http://arxiv.org/abs/2604.07420v1
Date: Wed, 08 Apr 2026 14:54:10 GMT
Status: Translation complete
Last updated in system: 2026-04-10 18:34:05.466829
Title: Dual-Rerank: Fusing Causality and Utility for Industrial Generative Reranking
Title (reference translation): Dual-Rerank: Causality and Utility in Industrial Generative Reranking
Authors: Chao Zhang, Shuai Lin, ChengLei Dai, Ye Qian, Fan Mingyang, Yi Zhang, Yi Wang, Jingwei Zhuo
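Abstract summary: Kuaishou serves over 400 million daily active users and processes hundreds of millions of search queries every day. As the final decision layer, the reranking stage determines user experience by optimizing whole-page utility. This paper proposes Dual-Rerank, a unified framework designed for industrial reranking.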
Reference score (independently computed attention score): 11.52944506792799
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Kuaishou serves over 400 million daily active users, processing hundreds of millions of search queries daily against a repository of tens of billions of short videos. As the final decision layer, the reranking stage determines user experience by optimizing whole-page utility. While traditional score-and-sort methods fail to capture combinatorial dependencies, Generative Reranking offers a superior paradigm by directly modeling the permutation probability. However, deploying Generative Reranking in such a high-stakes environment faces a fundamental dual dilemma: 1) the structural trade-off where Autoregressive (AR) models offer superior Sequential modeling but suffer from prohibitive latency, versus Non-Autoregressive (NAR) models that enable efficiency but lack dependency capturing; 2) the optimization gap where Supervised Learning faces challenges in directly optimizing whole-page utility, while Reinforcement Learning (RL) struggles with instability in high-throughput data streams. To resolve this, we propose Dual-Rerank, a unified framework designed for industrial reranking that bridges the structural gap via Sequential Knowledge Distillation and addresses the optimization gap using List-wise Decoupled Reranking Optimization (LDRO) for stable online RL. Extensive A/B testing on production traffic demonstrates that Dual-Rerank achieves State-of-the-Art performance, significantly improving User satisfaction and Watch Time while drastically reducing inference latency compared to AR baselines.
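Abstract (reference translation): Kuaishou serves over 400 million daily active users and processes hundreds of millions of search queries every day. As the final decision layer, the reranking stage determines user experience by optimizing whole-page utility. Traditional score-and-sort methods fail to capture combinatorial dependencies, whereas Generative Reranking offers a superior paradigm by directly modeling the permutation probability. However, deploying Generative Reranking in such a high-stakes environment faces a fundamental dual dilemma: 1) a structural trade-off, in which Autoregressive (AR) models offer superior sequential modeling but suffer from prohibitive latency, while Non-Autoregressive (NAR) models are efficient but lack dependency capturing; 2) an optimization gap, in which Supervised Learning has difficulty directly optimizing whole-page utility, while Reinforcement Learning (RL) suffers from instability on high-throughput data streams. To resolve this, we propose Dual-Rerank, a unified framework designed for industrial reranking that bridges the structural gap via Sequential Knowledge Distillation and addresses the optimization gap using List-wise Decoupled Reranking Optimization (LDRO) for stable online RL. Extensive A/B testing on production traffic shows that Dual-Rerank achieves state-of-the-art performance, significantly improving user satisfaction and watch time while greatly reducing inference latency compared with AR baselines.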