Fugu-MT Paper Translation (Abstract): Efficient Retrieval Scaling with Hierarchical Indexing for Large Scale Recommendation

Paper summary: Efficient Retrieval Scaling with Hierarchical Indexing for Large Scale Recommendation

arxiv url: http://arxiv.org/abs/2604.12965v1
Date: Tue, 14 Apr 2026 16:59:03 GMT
Status: Translation complete
Last updated in system: 2026-04-15 19:11:32.572445
Title: Efficient Retrieval Scaling with Hierarchical Indexing for Large Scale Recommendation
Authors: Dongqi Fu, Kaushik Rangadurai, Haiyu Lu, Yunchen Pu, Siyang Yuan, Minhui Huang, Yiqun Liu, Golnaz Ghasemiesfeh, Xingfeng He, Fangzhou Xu, Andrew Cui, Vidhoon Viswanathan, Lin Yang, Liang Wang, Jiyan Yang, Chonglin Sun
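Abstract summary: This paper explores whether a hierarchical organization can be learned over the memory of foundational retrieval models. It proposes jointly learning a hierarchical index using cross-attention and residual quantization for large-scale retrieval models. It also presents a real-world deployment at Meta, supporting daily advertisement recommendations for billions of Facebook and Instagram users.
Reference score (independently calculated attention metric): 18.266884490896228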
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The increase in data volume, computational resources, and model parameters during training has led to the development of numerous large-scale industrial retrieval models for recommendation tasks. However, effectively and efficiently deploying these large-scale foundational retrieval models remains a critical challenge that has not been fully addressed. Common quick-win solutions for deploying these massive models include relying on offline computations (such as cached user dictionaries) or distilling large models into smaller ones. Yet, both approaches fall short of fully leveraging the representational and inference capabilities of foundational models. In this paper, we explore whether it is possible to learn a hierarchical organization over the memory of foundational retrieval models. Such a hierarchical structure would enable more efficient search by reducing retrieval costs while preserving exactness. To achieve this, we propose jointly learning a hierarchical index using cross-attention and residual quantization for large-scale retrieval models. We also present its real-world deployment at Meta, supporting daily advertisement recommendations for billions of Facebook and Instagram users. Interestingly, we discovered that the intermediate nodes in the learned index correspond to a small set of high-quality data. Fine-tuning the model on this set further improves inference performance and concretizes the concept of "test-time training" within the recommendation system domain. We demonstrate these findings using both internal and public datasets with strong baseline comparisons and hope they contribute to the community's efforts in developing the next generation of foundational retrieval models.
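The abstract names residual quantization as one ingredient of the hierarchical index. The following is a minimal sketch of that general technique only, not the authors' method: the codebooks are fixed toy values rather than learned, the cross-attention component and joint training are not shown, and all function names (`residual_quantize`, `reconstruct`, `build_index`) are hypothetical. It illustrates how each level quantizes the residual left by the previous level, so that the sequence of codes reads as a path from a coarse node to a finer one.

```python
from collections import defaultdict
from math import dist


def residual_quantize(vector, codebooks):
    """Return one code per level plus the final residual.

    Each level picks the nearest codeword to the current residual,
    then subtracts it, so later levels refine earlier ones.
    """
    residual = list(vector)
    codes = []
    for codebook in codebooks:
        # Index of the codeword closest to the current residual.
        idx = min(range(len(codebook)), key=lambda i: dist(codebook[i], residual))
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes, residual


def reconstruct(codes, codebooks):
    """Approximate the original vector by summing the chosen codewords."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def build_index(items, codebooks):
    """Group item ids by their full code path (assumed toy index layout)."""
    index = defaultdict(list)
    for item_id, vec in items.items():
        codes, _ = residual_quantize(vec, codebooks)
        index[tuple(codes)].append(item_id)
    return index


# Toy two-level codebooks (illustrative values, not learned).
codebooks = [
    [[0.0, 0.0], [10.0, 10.0]],            # level 1: coarse
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],  # level 2: refinement
]
items = {"a": [10.0, 11.0], "b": [0.2, 0.9], "c": [9.9, 11.2]}
index = build_index(items, codebooks)
print(dict(index))
```

In this toy setup, items whose code paths share a first code sit under the same coarse node, which is the sense in which the codes form a hierarchy; a search could visit only the nodes closest to the query instead of scanning every item.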

Related papers