Fugu-MT Paper Translation (Abstract): Large Scale Retrieval for the LinkedIn Feed using Causal Language Models

Paper overview: Large Scale Retrieval for the LinkedIn Feed using Causal Language Models

arxiv url: http://arxiv.org/abs/2510.14223v1
Date: Thu, 16 Oct 2025 02:01:33 GMT
Status: Translation complete
Last updated in system: 2025-10-17 21:15:14.677175
Title: Large Scale Retrieval for the LinkedIn Feed using Causal Language Models
Authors: Sudarshan Srinivasa Ramanujam, Antonio Alonso, Saurabh Kataria, Siddharth Dangi, Akhilesh Gupta, Birjodh Singh Tiwana, Manas Somaiya, Luke Simon, David Byrne, Sojeong Ha, Sen Zhou, Andrei Akterskii, Zhanglong Liu, Samira Sriram, Crescent Xiong, Zhoutao Pei, Angela Shao, Alex Li, Annie Xiao, Caitlin Kolb, Thomas Kistler, Zach Moore, Hamed Firooz
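Abstract summary: This paper proposes a novel retrieval approach that fine-tunes a large causal language model as a dual encoder to generate high-quality embeddings for both users (members) and content (items). It describes the end-to-end pipeline, including prompt design for embedding generation, techniques for fine-tuning at LinkedIn's scale, and infrastructure for low-latency, cost-effective online serving.
Reference score (attention score computed independently by the site): 4.742257164636025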
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In large-scale recommendation systems like the LinkedIn Feed, the retrieval stage is critical for narrowing hundreds of millions of potential candidates to a manageable subset for ranking. LinkedIn's Feed serves suggested content from outside the member's network (based on the member's topical interests), where 2000 candidates are retrieved from a pool of hundreds of millions of candidates with a latency budget of a few milliseconds and an inbound load of several thousand queries per second (QPS). This paper presents a novel retrieval approach that fine-tunes a large causal language model (Meta's LLaMA 3) as a dual encoder to generate high-quality embeddings for both users (members) and content (items), using only textual input. We describe the end-to-end pipeline, including prompt design for embedding generation, techniques for fine-tuning at LinkedIn's scale, and infrastructure for low-latency, cost-effective online serving. We share our findings on how quantizing numerical features in the prompt enables the information to be properly encoded in the embedding, facilitating greater alignment between the retrieval and ranking layers. The system was evaluated using offline metrics and an online A/B test, which showed substantial improvements in member engagement. We observed significant gains among newer members, who often lack strong network connections, indicating that high-quality suggested content aids retention. This work demonstrates how generative language models can be effectively adapted for real-time, high-throughput retrieval in industrial applications.
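Illustrative sketch (not from the paper): the abstract mentions two ideas that a small example can make concrete, namely quantizing numerical features before placing them in the text prompt, and retrieving the top candidates by comparing a member embedding against item embeddings. The bucket edges, the prompt wording, the function names, and the toy two-dimensional vectors below are all assumptions made for illustration; the abstract does not specify the binning scheme, the prompt template, or the similarity function, and the real system would obtain embeddings from the fine-tuned language model rather than from hand-written lists.

```python
import bisect
import heapq
import math

# Hypothetical bucket edges; the abstract does not state how features are binned.
BUCKET_EDGES = [10, 100, 1_000, 10_000]


def quantize(value, edges=BUCKET_EDGES):
    """Map a raw numeric feature to a small integer bucket id in 0..len(edges)."""
    return bisect.bisect_right(edges, value)


def item_prompt(text, views):
    """Render an item as text, replacing the raw count with a bucket label.

    The template is an assumption, not the paper's prompt.
    """
    return f"Post: {text}\nPopularity bucket: {quantize(views)} of {len(BUCKET_EDGES)}"


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(member_vec, item_vecs, k):
    """Return the ids of the k items most similar to the member embedding."""
    return heapq.nlargest(k, item_vecs, key=lambda i: cosine(member_vec, item_vecs[i]))


if __name__ == "__main__":
    print(item_prompt("Tips for distributed training", 2_500))
    # Toy embeddings standing in for the outputs of the dual encoder.
    items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
    print(top_k([1.0, 0.0], items, k=2))
```

This brute-force scan is only meant to show the shape of the computation. At the scale described in the abstract (hundreds of millions of candidates within a few milliseconds), a production system would presumably rely on an approximate nearest-neighbor index or similar infrastructure, which the abstract does not detail.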

Related papers