Fugu-MT Paper Translation (Abstract): SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search

Paper overview: SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search

arxiv url: http://arxiv.org/abs/2604.10471v1
Date: Sun, 12 Apr 2026 05:51:35 GMT
Status: Translation complete
Last updated in system: 2026-04-14 20:13:16.037009
Title: SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search
Authors: Guowen Li, Yuepeng Zhang, Shunyu Zhang, Yi Zhang, Xiaoze Jiang, Yi Wang, Jingwei Zhuo
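Abstract summary: SID-Coord is a lightweight Semantic ID framework that incorporates discrete, trainable semantic IDs directly into ID-based ranking models. Instead of treating semantic signals as auxiliary dense features, SID-Coord represents semantics as structured identifiers. Online A/B experiments in a real-world production environment show statistically significant improvements.
Reference score (attention score computed independently by this site): 9.72713305999446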
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large-scale short-video search ranking models are typically trained on sparse co-occurrence signals over hashed item identifiers (HIDs). While effective at memorizing frequent interactions, such ID-based models struggle to generalize to long-tailed items with limited exposure. This memorization-generalization trade-off remains a longstanding challenge in such industrial systems. We propose SID-Coord, a lightweight Semantic ID framework that incorporates discrete, trainable semantic IDs (SIDs) directly into ID-based ranking models. Instead of treating semantic signals as auxiliary dense features, SID-Coord represents semantics as structured identifiers and coordinates HID-based memorization with SID-based generalization within a unified modeling framework. To enable effective coordination, SID-Coord introduces three components: (1) an attention-based fusion module over hierarchical SIDs to capture multi-level semantics, (2) a target-aware HID-SID gating mechanism that adaptively balances memorization and generalization, and (3) a SID-driven interest alignment module that models the semantic similarity distribution between target items and user histories. SID-Coord can be integrated into existing production ranking systems without modifying the backbone model. Online A/B experiments in a real-world production environment show statistically significant improvements, with a +0.664% gain in long-play rate in search and a +0.369% increase in search playback duration.
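The three components named in the abstract can be illustrated with a minimal sketch. The abstract gives no equations, so everything below is an assumption made for illustration only: the function names, the embedding width `DIM`, the scaled dot-product attention, the scalar sigmoid gate, and the cosine-similarity histogram are hypothetical stand-ins, not the authors' implementation.

```python
import math
import random

DIM = 8  # embedding width, chosen only for this sketch


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_sid_levels(level_embs, query):
    # (1) Attention over the per-level SID embeddings, queried by a target vector.
    scale = math.sqrt(len(query))
    weights = softmax([dot(e, query) / scale for e in level_embs])
    return [sum(w * e[i] for w, e in zip(weights, level_embs))
            for i in range(len(query))]


def gate_hid_sid(hid_emb, sid_emb, gate_w, gate_b):
    # (2) Scalar sigmoid gate computed from the target item's own embeddings.
    z = dot(gate_w, hid_emb + sid_emb) + gate_b  # list concatenation, length 2 * DIM
    g = 1.0 / (1.0 + math.exp(-z))
    mixed = [g * h + (1.0 - g) * s for h, s in zip(hid_emb, sid_emb)]
    return g, mixed


def similarity_histogram(target, history, num_bins=4):
    # (3) Normalized histogram of cosine similarities, target vs. history items.
    counts = [0] * num_bins
    t_norm = math.sqrt(dot(target, target)) + 1e-12
    for item in history:
        cos = dot(target, item) / (t_norm * (math.sqrt(dot(item, item)) + 1e-12))
        cos = max(-1.0, min(1.0, cos))  # guard against floating-point drift
        idx = min(int((cos + 1.0) / 2.0 * num_bins), num_bins - 1)
        counts[idx] += 1
    return [c / len(history) for c in counts]


def rand_vec(scale=1.0):
    return [random.gauss(0.0, scale) for _ in range(DIM)]


random.seed(0)
hid_emb = rand_vec()                         # hashed-ID embedding of the target item
sid_levels = [rand_vec() for _ in range(3)]  # one embedding per SID hierarchy level
sid_emb = fuse_sid_levels(sid_levels, query=hid_emb)
gate_w = rand_vec(0.1) + rand_vec(0.1)       # hypothetical gate weights, length 2 * DIM
gate, item_emb = gate_hid_sid(hid_emb, sid_emb, gate_w, gate_b=0.0)
hist = similarity_histogram(sid_emb, [rand_vec() for _ in range(20)])
print(round(gate, 3), [round(h, 2) for h in hist])
```

In this sketch a gate value near 1 leans on the HID embedding (memorization) and a value near 0 leans on the fused SID embedding (generalization). The histogram is one simple way to summarize a similarity distribution as a fixed-length feature, and a production model would presumably learn all of these parameters jointly with the ranking backbone.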
