Fugu-MT Paper Translation (Abstract): Drift-Aware Continual Tokenization for Generative Recommendation

Paper abstract: Drift-Aware Continual Tokenization for Generative Recommendation

arxiv url: http://arxiv.org/abs/2603.29705v1
Date: Tue, 31 Mar 2026 13:02:47 GMT
Status: Translation complete
System update date: 2026-04-01 15:25:03.671638
Title: Drift-Aware Continual Tokenization for Generative Recommendation
Title (reference translation): Drift-Aware Continual Tokenization for Generative Recommendation
Authors: Yuebo Feng, Jiahao Liu, Mingzhe Han, Dongsheng Li, Hansu Gu, Peng Zhang, Tun Lu, Ning Gu
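Abstract summary: Generative recommendation typically adopts a two-stage pipeline in which a learnable tokenizer maps items to discrete token sequences. Recent tokenizers also incorporate collaborative signals so that items with similar user-behavior patterns receive similar codes. New items cause identifier collisions and shifts, and new interactions induce collaborative drift in existing items. DACT is a Drift-Aware Continual Tokenization framework with two stages: tokenizer fine-tuning guided by item-level drift confidence, and hierarchical code reassignment.
Reference score (attention score computed independently by the site): 44.13013639783267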
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative recommendation commonly adopts a two-stage pipeline in which a learnable tokenizer maps items to discrete token sequences (i.e. identifiers) and an autoregressive generative recommender model (GRM) performs prediction based on these identifiers. Recent tokenizers further incorporate collaborative signals so that items with similar user-behavior patterns receive similar codes, substantially improving recommendation quality. However, real-world environments evolve continuously: new items cause identifier collision and shifts, while new interactions induce collaborative drift in existing items (e.g., changing co-occurrence patterns and popularity). Fully retraining both tokenizer and GRM is often prohibitively expensive, yet naively fine-tuning the tokenizer can alter token sequences for the majority of existing items, undermining the GRM's learned token-embedding alignment. To balance plasticity and stability for collaborative tokenizers, we propose DACT, a Drift-Aware Continual Tokenization framework with two stages: (i) tokenizer fine-tuning, augmented with a jointly trained Collaborative Drift Identification Module (CDIM) that outputs item-level drift confidence and enables differentiated optimization for drifting and stationary items; and (ii) hierarchical code reassignment using a relaxed-to-strict strategy to update token sequences while limiting unnecessary changes. Experiments on three real-world datasets with two representative GRMs show that DACT consistently achieves better performance than baselines, demonstrating effective adaptation to collaborative evolution with reduced disruption to prior knowledge. Our implementation is publicly available at https://github.com/HomesAmaranta/DACT for reproducibility.
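Abstract (reference translation): Generative recommendation commonly adopts a two-stage pipeline in which a learnable tokenizer maps items to discrete token sequences (identifiers) and an autoregressive generative recommender model (GRM) makes predictions based on these identifiers. Recent tokenizers also incorporate collaborative signals so that items with similar user-behavior patterns receive similar codes, which substantially improves recommendation quality. New items cause identifier collisions and shifts, while new interactions induce collaborative drift in existing items (for example, changes in co-occurrence patterns and popularity). Fully retraining both the tokenizer and the GRM is often prohibitively expensive, yet naively fine-tuning the tokenizer can change the token sequences of most existing items and undermine the GRM's learned token-embedding alignment. We propose DACT (Drift-Aware Continual Tokenization framework), which has two stages: (i) tokenizer fine-tuning, in which a jointly trained Collaborative Drift Identification Module (CDIM) outputs item-level drift confidence and enables differentiated optimization for drifting and stationary items; and (ii) hierarchical code reassignment using a relaxed-to-strict strategy to update token sequences while limiting unnecessary changes. Experiments on three real-world datasets with two GRMs show that DACT consistently outperforms the baselines, demonstrating effective adaptation to collaborative evolution with reduced disruption to prior knowledge. Our implementation is publicly available at https://github.com/HomesAmaranta/DACT for reproducibility.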
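
The two problems the abstract names, identifier collisions and token sequences that change after a tokenizer update, can be made concrete with a small sketch. This is not the DACT implementation. It assumes a residual-quantization tokenizer (a common choice for item identifiers, but not stated in the abstract), random item vectors, and a random perturbation standing in for naive fine-tuning. All function names here are hypothetical.

```python
import random
from collections import defaultdict


def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], vec)),
    )


def tokenize(vec, codebooks):
    """Assumed residual quantization: one token per level, each level
    quantizing what the previous level left unexplained."""
    tokens, residual = [], list(vec)
    for codebook in codebooks:
        k = nearest(codebook, residual)
        tokens.append(k)
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return tuple(tokens)


def collisions(ids):
    """Token sequences shared by more than one item."""
    groups = defaultdict(list)
    for item, tokens in ids.items():
        groups[tokens].append(item)
    return {t: members for t, members in groups.items() if len(members) > 1}


def changed_fraction(old_ids, new_ids):
    """Share of previously tokenized items whose token sequence changed."""
    common = [i for i in old_ids if i in new_ids]
    if not common:
        return 0.0
    return sum(old_ids[i] != new_ids[i] for i in common) / len(common)


# Toy data: every value below is illustrative, not taken from the paper.
rng = random.Random(0)
dim, levels, size = 4, 2, 4
codebooks = [
    [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(size)]
    for _ in range(levels)
]
items = {f"item{i}": [rng.gauss(0, 1) for _ in range(dim)] for i in range(50)}
old_ids = {name: tokenize(vec, codebooks) for name, vec in items.items()}

# Stand-in for naive fine-tuning: perturb every codebook entry.
updated = [
    [[c + rng.gauss(0, 0.3) for c in entry] for entry in cb] for cb in codebooks
]
new_ids = {name: tokenize(vec, updated) for name, vec in items.items()}

print(len(collisions(old_ids)), "token sequences are shared by several items")
print(f"{changed_fraction(old_ids, new_ids):.0%} of items changed identifiers")
```

With 50 items and only 16 possible two-token sequences, some collisions are unavoidable in this toy setup. The second figure is the quantity a continual tokenizer would aim to keep low for items that have not drifted.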
