Fugu-MT Paper Translation (Abstract): Self-Supervised Representation Learning with ID-Content Modality Alignment for Sequential Recommendation

Paper overview: Self-Supervised Representation Learning with ID-Content Modality Alignment for Sequential Recommendation

arxiv url: http://arxiv.org/abs/2510.10556v1
Date: Sun, 12 Oct 2025 11:42:49 GMT
Status: Translation complete
Last updated in system: 2025-10-14 18:06:30.006078
Title: Self-Supervised Representation Learning with ID-Content Modality Alignment for Sequential Recommendation
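Title (reference translation): Self-Supervised Representation Learning for Sequential Recommendation Using ID-Content Modality Alignment
Authors: Donglin Zhou, Weike Pan, Zhong Ming
Abstract summary: We propose SICSRec, a self-supervised representation learning model with ID-Content modality alignment. We design a Transformer-based sequential model in which an ID-modality sequence encoder captures user behavior preferences, a content-modality sequence encoder learns user content preferences, and a mix-modality sequence decoder grasps the intrinsic relationship between these two types of preferences.
Reference score (independently computed attention score): 12.356634584441254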
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sequential recommendation (SR) models often capture user preferences based on the historically interacted item IDs, which usually obtain sub-optimal performance when the interaction history is limited. Content-based sequential recommendation has recently emerged as a promising direction that exploits items' textual and visual features to enhance preference learning. However, there are still three key challenges: (i) how to reduce the semantic gap between different content modality representations; (ii) how to jointly model user behavior preferences and content preferences; and (iii) how to design an effective training strategy to align ID representations and content representations. To address these challenges, we propose a novel model, self-supervised representation learning with ID-Content modality alignment, named SICSRec. Firstly, we propose an LLM-driven sample construction method and develop a supervised fine-tuning approach to align item-level modality representations. Secondly, we design a novel Transformer-based sequential model, where an ID-modality sequence encoder captures user behavior preferences, a content-modality sequence encoder learns user content preferences, and a mix-modality sequence decoder grasps the intrinsic relationship between these two types of preferences. Thirdly, we propose a two-step training strategy with a content-aware contrastive learning task to align modality representations and ID representations, which decouples the training process of content modality dependency and item collaborative dependency. Extensive experiments conducted on four public video streaming datasets demonstrate that SICSRec outperforms the state-of-the-art ID-modality sequential recommenders and content-modality sequential recommenders by 8.04% on NDCG@5 and 6.62% on NDCG@10 on average, respectively.
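The abstract mentions a contrastive learning task that aligns ID representations with content representations but gives no formula. As a rough illustration only, the sketch below uses a generic InfoNCE-style loss with in-batch negatives, a common choice for this kind of alignment; it is an assumption, not the SICSRec objective. The names `l2_normalize`, `info_nce`, and the `temperature` value are hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def info_nce(id_embs, content_embs, temperature=0.1):
    """Mean InfoNCE loss; row i of id_embs is the positive pair of row i of content_embs.

    Generic illustration (assumed form), not the loss defined in the paper.
    """
    ids = [l2_normalize(v) for v in id_embs]
    contents = [l2_normalize(v) for v in content_embs]
    total = 0.0
    for i, anchor in enumerate(ids):
        # Cosine similarity of this ID embedding to every content embedding.
        logits = [sum(a * c for a, c in zip(anchor, cvec)) / temperature
                  for cvec in contents]
        # Numerically stable log-sum-exp over the positive and in-batch negatives.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(ids)

# Toy item embeddings: the loss is small when each ID vector matches its own content vector.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
misaligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < misaligned)
```

In this toy setup, minimizing the loss pulls each item's ID embedding toward its own content embedding and pushes it away from the content embeddings of the other items in the batch.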
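Abstract (reference translation): Sequential recommendation (SR) models often capture user preferences based on historically interacted item IDs. Content-based sequential recommendation has recently emerged as a promising direction that exploits items' textual and visual features to enhance preference learning. However, three major challenges remain: (i) how to reduce the semantic gap between different content modality representations; (ii) how to jointly model user behavior preferences and content preferences; and (iii) how to design an effective training strategy that aligns ID representations and content representations. To address these challenges, we propose SICSRec, a self-supervised representation learning model with ID-Content modality alignment. First, we propose an LLM-based sample construction method and develop a supervised fine-tuning approach to align item-level modality representations. Second, we design a Transformer-based sequential model in which an ID-modality sequence encoder captures user behavior preferences, a content-modality sequence encoder learns user content preferences, and a mix-modality sequence decoder grasps the intrinsic relationship between these two types of preferences. Third, we propose a two-step training strategy with a content-aware contrastive learning task, which decouples the training of content modality dependency from that of item collaborative dependency. Extensive experiments on four public video streaming datasets show that SICSRec outperforms state-of-the-art ID-modality sequential recommenders and content-modality sequential recommenders by 8.04% on NDCG@5 and 6.62% on NDCG@10 on average, respectively.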
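The reported gains are measured in NDCG@5 and NDCG@10. For readers unfamiliar with the metric, here is a minimal sketch of the standard binary-relevance NDCG@k; the paper's exact evaluation protocol is not stated in the abstract, so this is an assumed, generic definition. The function name `ndcg_at_k` and the toy item IDs are hypothetical.

```python
import math

def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG@k for one user (generic definition, assumed here)."""
    relevant = set(relevant_items)
    # DCG: each relevant item in the top-k contributes 1 / log2(position + 1),
    # with positions counted from 1 (hence pos + 2 for a 0-based index).
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked_items[:k]) if item in relevant)
    # Ideal DCG: all relevant items placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Next-item setting with a single ground-truth item "a".
top_hit = ndcg_at_k(["a", "b", "c"], ["a"], k=5)      # target ranked first
second_hit = ndcg_at_k(["b", "a", "c"], ["a"], k=5)   # target ranked second
miss = ndcg_at_k(["b", "c", "a"], ["a"], k=2)         # target outside the top-2
```

With one ground-truth item per user, as is common in next-item evaluation, the score reduces to 1 / log2(rank + 1) when the target falls within the top k and 0 otherwise.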

List of related papers