Fugu-MT Paper Translation (Abstract): Diffusion-based Multi-modal Synergy Interest Network for Click-through Rate Prediction

Paper abstract: Diffusion-based Multi-modal Synergy Interest Network for Click-through Rate Prediction

arxiv url: http://arxiv.org/abs/2508.21460v1
Date: Fri, 29 Aug 2025 09:46:16 GMT
Status: Translation complete
Last updated in system: 2025-09-01 19:45:10.996268
Title: Diffusion-based Multi-modal Synergy Interest Network for Click-through Rate Prediction
Title (reference translation): Diffusion-based multi-modal synergy interest network for click-through rate prediction
Authors: Xiaoxi Cui, Weihai Lu, Yu Tong, Yiheng Li, Zhejun Zhao
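Abstract summary: Click-through rate (CTR) prediction is used to model users' interests. Most existing CTR prediction methods are based mainly on the ID modality. This paper proposes the Diffusion-based Multi-modal Synergy Interest Network (Diff-MSIN) as a framework for click-through prediction.
Reference score (independently computed attention score): 10.958001571669415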
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Click-through rate (CTR) prediction is used to model users' interests. However, most existing CTR prediction methods are based mainly on the ID modality. As a result, they cannot comprehensively model users' multi-modal preferences, so multi-modal CTR prediction is needed. Although it seems appealing to apply existing multi-modal fusion methods directly to CTR prediction models, these methods (1) fail to effectively disentangle commonalities and specificities across different modalities, and (2) fail to consider the synergistic effects between modalities and to model the complex interactions between them. To address these issues, this paper proposes the Diffusion-based Multi-modal Synergy Interest Network (Diff-MSIN) framework for click-through prediction. The framework introduces three modules: the Multi-modal Feature Enhancement (MFE) Module, the Synergistic Relationship Capture (SRC) Module, and the Feature Dynamic Adaptive Fusion (FDAF) Module. The MFE Module and SRC Module extract synergistic, common, and specific information among different modalities. They enhance the representation of the modalities and thereby improve the overall quality of the fusion. To encourage distinctiveness among different features, we design a Knowledge Decoupling method. Additionally, the FDAF Module focuses on capturing user preferences and reducing fusion noise. To validate the effectiveness of the Diff-MSIN framework, we conducted extensive experiments using the Rec-Tmall dataset and three Amazon datasets. The results demonstrate that our approach yields a significant improvement of at least 1.67% compared to the baseline, highlighting its potential for enhancing multi-modal recommendation systems. Our code is available at the following link: https://github.com/Cxx-0/Diff-MSIN.
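Abstract (reference translation): Click-through rate (CTR) prediction is used to model users' interests. However, most existing CTR prediction methods are based mainly on the ID modality. As a result, they cannot comprehensively model users' multi-modal preferences. It is therefore necessary to introduce multi-modal CTR prediction. Applying existing multi-modal fusion methods directly to CTR prediction models appears attractive, but these methods (1) cannot effectively disentangle commonalities and specificities across different modalities, and (2) neither consider the synergistic effects between modalities nor model the complex interactions between them. To address these issues, we propose the Diffusion-based Multi-modal Synergy Interest Network (Diff-MSIN) framework for click-through prediction. The framework introduces three modules: the Multi-modal Feature Enhancement (MFE) Module, the Synergistic Relationship Capture (SRC) Module, and the Feature Dynamic Adaptive Fusion (FDAF) Module. The MFE Module and SRC Module extract synergistic, common, and specific information among different modalities. This enhances the representation of the modalities and improves the overall quality of the fusion. To encourage distinctiveness among different features, we design a Knowledge Decoupling method. In addition, the FDAF Module focuses on capturing user preferences and reducing fusion noise. To validate the effectiveness of the Diff-MSIN framework, we conducted extensive experiments using the Rec-Tmall dataset and three Amazon datasets. The results show that the proposed approach yields a significant improvement of at least 1.67% over the baseline, highlighting its potential for enhancing multi-modal recommendation systems. Our code is available at the following link: https://github.com/Cxx-0/Diff-MSIN.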
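
The abstract names two ideas that are easier to picture with a small example: fusing modalities with weights that adapt to the user, and encouraging common and modality-specific features to stay distinct. The sketch below is only an illustration under stated assumptions, not the Diff-MSIN implementation. The function names (`adaptive_fusion`, `decoupling_penalty`), the dot-product scoring, the softmax weighting, and the squared-cosine penalty are all hypothetical choices made here; the abstract does not specify how the FDAF Module or the Knowledge Decoupling method are computed, and the diffusion component is not modeled at all.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(user_vec, modality_vecs):
    """Fuse modality embeddings with user-dependent weights.

    Assumption: each modality is scored by its dot product with the
    user vector, and the scores are normalized with a softmax.
    """
    names = list(modality_vecs)
    weights = softmax([dot(user_vec, modality_vecs[n]) for n in names])
    fused = [
        sum(w * modality_vecs[n][i] for w, n in zip(weights, names))
        for i in range(len(user_vec))
    ]
    return fused, dict(zip(names, weights))


def decoupling_penalty(common, specific):
    """Squared cosine similarity between two feature vectors.

    Assumption: a penalty of 0 means the vectors are orthogonal
    (fully distinct); 1 means they point along the same line.
    """
    norm = math.sqrt(dot(common, common) * dot(specific, specific))
    if norm == 0:
        return 0.0
    return (dot(common, specific) / norm) ** 2


# Toy example: a user whose preference aligns with the ID embedding.
user = [1.0, 0.0]
modalities = {"id": [1.0, 0.0], "image": [0.0, 1.0], "text": [0.5, 0.5]}
fused, weights = adaptive_fusion(user, modalities)
print(weights)  # the "id" modality receives the largest weight
print(decoupling_penalty([1.0, 0.0], [0.0, 1.0]))  # orthogonal features
```

In this toy setup the weights always sum to 1, so a modality that matches the user's preference poorly is down-weighted rather than dropped, which is one simple way to read "reducing fusion noise".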

Related papers