SLIF-MR: Self-loop Iterative Fusion of Heterogeneous Auxiliary Information for Multimodal Recommendation
- URL: http://arxiv.org/abs/2507.09998v1
- Date: Mon, 14 Jul 2025 07:32:16 GMT
- Title: SLIF-MR: Self-loop Iterative Fusion of Heterogeneous Auxiliary Information for Multimodal Recommendation
- Authors: Jie Guo, Jiahao Jiang, Ziyuan Guo, Bin Song, Yue Sun
- Abstract summary: We propose a novel framework termed Self-loop Iterative Fusion of Heterogeneous Auxiliary Information for Multimodal Recommendation (SLIF-MR).
SLIF-MR leverages item representations from the previous training epoch as feedback signals to dynamically optimize the heterogeneous graph structures composed of the KG, the multimodal item feature graph, and the user-item interaction graph.
Experiments show that SLIF-MR significantly outperforms existing methods, particularly in terms of accuracy and robustness.
- Score: 13.3951304427872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge graphs (KGs) and multimodal item information, which respectively capture relational and attribute features, play a crucial role in improving recommender system accuracy. Recent studies have attempted to integrate them via multimodal knowledge graphs (MKGs) to further enhance recommendation performance. However, existing methods typically freeze the MKG structure during training, which limits the full integration of structural information from heterogeneous graphs (e.g., the KG and the user-item interaction graph) and results in sub-optimal performance. To address this challenge, we propose a novel framework, termed Self-loop Iterative Fusion of Heterogeneous Auxiliary Information for Multimodal Recommendation (SLIF-MR), which leverages item representations from the previous training epoch as feedback signals to dynamically optimize the heterogeneous graph structures composed of the KG, the multimodal item feature graph, and the user-item interaction graph. Through this iterative fusion mechanism, both user and item representations are refined, thus improving the final recommendation performance. Specifically, based on the feedback item representations, SLIF-MR constructs an item-item correlation graph, which is then integrated into the construction of the heterogeneous graphs as additional structural information in a self-loop manner. Consequently, the internal structures of the heterogeneous graphs are updated with the feedback item representations during training. Moreover, a semantic consistency learning strategy is proposed to align heterogeneous item representations across modalities. Experimental results show that SLIF-MR significantly outperforms existing methods, particularly in terms of accuracy and robustness.
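Below is a minimal sketch of the self-loop feedback step described in the abstract: item embeddings from the previous training epoch are used to build a k-NN item-item correlation graph, which is then mixed into an existing item-item adjacency. The function names and hyperparameters (k, fuse_weight) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def knn_item_graph(item_emb: torch.Tensor, k: int = 10) -> torch.Tensor:
    """Build a symmetric k-NN item-item graph from cosine similarities."""
    z = F.normalize(item_emb, dim=1)
    sim = z @ z.T                               # cosine similarity matrix
    sim.fill_diagonal_(float("-inf"))           # exclude self-matches
    topk = sim.topk(k, dim=1).indices           # k most similar items per item
    adj = torch.zeros_like(sim)
    adj.scatter_(1, topk, 1.0)
    return torch.maximum(adj, adj.T)            # symmetrize

def fuse_feedback_graph(base_adj: torch.Tensor,
                        prev_epoch_item_emb: torch.Tensor,
                        k: int = 10,
                        fuse_weight: float = 0.3) -> torch.Tensor:
    """Inject the feedback item-item correlation graph into an existing adjacency."""
    feedback_adj = knn_item_graph(prev_epoch_item_emb, k)
    return (1.0 - fuse_weight) * base_adj + fuse_weight * feedback_adj
```

In a full pipeline, the fused adjacency would feed the graph encoder in the next epoch, whose item outputs become the next round of feedback, closing the self-loop.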
Related papers
- Complementarity-driven Representation Learning for Multi-modal Knowledge Graph Completion [0.0]
We propose a novel framework named Mixture of Complementary Modality Experts (MoCME).
MoCME consists of a Complementarity-guided Modality Knowledge Fusion (CMKF) module and an Entropy-guided Negative Sampling (EGNS) mechanism.
Our MoCME achieves state-of-the-art performance, surpassing existing approaches.
arXiv Detail & Related papers (2025-07-28T08:35:11Z)
- KGIF: Optimizing Relation-Aware Recommendations with Knowledge Graph Information Fusion [16.971592142597544]
This study introduces a specialized framework designed to merge entity and relation embeddings explicitly through a tailored self-attention mechanism.
This explicit fusion enhances the interplay between user-item interactions and item-attribute relationships, providing a nuanced balance between user-centric and item-centric representations.
The contributions of this work include an innovative method for explicit information fusion, improved robustness for sparse knowledge graphs, and the ability to generate explainable recommendations through interpretable path visualization.
arXiv Detail & Related papers (2025-01-07T22:19:15Z)
- MCSFF: Multi-modal Consistency and Specificity Fusion Framework for Entity Alignment [7.109735168520378]
Multi-modal entity alignment (MMEA) is essential for enhancing knowledge graphs and improving question-answering systems.
Existing methods often focus on integrating modalities through their complementarity but overlook the specificity of each modality.
We propose the Multi-modal Consistency and Specificity Fusion Framework (MCSFF), which innovatively integrates both complementary and specific aspects of modalities.
arXiv Detail & Related papers (2024-10-18T16:35:25Z)
- Learning to Model Graph Structural Information on MLPs via Graph Structure Self-Contrasting [50.181824673039436]
We propose a Graph Structure Self-Contrasting (GSSC) framework that learns graph structural information without message passing.
The proposed framework is based purely on Multi-Layer Perceptrons (MLPs), where the structural information is only implicitly incorporated as prior knowledge.
It first applies structural sparsification to remove potentially uninformative or noisy edges in the neighborhood, and then performs structural self-contrasting in the sparsified neighborhood to learn robust node representations.
arXiv Detail & Related papers (2024-09-09T12:56:02Z)
- Gradient Flow of Energy: A General and Efficient Approach for Entity Alignment Decoding [24.613735853099534]
We introduce a novel, generalized, and efficient decoding approach for EA, relying solely on entity embeddings.
Our method optimizes the decoding process by minimizing Dirichlet energy, inducing a gradient flow over the graph that maximizes graph homophily.
Notably, the approach achieves these advancements with less than 6 seconds of additional computational time.
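As a generic illustration of this idea (not the paper's exact decoding procedure), Dirichlet energy and a discretized gradient flow over the normalized graph can be sketched as follows; the step size and iteration count are assumptions.

```python
import torch

def dirichlet_energy(x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
    """E(X) = 1/2 * sum_ij A_ij * ||x_i - x_j||^2 = tr(X^T L X)."""
    lap = torch.diag(adj.sum(dim=1)) - adj       # unnormalized graph Laplacian
    return torch.trace(x.T @ lap @ x)

def gradient_flow_smoothing(x: torch.Tensor, adj: torch.Tensor,
                            steps: int = 10, step_size: float = 0.1) -> torch.Tensor:
    """Discretized gradient flow X <- X - eta * L_sym X, which lowers the
    Dirichlet energy and pushes connected nodes toward similar embeddings."""
    deg_inv_sqrt = adj.sum(dim=1).clamp(min=1e-12).pow(-0.5)
    adj_norm = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]
    lap_sym = torch.eye(adj.size(0)) - adj_norm  # symmetric normalized Laplacian
    for _ in range(steps):
        x = x - step_size * (lap_sym @ x)
    return x
```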
arXiv Detail & Related papers (2024-01-23T14:31:12Z)
- APGL4SR: A Generic Framework with Adaptive and Personalized Global Collaborative Information in Sequential Recommendation [86.29366168836141]
We propose a graph-driven framework named Adaptive and Personalized Graph Learning for Sequential Recommendation (APGL4SR).
APGL4SR incorporates adaptive and personalized global collaborative information into sequential recommendation systems.
As a generic framework, APGL4SR outperforms other baselines by significant margins.
arXiv Detail & Related papers (2023-11-06T01:33:24Z)
- MM-GEF: Multi-modal representation meet collaborative filtering [43.88159639990081]
We propose a graph-based item structure enhancement method, MM-GEF (Multi-Modal recommendation with Graph Early-Fusion).
MM-GEF learns refined item representations by injecting structural information obtained from both multi-modal and collaborative signals.
arXiv Detail & Related papers (2023-08-14T15:47:36Z)
- Improving Knowledge Graph Entity Alignment with Graph Augmentation [11.1094009195297]
Entity alignment (EA), which links equivalent entities across different knowledge graphs (KGs), plays a crucial role in knowledge fusion.
In recent years, graph neural networks (GNNs) have been successfully applied in many embedding-based EA methods.
We propose graph augmentation to create two graph views for margin-based alignment learning and contrastive entity representation learning.
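The following is a generic sketch of the two-view recipe: a simple edge-dropping augmentation and an InfoNCE-style contrastive loss over entity representations. It is illustrative only; the paper also uses margin-based alignment learning, and the function names and loss choice here are assumptions.

```python
import torch
import torch.nn.functional as F

def drop_edges(edge_index: torch.Tensor, drop_prob: float = 0.2) -> torch.Tensor:
    """Create an augmented graph view by randomly dropping edges.
    edge_index: (2, E) tensor of source/target node indices."""
    keep = torch.rand(edge_index.size(1)) > drop_prob
    return edge_index[:, keep]

def contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                     temperature: float = 0.5) -> torch.Tensor:
    """InfoNCE-style loss: the same entity encoded under the two views forms a
    positive pair; all other entities in the batch act as negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = (z1 @ z2.T) / temperature            # (N, N) similarity matrix
    targets = torch.arange(z1.size(0))            # positives on the diagonal
    return F.cross_entropy(logits, targets)
```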
arXiv Detail & Related papers (2023-04-28T01:22:47Z)
- Reinforcement Learning based Path Exploration for Sequential Explainable Recommendation [57.67616822888859]
We propose a novel Temporal Meta-path Guided Explainable Recommendation leveraging Reinforcement Learning (TMER-RL).
TMER-RL utilizes reinforcement item-item path modelling between consecutive items with attention mechanisms to sequentially model dynamic user-item evolutions on a dynamic knowledge graph for explainable recommendation.
Extensive evaluations of TMER on two real-world datasets show state-of-the-art performance compared against recent strong baselines.
arXiv Detail & Related papers (2021-11-24T04:34:26Z)
- Diversified Multiscale Graph Learning with Graph Self-Correction [55.43696999424127]
We propose a diversified multiscale graph learning model equipped with two core ingredients: a graph self-correction (GSC) mechanism to generate informative embedded graphs, and a diversity-boosting regularizer (DBR) to achieve a comprehensive characterization of the input graph.
Experiments on popular graph classification benchmarks show that the proposed GSC mechanism leads to significant improvements over state-of-the-art graph pooling methods.
arXiv Detail & Related papers (2021-03-17T16:22:24Z)
- Pre-training Graph Transformer with Multimodal Side Information for Recommendation [82.4194024706817]
We propose a pre-training strategy to learn item representations by considering both item side information and their relationships.
We develop a novel sampling algorithm named MCNSampling to select contextual neighbors for each item.
The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction.
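For illustration, the two stated objectives can be written in a generic form (dot-product edge scoring and an MSE feature decoder). This is one possible instantiation under assumed inputs, not PMGT's exact losses.

```python
import torch
import torch.nn.functional as F

def pmgt_style_losses(node_repr: torch.Tensor,    # encoder outputs, shape (N, d)
                      node_feat: torch.Tensor,    # original node features, shape (N, f)
                      adj_label: torch.Tensor,    # float 0/1 adjacency targets, shape (N, N)
                      mask: torch.Tensor,         # boolean mask of masked nodes, shape (N,)
                      feat_decoder: torch.nn.Module) -> torch.Tensor:
    """Sum of (1) graph structure reconstruction and (2) masked feature reconstruction."""
    # 1) Graph structure reconstruction: score node pairs with dot products.
    edge_logits = node_repr @ node_repr.T
    structure_loss = F.binary_cross_entropy_with_logits(edge_logits, adj_label)
    # 2) Masked node feature reconstruction: recover the features of masked nodes.
    feature_loss = F.mse_loss(feat_decoder(node_repr)[mask], node_feat[mask])
    return structure_loss + feature_loss
```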
arXiv Detail & Related papers (2020-10-23T10:30:24Z)
- Graph Representation Learning via Graphical Mutual Information Maximization [86.32278001019854]
We propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations.
We develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder.
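A common way to realize such an objective, in the spirit of GMI/DGI rather than the paper's exact estimator, is a bilinear discriminator trained to distinguish matched input-representation pairs from shuffled ones; the names below are illustrative.

```python
import torch
import torch.nn.functional as F

def mi_discriminator_loss(hidden: torch.Tensor,   # encoder outputs h_i, shape (N, d)
                          inputs: torch.Tensor,   # input features x_i, shape (N, f)
                          weight: torch.Tensor) -> torch.Tensor:  # bilinear scorer, shape (d, f)
    """BCE-based lower bound on the mutual information between inputs and
    hidden representations; minimizing this loss maximizes the MI estimate."""
    pos_scores = ((hidden @ weight) * inputs).sum(dim=1)        # h_i^T W x_i
    perm = torch.randperm(inputs.size(0))
    neg_scores = ((hidden @ weight) * inputs[perm]).sum(dim=1)  # mismatched pairs
    pos_loss = F.binary_cross_entropy_with_logits(pos_scores, torch.ones_like(pos_scores))
    neg_loss = F.binary_cross_entropy_with_logits(neg_scores, torch.zeros_like(neg_scores))
    return pos_loss + neg_loss
```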
arXiv Detail & Related papers (2020-02-04T08:33:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.