Related papers: Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima

URL: http://arxiv.org/abs/2402.11262v1
Date: Sat, 17 Feb 2024 12:27:30 GMT
Title: Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima
Authors: Shanshan Zhong, Zhongzhan Huang, Daifeng Li, Wushao Wen, Jinghui Qin, Liang Lin
Abstract summary: We analyze multimodal recommender systems from the novel perspective of flat local minima. We propose a concise yet effective gradient strategy called Mirror Gradient (MG). We find that the proposed MG can complement existing robust training methods and be easily extended to diverse advanced recommendation models.
Score: 54.06000767038741
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multimodal recommender systems utilize various types of information to model user preferences and item features, helping users discover items aligned with their interests. The integration of multimodal information mitigates the inherent challenges in recommender systems, e.g., the data sparsity problem and cold-start issues. However, it simultaneously magnifies certain risks from multimodal information inputs, such as information adjustment risk and inherent noise risk. These risks pose crucial challenges to the robustness of recommendation models. In this paper, we analyze multimodal recommender systems from the novel perspective of flat local minima and propose a concise yet effective gradient strategy called Mirror Gradient (MG). This strategy can implicitly enhance the model's robustness during the optimization process, mitigating instability risks arising from multimodal information inputs. We also provide strong theoretical evidence and conduct extensive empirical experiments to show the superiority of MG across various multimodal recommendation models and benchmarks. Furthermore, we find that the proposed MG can complement existing robust training methods and be easily extended to diverse advanced recommendation models, making it a promising new and fundamental paradigm for training multimodal recommender systems. The code is released at https://github.com/Qrange-group/Mirror-Gradient.

Related papers

Does Multimodality Improve Recommender Systems as Expected? A Critical Analysis and Future Directions [52.21847626165085]
Multimodal recommendation systems are increasingly popular for their potential to improve performance by integrating diverse data types. However, the actual benefits of this integration remain unclear, raising questions about when and how it truly enhances recommendations. We propose a structured evaluation framework to systematically assess multimodal recommendations across four dimensions.
arXiv Detail & Related papers (2025-08-07T13:21:00Z)
FindRec: Stein-Guided Entropic Flow for Multi-Modal Sequential Recommendation [50.438552588818]
We propose FindRec (Flexible unified information disentanglement for multi-modal sequential Recommendation). A Stein kernel-based Integrated Information Coordination Module (IICM) theoretically guarantees distribution consistency between multimodal features and ID streams. A cross-modal expert routing mechanism adaptively filters and combines multimodal features based on their contextual relevance.
arXiv Detail & Related papers (2025-07-07T04:09:45Z)
Gated Multimodal Graph Learning for Personalized Recommendation [9.466822984141086]
Multimodal recommendation has emerged as a promising solution to alleviate the cold-start and sparsity problems in collaborative filtering. We propose RLMultimodalRec, a lightweight and modular recommendation framework that combines graph-based user modeling with adaptive multimodal item encoding.
arXiv Detail & Related papers (2025-05-30T16:57:17Z)
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation [2.549112678136113]
Retrieval-Augmented Generation (RAG) mitigates issues by integrating external dynamic information for improved factual grounding. Cross-modal alignment and reasoning introduce unique challenges beyond those in unimodal RAG. This survey lays the foundation for developing more capable and reliable AI systems.
arXiv Detail & Related papers (2025-02-12T22:33:41Z)
Multi-modal Retrieval Augmented Multi-modal Generation: Datasets, Evaluation Metrics and Strong Baselines [64.61315565501681]
Multi-modal Retrieval Augmented Multi-modal Generation (M$^2$RAG) is a novel task that enables foundation models to process multi-modal web content. Despite its potential impact, M$^2$RAG remains understudied, lacking comprehensive analysis and high-quality data resources.
arXiv Detail & Related papers (2024-11-25T13:20:19Z)
RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks. Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs. In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
Mitigating Propensity Bias of Large Language Models for Recommender Systems [20.823461673845756]
We introduce a novel framework named Counterfactual LLM Recommendation (CLLMR). We propose a spectrum-based side information encoder that implicitly embeds structural information from historical interactions into the side information representation. Our CLLMR approach explores the causal relationships inherent in LLM-based recommender systems.
arXiv Detail & Related papers (2024-09-30T07:57:13Z)
Large Language Model Empowered Embedding Generator for Sequential Recommendation [57.49045064294086]
Large Language Model (LLM) has the potential to understand the semantic connections between items, regardless of their popularity. We present LLMEmb, an innovative technique that harnesses LLM to create item embeddings that bolster the performance of Sequential Recommender Systems.
arXiv Detail & Related papers (2024-09-30T03:59:06Z)
DLCRec: A Novel Approach for Managing Diversity in LLM-Based Recommender Systems [9.433227503973077]
We propose a novel framework designed to enable fine-grained control over diversity in LLM-based recommendations. Unlike traditional methods, DLCRec adopts a fine-grained task decomposition strategy, breaking down the recommendation process into three sub-tasks. We introduce two data augmentation techniques that enhance the model's robustness to noisy and out-of-distribution data.
arXiv Detail & Related papers (2024-08-22T15:10:56Z)
MMREC: LLM Based Multi-Modal Recommender System [2.3113916776957635]
This paper presents a novel approach to enhancing recommender systems by leveraging Large Language Models (LLMs) and deep learning techniques. The proposed framework aims to improve the accuracy and relevance of recommendations by incorporating multi-modal information processing and by the use of unified latent space representation.
arXiv Detail & Related papers (2024-08-08T04:31:29Z)
DiffMM: Multi-Modal Diffusion Model for Recommendation [19.43775593283657]
We propose a novel multi-modal graph diffusion model for recommendation called DiffMM. Our framework integrates a modality-aware graph diffusion model with a cross-modal contrastive learning paradigm to improve modality-aware user representation learning.
arXiv Detail & Related papers (2024-06-17T17:35:54Z)
Multimodal Recommender Systems: A Survey [50.23505070348051]
Multimodal Recommender System (MRS) has attracted much attention from both academia and industry recently. In this paper, we will give a comprehensive survey of the MRS models, mainly from technical views. To access more details of the surveyed papers, such as implementation code, we open source a repository.
arXiv Detail & Related papers (2023-02-08T05:12:54Z)
RGRecSys: A Toolkit for Robustness Evaluation of Recommender Systems [100.54655931138444]
We propose a more holistic view of robustness for recommender systems that encompasses multiple dimensions. We present a robustness evaluation toolkit, Robustness Gym for RecSys, that allows us to quickly and uniformly evaluate the robustness of recommender system models.
arXiv Detail & Related papers (2022-01-12T10:32:53Z)
MultiHead MultiModal Deep Interest Recommendation Network [0.0]
This paper adds multi-head and multi-modal modules to the DIN model. Experiments show that the multi-head multi-modal DIN improves the recommendation prediction effect, and outperforms current state-of-the-art methods on various comprehensive indicators.
arXiv Detail & Related papers (2021-10-19T18:59:02Z)
Sequential Recommendation with Self-Attentive Multi-Adversarial Network [101.25533520688654]
We present a Multi-Factor Generative Adversarial Network (MFGAN) for explicitly modeling the effect of context information on sequential recommendation. Our framework is flexible to incorporate multiple kinds of factor information, and is able to trace how each factor contributes to the recommendation decision over time.
arXiv Detail & Related papers (2020-05-21T12:28:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.