Mirror Gradient: Towards Robust Multimodal Recommender Systems via
Exploring Flat Local Minima
- URL: http://arxiv.org/abs/2402.11262v1
- Date: Sat, 17 Feb 2024 12:27:30 GMT
- Title: Mirror Gradient: Towards Robust Multimodal Recommender Systems via
Exploring Flat Local Minima
- Authors: Shanshan Zhong, Zhongzhan Huang, Daifeng Li, Wushao Wen, Jinghui Qin,
Liang Lin
- Abstract summary: We analyze multimodal recommender systems from the novel perspective of flat local minima.
We propose a concise yet effective gradient strategy called Mirror Gradient (MG).
We find that the proposed MG can complement existing robust training methods and be easily extended to diverse advanced recommendation models.
- Score: 54.06000767038741
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal recommender systems utilize various types of information to model
user preferences and item features, helping users discover items aligned with
their interests. The integration of multimodal information mitigates the
inherent challenges in recommender systems, e.g., the data sparsity problem and
cold-start issues. However, it simultaneously magnifies certain risks from
multimodal information inputs, such as information adjustment risk and inherent
noise risk. These risks pose crucial challenges to the robustness of
recommendation models. In this paper, we analyze multimodal recommender systems
from the novel perspective of flat local minima and propose a concise yet
effective gradient strategy called Mirror Gradient (MG). This strategy can
implicitly enhance the model's robustness during the optimization process,
mitigating instability risks arising from multimodal information inputs. We
also provide strong theoretical evidence and conduct extensive empirical
experiments to show the superiority of MG across various multimodal
recommendation models and benchmarks. Furthermore, we find that the proposed MG
can complement existing robust training methods and be easily extended to
diverse advanced recommendation models, making it a promising new and
fundamental paradigm for training multimodal recommender systems. The code is
released at https://github.com/Qrange-group/Mirror-Gradient.
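To make the notion of a flat local minimum concrete, here is a minimal Python sketch. It is not the Mirror Gradient algorithm and is not taken from the paper or its repository: the two quadratic losses, the `sharpness` helper, and the radius `rho` are hypothetical, chosen only for illustration. It measures the worst-case loss increase in a small neighbourhood of a minimum, which is one common way to quantify flatness.

```python
def sharpness(loss, w_star, rho=0.1, steps=200):
    """Worst-case loss increase within distance rho of w_star (1-D grid search)."""
    # Evenly spaced perturbations in [-rho, rho], endpoints included.
    deltas = [-rho + 2 * rho * i / steps for i in range(steps + 1)]
    return max(loss(w_star + d) for d in deltas) - loss(w_star)


# Two toy losses (assumptions for illustration), both minimised at w = 0.
def sharp_loss(w):
    return 50.0 * w ** 2  # high curvature: a sharp minimum


def flat_loss(w):
    return 0.5 * w ** 2  # low curvature: a flat minimum


sharp = sharpness(sharp_loss, 0.0)
flat = sharpness(flat_loss, 0.0)
print(f"sharp minimum: {sharp:.4f}, flat minimum: {flat:.4f}")
```

Under these toy assumptions, the same small perturbation raises the sharp loss about 100 times more than the flat one, which is the intuition behind preferring flat minima when inputs are noisy.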
Related papers
- RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
- Mitigating Propensity Bias of Large Language Models for Recommender Systems [20.823461673845756]
We introduce a novel framework named Counterfactual LLM Recommendation (CLLMR).
We propose a spectrum-based side information encoder that implicitly embeds structural information from historical interactions into the side information representation.
Our CLLMR approach explores the causal relationships inherent in LLM-based recommender systems.
arXiv Detail & Related papers (2024-09-30T07:57:13Z)
- Large Language Model Empowered Embedding Generator for Sequential Recommendation [57.49045064294086]
Large Language Model (LLM) has the potential to understand the semantic connections between items, regardless of their popularity.
We present LLMEmb, an innovative technique that harnesses LLM to create item embeddings that bolster the performance of Sequential Recommender Systems.
arXiv Detail & Related papers (2024-09-30T03:59:06Z)
- DLCRec: A Novel Approach for Managing Diversity in LLM-Based Recommender Systems [9.433227503973077]
We propose a novel framework designed to enable fine-grained control over diversity in LLM-based recommendations.
Unlike traditional methods, DLCRec adopts a fine-grained task decomposition strategy, breaking down the recommendation process into three sub-tasks.
We introduce two data augmentation techniques that enhance the model's robustness to noisy and out-of-distribution data.
arXiv Detail & Related papers (2024-08-22T15:10:56Z)
- MMREC: LLM Based Multi-Modal Recommender System [2.3113916776957635]
This paper presents a novel approach to enhancing recommender systems by leveraging Large Language Models (LLMs) and deep learning techniques.
The proposed framework aims to improve the accuracy and relevance of recommendations by incorporating multi-modal information processing and using a unified latent space representation.
arXiv Detail & Related papers (2024-08-08T04:31:29Z)
- DiffMM: Multi-Modal Diffusion Model for Recommendation [19.43775593283657]
We propose a novel multi-modal graph diffusion model for recommendation called DiffMM.
Our framework integrates a modality-aware graph diffusion model with a cross-modal contrastive learning paradigm to improve modality-aware user representation learning.
arXiv Detail & Related papers (2024-06-17T17:35:54Z)
- Multimodal Recommender Systems: A Survey [50.23505070348051]
Multimodal Recommender System (MRS) has attracted much attention from both academia and industry recently.
In this paper, we will give a comprehensive survey of the MRS models, mainly from technical views.
To access more details of the surveyed papers, such as implementation code, we open source a repository.
arXiv Detail & Related papers (2023-02-08T05:12:54Z)
- RGRecSys: A Toolkit for Robustness Evaluation of Recommender Systems [100.54655931138444]
We propose a more holistic view of robustness for recommender systems that encompasses multiple dimensions.
We present a robustness evaluation toolkit, Robustness Gym for RecSys, that allows us to quickly and uniformly evaluate the robustness of recommender system models.
arXiv Detail & Related papers (2022-01-12T10:32:53Z)
- MultiHead MultiModal Deep Interest Recommendation Network [0.0]
This paper adds multi-head and multi-modal modules to the DIN model.
Experiments show that the multi-head multi-modal DIN improves the recommendation prediction effect, and outperforms current state-of-the-art methods on various comprehensive indicators.
arXiv Detail & Related papers (2021-10-19T18:59:02Z)
- Sequential Recommendation with Self-Attentive Multi-Adversarial Network [101.25533520688654]
We present a Multi-Factor Generative Adversarial Network (MFGAN) for explicitly modeling the effect of context information on sequential recommendation.
Our framework is flexible to incorporate multiple kinds of factor information, and is able to trace how each factor contributes to the recommendation decision over time.
arXiv Detail & Related papers (2020-05-21T12:28:59Z)