GUME: Graphs and User Modalities Enhancement for Long-Tail Multimodal Recommendation
- URL: http://arxiv.org/abs/2407.12338v1
- Date: Wed, 17 Jul 2024 06:29:00 GMT
- Title: GUME: Graphs and User Modalities Enhancement for Long-Tail Multimodal Recommendation
- Authors: Guojiao Lin, Zhen Meng, Dongjie Wang, Qingqing Long, Yuanchun Zhou, Meng Xiao
- Abstract summary: We propose a novel Graphs and User Modalities Enhancement (GUME) for long-tail multimodal recommendation.
Specifically, we first enhance the user-item graph using multimodal similarity between items.
We then construct two types of user modalities: explicit interaction features and extended interest features.
- Score: 13.1192216083304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal recommendation systems (MMRS) have received considerable attention from the research community due to their ability to jointly utilize information from user behavior and from product images and text. Previous research has two main issues. First, many long-tail items in recommendation systems have limited interaction data, making it difficult to learn comprehensive and informative representations; past MMRS studies have overlooked this issue. Second, users' modality preferences are crucial to their behavior, yet previous research has primarily focused on learning item modality representations, while user modality representations have remained relatively simplistic. To address these challenges, we propose a novel Graphs and User Modalities Enhancement (GUME) framework for long-tail multimodal recommendation. Specifically, we first enhance the user-item graph using multimodal similarity between items. This improves the connectivity of long-tail items and helps them learn high-quality representations through graph propagation. Then, we construct two types of user modalities: explicit interaction features and extended interest features. By using a user modality enhancement strategy to maximize mutual information between these two features, we improve the generalization ability of user modality representations. Additionally, we design an alignment strategy for modality data to remove noise from both internal and external perspectives. Extensive experiments on four publicly available datasets demonstrate the effectiveness of our approach.
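The two core ideas in the abstract can be sketched in code: adding item-item edges from the top-k multimodal neighbors of each item to improve long-tail connectivity, and an InfoNCE-style objective that maximizes mutual information between two views of a user's modality representation. This is a minimal illustration, not GUME's exact formulation; the function names, the use of cosine similarity with kNN, and the choice of InfoNCE as the mutual-information estimator are all assumptions for the sketch.

```python
import numpy as np

def build_item_item_edges(item_emb, k=2):
    """Augment the user-item graph with item-item edges drawn from each
    item's top-k neighbors under cosine similarity of multimodal embeddings.
    (Illustrative assumption: GUME's actual edge construction may differ.)"""
    norm = item_emb / np.linalg.norm(item_emb, axis=1, keepdims=True)
    sim = norm @ norm.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops
    edges = []
    for i in range(sim.shape[0]):
        for j in np.argsort(sim[i])[::-1][:k]:
            edges.append((i, int(j)))
    return edges

def infonce(view_a, view_b, tau=0.2):
    """InfoNCE lower bound on mutual information between two user-modality
    views: row i of each view is a positive pair, all other rows negatives."""
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))  # minimize loss = maximize MI bound
```

In this reading, a long-tail item with few interactions still receives gradient signal through its item-item neighbors during graph propagation, and the InfoNCE loss pushes the "explicit interaction" and "extended interest" views of the same user together while pushing different users apart.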
Related papers
- MMBee: Live Streaming Gift-Sending Recommendations via Multi-Modal Fusion and Behaviour Expansion [18.499672566131355]
Accurately modeling the gifting interaction not only enhances users' experience but also increases streamers' revenue.
Previous studies on live streaming gifting prediction treat this task as a conventional recommendation problem.
We propose MMBee based on real-time Multi-Modal Fusion and Behaviour Expansion to address these issues.
arXiv Detail & Related papers (2024-06-15T04:59:00Z)
- Knowledge-Aware Multi-Intent Contrastive Learning for Multi-Behavior Recommendation [6.522900133742931]
Multi-behavioral recommendation provides users with more accurate choices based on diverse behaviors, such as view, add to cart, and purchase.
We propose a novel model: Knowledge-Aware Multi-Intent Contrastive Learning (KAMCL) model.
This model uses relationships in the knowledge graph to construct intents, aiming to mine the connections between users' multi-behaviors from the perspective of intents to achieve more accurate recommendations.
arXiv Detail & Related papers (2024-04-18T08:39:52Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- DEKGCI: A double-sided recommendation model for integrating knowledge graph and user-item interaction graph [0.0]
We propose DEKGCI, a novel double-sided recommendation model.
We use the high-order collaborative signals from the user-item interaction graph to enrich the user representations on the user side.
arXiv Detail & Related papers (2023-06-24T01:54:49Z)
- Knowledge Enhancement for Multi-Behavior Contrastive Recommendation [39.50243004656453]
We propose a Knowledge Enhancement Multi-Behavior Contrastive Learning Recommendation (KMCLR) framework.
In this work, we design the multi-behavior learning module to extract users' personalized behavior information for user-embedding enhancement.
In the optimization stage, we model the coarse-grained commonalities and the fine-grained differences between multi-behavior of users to further improve the recommendation effect.
arXiv Detail & Related papers (2023-01-13T06:24:33Z)
- Coarse-to-Fine Knowledge-Enhanced Multi-Interest Learning Framework for Multi-Behavior Recommendation [52.89816309759537]
Multi-types of behaviors (e.g., clicking, adding to cart, purchasing, etc.) widely exist in most real-world recommendation scenarios.
The state-of-the-art multi-behavior models learn behavior dependencies indistinguishably with all historical interactions as input.
We propose a novel Coarse-to-fine Knowledge-enhanced Multi-interest Learning framework to learn shared and behavior-specific interests for different behaviors.
arXiv Detail & Related papers (2022-08-03T05:28:14Z)
- Perceptual Score: What Data Modalities Does Your Model Perceive? [73.75255606437808]
We introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features.
We find that recent, more accurate multi-modal models for visual question-answering tend to perceive the visual data less than their predecessors.
Using the perceptual score also helps to analyze model biases by decomposing the score into data subset contributions.
arXiv Detail & Related papers (2021-10-27T12:19:56Z)
- Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation [56.12499090935242]
This work proposes a Knowledge-Enhanced Hierarchical Graph Transformer Network (KHGT) to investigate multi-typed interactive patterns between users and items in recommender systems.
KHGT is built upon a graph-structured neural architecture to capture type-specific behavior characteristics.
We show that KHGT consistently outperforms many state-of-the-art recommendation methods across various evaluation settings.
arXiv Detail & Related papers (2021-10-08T09:44:00Z)
- Hyper Meta-Path Contrastive Learning for Multi-Behavior Recommendation [61.114580368455236]
User purchasing prediction with multi-behavior information remains a challenging problem for current recommendation systems.
We propose the concept of the hyper meta-path and construct hyper meta-paths or hyper meta-graphs to explicitly capture the dependencies among a user's different behaviors.
Thanks to the recent success of graph contrastive learning, we leverage it to learn embeddings of user behavior patterns adaptively instead of assigning a fixed scheme to understand the dependencies among different behaviors.
arXiv Detail & Related papers (2021-09-07T04:28:09Z)
- Modeling High-order Interactions across Multi-interests for Micro-video Recommendation [65.16624625748068]
We propose a Self-over-Co Attention module to enhance user's interest representation.
In particular, we first use co-attention to model correlation patterns across different levels and then use self-attention to model correlation patterns within a specific level.
arXiv Detail & Related papers (2021-04-01T07:20:15Z)
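The Self-over-Co Attention module described above, co-attention to model correlation patterns across levels followed by self-attention within a level, can be sketched roughly as below. This is a schematic single-head, unmasked version under my own assumptions about the module's structure; the paper's actual parameterization (projections, heads, gating) is not reproduced here.

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention (single head, no mask, no projections)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def self_over_co(level_a, level_b):
    """Co-attention across two interest levels, then self-attention within
    the co-attended result. (Hypothetical simplification of Self-over-Co.)"""
    co = attention(level_a, level_b, level_b)  # cross-level correlations
    return attention(co, co, co)               # within-level refinement
```

The ordering is the point: the cross-level step first mixes information between interest levels, and only then does the self-attention step model correlations within the resulting representation.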
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.