GUME: Graphs and User Modalities Enhancement for Long-Tail Multimodal Recommendation
- URL: http://arxiv.org/abs/2407.12338v1
- Date: Wed, 17 Jul 2024 06:29:00 GMT
- Title: GUME: Graphs and User Modalities Enhancement for Long-Tail Multimodal Recommendation
- Authors: Guojiao Lin, Zhen Meng, Dongjie Wang, Qingqing Long, Yuanchun Zhou, Meng Xiao
- Abstract summary: We propose a novel Graphs and User Modalities Enhancement (GUME) for long-tail multimodal recommendation.
Specifically, we first enhance the user-item graph using multimodal similarity between items.
We then construct two types of user modalities: explicit interaction features and extended interest features.
- Score: 13.1192216083304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal recommendation systems (MMRS) have received considerable attention from the research community due to their ability to jointly utilize information from user behavior and from product images and text. Previous research has two main issues. First, many long-tail items in recommendation systems have limited interaction data, making it difficult to learn comprehensive and informative representations; past MMRS studies have overlooked this issue. Second, users' modality preferences are crucial to their behavior, yet previous research has primarily focused on learning item modality representations, while user modality representations have remained relatively simplistic. To address these challenges, we propose a novel Graphs and User Modalities Enhancement (GUME) framework for long-tail multimodal recommendation. Specifically, we first enhance the user-item graph using multimodal similarity between items. This improves the connectivity of long-tail items and helps them learn high-quality representations through graph propagation. Then, we construct two types of user modalities: explicit interaction features and extended interest features. By using a user modality enhancement strategy to maximize mutual information between these two features, we improve the generalization ability of user modality representations. Additionally, we design an alignment strategy for modality data to remove noise from both internal and external perspectives. Extensive experiments on four publicly available datasets demonstrate the effectiveness of our approach.
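The two core ideas in the abstract can be sketched in code: adding item-item edges from the top-k multimodal neighbors of each item to improve long-tail connectivity, and an InfoNCE-style objective that maximizes mutual information between two views of a user's modality representation. This is a minimal illustration, not GUME's exact formulation; the function names, the use of cosine similarity with kNN, and the choice of InfoNCE as the mutual-information estimator are all assumptions for the sketch.

```python
import numpy as np

def build_item_item_edges(item_emb, k=2):
    """Augment the user-item graph with item-item edges drawn from each
    item's top-k neighbors under cosine similarity of multimodal embeddings.
    (Illustrative assumption: GUME's actual edge construction may differ.)"""
    norm = item_emb / np.linalg.norm(item_emb, axis=1, keepdims=True)
    sim = norm @ norm.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops
    edges = []
    for i in range(sim.shape[0]):
        for j in np.argsort(sim[i])[::-1][:k]:
            edges.append((i, int(j)))
    return edges

def infonce(view_a, view_b, tau=0.2):
    """InfoNCE lower bound on mutual information between two user-modality
    views: row i of each view is a positive pair, all other rows negatives."""
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))  # minimize loss = maximize MI bound
```

In this reading, a long-tail item with few interactions still receives gradient signal through its item-item neighbors during graph propagation, and the InfoNCE loss pushes the "explicit interaction" and "extended interest" views of the same user together while pushing different users apart.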
Related papers
- MMBee: Live Streaming Gift-Sending Recommendations via Multi-Modal Fusion and Behaviour Expansion [18.499672566131355]
Accurately modeling the gifting interaction not only enhances users' experience but also increases streamers' revenue.
Previous studies on live streaming gifting prediction treat this task as a conventional recommendation problem.
We propose MMBee based on real-time Multi-Modal Fusion and Behaviour Expansion to address these issues.
arXiv Detail & Related papers (2024-06-15T04:59:00Z)
- Knowledge-Aware Multi-Intent Contrastive Learning for Multi-Behavior Recommendation [6.522900133742931]
Multi-behavioral recommendation provides users with more accurate choices based on diverse behaviors, such as view, add to cart, and purchase.
We propose a novel model: Knowledge-Aware Multi-Intent Contrastive Learning (KAMCL) model.
This model uses relationships in the knowledge graph to construct intents, aiming to mine the connections between users' multi-behaviors from the perspective of intents to achieve more accurate recommendations.
arXiv Detail & Related papers (2024-04-18T08:39:52Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- DEKGCI: A double-sided recommendation model for integrating knowledge graph and user-item interaction graph [0.0]
We propose DEKGCI, a novel double-sided recommendation model.
We use the high-order collaborative signals from the user-item interaction graph to enrich the user representations on the user side.
arXiv Detail & Related papers (2023-06-24T01:54:49Z)
- Knowledge Enhancement for Multi-Behavior Contrastive Recommendation [39.50243004656453]
We propose a Knowledge Enhancement Multi-Behavior Contrastive Learning Recommendation (KMCLR) framework.
In this work, we design the multi-behavior learning module to extract users' personalized behavior information for user-embedding enhancement.
In the optimization stage, we model the coarse-grained commonalities and the fine-grained differences between multi-behavior of users to further improve the recommendation effect.
arXiv Detail & Related papers (2023-01-13T06:24:33Z)
- Coarse-to-Fine Knowledge-Enhanced Multi-Interest Learning Framework for Multi-Behavior Recommendation [52.89816309759537]
Multi-types of behaviors (e.g., clicking, adding to cart, purchasing, etc.) widely exist in most real-world recommendation scenarios.
The state-of-the-art multi-behavior models learn behavior dependencies indistinguishably with all historical interactions as input.
We propose a novel Coarse-to-fine Knowledge-enhanced Multi-interest Learning framework to learn shared and behavior-specific interests for different behaviors.
arXiv Detail & Related papers (2022-08-03T05:28:14Z)
- Perceptual Score: What Data Modalities Does Your Model Perceive? [73.75255606437808]
We introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features.
We find that recent, more accurate multi-modal models for visual question-answering tend to perceive the visual data less than their predecessors.
Using the perceptual score also helps to analyze model biases by decomposing the score into data subset contributions.
arXiv Detail & Related papers (2021-10-27T12:19:56Z)
- Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation [56.12499090935242]
This work proposes a Knowledge-Enhanced Hierarchical Graph Transformer Network (KHGT) to investigate multi-typed interactive patterns between users and items in recommender systems.
KHGT is built upon a graph-structured neural architecture to capture type-specific behavior characteristics.
We show that KHGT consistently outperforms many state-of-the-art recommendation methods across various evaluation settings.
arXiv Detail & Related papers (2021-10-08T09:44:00Z)
- Hyper Meta-Path Contrastive Learning for Multi-Behavior Recommendation [61.114580368455236]
User purchasing prediction with multi-behavior information remains a challenging problem for current recommendation systems.
We propose the concept of the hyper meta-path and construct hyper meta-paths or hyper meta-graphs to explicitly capture the dependencies among a user's different behaviors.
Thanks to the recent success of graph contrastive learning, we leverage it to learn embeddings of user behavior patterns adaptively instead of assigning a fixed scheme to understand the dependencies among different behaviors.
arXiv Detail & Related papers (2021-09-07T04:28:09Z)
- Modeling High-order Interactions across Multi-interests for Micro-video Recommendation [65.16624625748068]
We propose a Self-over-Co Attention module to enhance user's interest representation.
In particular, we first use co-attention to model correlation patterns across different levels and then use self-attention to model correlation patterns within a specific level.
arXiv Detail & Related papers (2021-04-01T07:20:15Z)
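The Self-over-Co Attention module described above, co-attention to model correlation patterns across levels followed by self-attention within a level, can be sketched roughly as below. This is a schematic single-head, unmasked version under my own assumptions about the module's structure; the paper's actual parameterization (projections, heads, gating) is not reproduced here.

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention (single head, no mask, no projections)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def self_over_co(level_a, level_b):
    """Co-attention across two interest levels, then self-attention within
    the co-attended result. (Hypothetical simplification of Self-over-Co.)"""
    co = attention(level_a, level_b, level_b)  # cross-level correlations
    return attention(co, co, co)               # within-level refinement
```

The ordering is the point: the cross-level step first mixes information between interest levels, and only then does the self-attention step model correlations within the resulting representation.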
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.