Attention-based sequential recommendation system using multimodal data
- URL: http://arxiv.org/abs/2405.17959v1
- Date: Tue, 28 May 2024 08:41:05 GMT
- Title: Attention-based sequential recommendation system using multimodal data
- Authors: Hyungtaik Oh, Wonkeun Jo, Dongil Kim,
- Abstract summary: We propose an attention-based sequential recommendation method that employs multimodal data of items such as images, texts, and categories.
The experimental results obtained from the Amazon datasets show that the proposed method outperforms those of conventional sequential recommendation systems.
- Score: 8.110978727364397
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sequential recommendation systems that model dynamic preferences based on a use's past behavior are crucial to e-commerce. Recent studies on these systems have considered various types of information such as images and texts. However, multimodal data have not yet been utilized directly to recommend products to users. In this study, we propose an attention-based sequential recommendation method that employs multimodal data of items such as images, texts, and categories. First, we extract image and text features from pre-trained VGG and BERT and convert categories into multi-labeled forms. Subsequently, attention operations are performed independent of the item sequence and multimodal representations. Finally, the individual attention information is integrated through an attention fusion function. In addition, we apply multitask learning loss for each modality to improve the generalization performance. The experimental results obtained from the Amazon datasets show that the proposed method outperforms those of conventional sequential recommendation systems.
Related papers
- BiVRec: Bidirectional View-based Multimodal Sequential Recommendation [55.87443627659778]
We propose an innovative framework, BivRec, that jointly trains the recommendation tasks in both ID and multimodal views.
BivRec achieves state-of-the-art performance on five datasets and showcases various practical advantages.
arXiv Detail & Related papers (2024-02-27T09:10:41Z) - Exploiting Modality-Specific Features For Multi-Modal Manipulation
Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z) - MISSRec: Pre-training and Transferring Multi-modal Interest-aware
Sequence Representation for Recommendation [61.45986275328629]
We propose MISSRec, a multi-modal pre-training and transfer learning framework for sequential recommendation.
On the user side, we design a Transformer-based encoder-decoder model, where the contextual encoder learns to capture the sequence-level multi-modal user interests.
On the candidate item side, we adopt a dynamic fusion module to produce user-adaptive item representation.
arXiv Detail & Related papers (2023-08-22T04:06:56Z) - MM-GEF: Multi-modal representation meet collaborative filtering [51.04679619309803]
We propose a graph-based item structure enhancement method MM-GEF: Multi-Modal recommendation with Graph Early-Fusion.
MM-GEF learns refined item representations by injecting structural information obtained from both multi-modal and collaborative signals.
arXiv Detail & Related papers (2023-08-14T15:47:36Z) - Multi-Modal Recommendation System with Auxiliary Information [0.47267770920095536]
This study extends the existing research by evaluating a multi-modal recommendation system that exploits the inclusion of comprehensive auxiliary knowledge related to an item.
The analysis of the experimental results shows a statistically significant improvement in prediction accuracy, which confirms the effectiveness of including auxiliary information in a context-aware recommendation system.
arXiv Detail & Related papers (2022-10-13T20:20:49Z) - A Sequence-Aware Recommendation Method Based on Complex Networks [1.385805101975528]
We build a network model from data and then use it to predict the user's subsequent actions.
The proposed method is implemented and tested experimentally on a large dataset.
arXiv Detail & Related papers (2022-09-30T16:34:39Z) - Multi-Behavior Sequential Recommendation with Temporal Graph Transformer [66.10169268762014]
We tackle the dynamic user-item relation learning with the awareness of multi-behavior interactive patterns.
We propose a new Temporal Graph Transformer (TGT) recommendation framework to jointly capture dynamic short-term and long-range user-item interactive patterns.
arXiv Detail & Related papers (2022-06-06T15:42:54Z) - Knowledge-Enhanced Hierarchical Graph Transformer Network for
Multi-Behavior Recommendation [56.12499090935242]
This work proposes a Knowledge-Enhanced Hierarchical Graph Transformer Network (KHGT) to investigate multi-typed interactive patterns between users and items in recommender systems.
KHGT is built upon a graph-structured neural architecture to capture type-specific behavior characteristics.
We show that KHGT consistently outperforms many state-of-the-art recommendation methods across various evaluation settings.
arXiv Detail & Related papers (2021-10-08T09:44:00Z) - Mining Latent Structures for Multimedia Recommendation [46.70109406399858]
We propose a LATent sTructure mining method for multImodal reCommEndation, which we term LATTICE for brevity.
We learn item-item structures for each modality and aggregates multiple modalities to obtain latent item graphs.
Based on the learned latent graphs, we perform graph convolutions to explicitly inject high-order item affinities into item representations.
arXiv Detail & Related papers (2021-04-19T03:50:24Z) - Diversity Regularized Interests Modeling for Recommender Systems [25.339169652217844]
We propose a novel method of Diversity Regularized Interests Modeling (DRIM) for Recommender Systems.
Each interest of the user should have a certain degree of distinction, thus we introduce three strategies as the diversity regularized separator to separate multiple user interest vectors.
arXiv Detail & Related papers (2021-03-23T09:10:37Z) - Large Scale Multimodal Classification Using an Ensemble of Transformer
Models and Co-Attention [2.842794675894731]
We describe our methodology and results for the SIGIR eCom Rakuten Data Challenge.
We employ a dual attention technique to model image-text relationships using pretrained language and image embeddings.
arXiv Detail & Related papers (2020-11-23T21:22:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.