Modeling High-order Interactions across Multi-interests for Micro-video
Recommendation
- URL: http://arxiv.org/abs/2104.00305v1
- Date: Thu, 1 Apr 2021 07:20:15 GMT
- Title: Modeling High-order Interactions across Multi-interests for Micro-video
Recommendation
- Authors: Dong Yao, Shengyu Zhang, Zhou Zhao, Wenyan Fan, Jieming Zhu, Xiuqiang
He, Fei Wu
- Abstract summary: We propose a Self-over-Co Attention module to enhance the user's interest representation.
In particular, we first use co-attention to model correlation patterns across different levels and then use self-attention to model correlation patterns within a specific level.
- Score: 65.16624625748068
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Personalized recommendation systems have become pervasive on various
video platforms. Many effective methods have been proposed, but most of them
fail to capture the user's multi-level interest traits and the dependencies
between the micro-videos they have viewed. To solve these problems, we propose
a Self-over-Co Attention module to enhance the user's interest representation.
In particular, we
first use co-attention to model correlation patterns across different levels
and then use self-attention to model correlation patterns within a specific
level. Experimental results on filtered public datasets verify the
effectiveness of the proposed module.
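To make the two-stage pattern concrete, below is a minimal numpy sketch of co-attention across two interest levels followed by self-attention within one level. The level names (item_level, category_level), dimensions, and the residual combination are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(a, b):
    """Cross-level attention: each row of `a` attends over the rows of `b`."""
    affinity = a @ b.T / np.sqrt(a.shape[-1])  # (n, m) cross-level correlations
    return softmax(affinity, axis=-1) @ b      # (n, d) b-enhanced view of a

def self_attention(x):
    """Within-level attention: rows of `x` attend over each other."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x

rng = np.random.default_rng(0)
item_level = rng.normal(size=(5, 16))      # hypothetical item-level interests
category_level = rng.normal(size=(3, 16))  # hypothetical category-level interests

# Stage 1: co-attention across levels; Stage 2: self-attention within a level.
item_enhanced = co_attention(item_level, category_level)
user_interest = self_attention(item_level + item_enhanced)
print(user_interest.shape)  # (5, 16)
```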
Related papers
- MUFM: A Mamba-Enhanced Feedback Model for Micro Video Popularity Prediction [1.7040391128945196]
We introduce a framework for capturing long-term dependencies in user feedback and dynamic event interactions.
Our experiments on the large-scale open-source multi-modal dataset show that our model significantly outperforms state-of-the-art approaches by 23.2%.
arXiv Detail & Related papers (2024-11-23T05:13:27Z)
- DiffMM: Multi-Modal Diffusion Model for Recommendation [19.43775593283657]
We propose a novel multi-modal graph diffusion model for recommendation called DiffMM.
Our framework integrates a modality-aware graph diffusion model with a cross-modal contrastive learning paradigm to improve modality-aware user representation learning.
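As a rough sketch of the cross-modal contrastive piece (not DiffMM's actual code), the toy InfoNCE-style loss below pulls matched modality-specific user embeddings together and pushes mismatched ones apart; the embedding names and the temperature are assumptions.

```python
import numpy as np

def info_nce(anchor, positive, temperature=0.2):
    """InfoNCE-style loss: the i-th anchor should match the i-th positive."""
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    p = positive / np.linalg.norm(positive, axis=1, keepdims=True)
    logits = a @ p.T / temperature  # (batch, batch) pairwise similarities
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_probs)))  # matched pairs on the diagonal

rng = np.random.default_rng(0)
visual_user = rng.normal(size=(8, 32))   # hypothetical visual-modality user embeddings
textual_user = rng.normal(size=(8, 32))  # hypothetical textual-modality user embeddings
print(info_nce(visual_user, textual_user))
```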
arXiv Detail & Related papers (2024-06-17T17:35:54Z)
- Coarse-to-Fine Knowledge-Enhanced Multi-Interest Learning Framework for
Multi-Behavior Recommendation [52.89816309759537]
Multiple types of behaviors (e.g., clicking, adding to cart, purchasing) widely exist in most real-world recommendation scenarios.
State-of-the-art multi-behavior models learn behavior dependencies without distinguishing between behavior types, taking all historical interactions as input.
We propose a novel Coarse-to-fine Knowledge-enhanced Multi-interest Learning framework to learn shared and behavior-specific interests for different behaviors.
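One hedged reading of shared versus behavior-specific interests is a simple additive decomposition, sketched below; the behavior names, the scalar mixing weight, and the dot-product scoring are assumptions, and the framework's knowledge-enhanced design is far richer.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
shared_interest = rng.normal(size=d)  # interest component common to all behaviors
specific = {b: rng.normal(size=d)     # one residual component per behavior type
            for b in ("click", "cart", "purchase")}

def behavior_interest(behavior, alpha=0.5):
    """User interest under one behavior = shared part + behavior-specific part."""
    return shared_interest + alpha * specific[behavior]

item = rng.normal(size=d)
for b in specific:
    print(b, float(behavior_interest(b) @ item))  # per-behavior preference score
```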
arXiv Detail & Related papers (2022-08-03T05:28:14Z)
- Contrastive Meta Learning with Behavior Multiplicity for Recommendation [42.15990960863924]
A well-informed recommendation framework could not only help users identify their interested items, but also benefit the revenue of various online platforms.
We propose Contrastive Meta Learning (CML) to maintain dedicated cross-type behavior dependency for different users.
Our method consistently outperforms various state-of-the-art recommendation methods.
arXiv Detail & Related papers (2022-02-17T08:51:24Z)
- Multiple Interest and Fine Granularity Network for User Modeling [3.508126539399186]
User modeling plays a fundamental role in industrial recommender systems, in both the matching stage and the ranking stage, in terms of both the customer experience and business revenue.
Most existing deep learning-based approaches exploit item-ids and category-ids but neglect fine-grained features like color and material, which hinders modeling the fine granularity of users' interests.
We present the Multiple interest and Fine granularity Network (MFN), which tackles users' multiple and fine-grained interests and constructs the model from both the similarity relationship and the combination relationship among the users' multiple interests.
arXiv Detail & Related papers (2021-12-05T15:12:08Z)
- Perceptual Score: What Data Modalities Does Your Model Perceive? [73.75255606437808]
We introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features.
We find that recent, more accurate multi-modal models for visual question-answering tend to perceive the visual data less than their predecessors.
Using the perceptual score also helps to analyze model biases by decomposing the score into data subset contributions.
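A hedged sketch of the idea: estimate how much a model relies on one modality by permuting that modality across the batch and measuring the accuracy drop. The toy model and data are made up, and the paper's exact protocol may differ.

```python
import numpy as np

def perceptual_score(model, mod_a, mod_b, labels, n_shuffles=10, seed=0):
    """Mean accuracy drop when `mod_a` is permuted across the batch,
    breaking its pairing with `mod_b` and the labels."""
    rng = np.random.default_rng(seed)
    base = np.mean(model(mod_a, mod_b) == labels)
    drops = []
    for _ in range(n_shuffles):
        perm = rng.permutation(len(mod_a))
        drops.append(base - np.mean(model(mod_a[perm], mod_b) == labels))
    return float(np.mean(drops))  # near 0 means `mod_a` is barely perceived

# Toy model that ignores modality A entirely: its score should be ~0.
model = lambda a, b: (b.sum(axis=1) > 0).astype(int)
rng = np.random.default_rng(1)
a, b = rng.normal(size=(64, 4)), rng.normal(size=(64, 4))
labels = (b.sum(axis=1) > 0).astype(int)
print(perceptual_score(model, a, b, labels))  # 0.0
```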
arXiv Detail & Related papers (2021-10-27T12:19:56Z)
- Learning User Representations with Hypercuboids for Recommender Systems [26.80987554753327]
Our model explicitly models user interests as a hypercuboid instead of a point in the space.
We present two variants of hypercuboids to enhance the capability in capturing the diversities of user interests.
A neural architecture is also proposed to facilitate user hypercuboid learning by capturing the activity sequences (e.g., buy and rate) of users.
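A minimal sketch of the geometric intuition, assuming an axis-aligned box parameterized by a center and a per-dimension offset; the paper's hypercuboid variants and the sequence-encoding architecture are not reproduced here.

```python
import numpy as np

def box_score(item, center, offset):
    """Score = negative distance from an item to an axis-aligned hypercuboid.
    Items inside the box (within center +/- offset per dimension) score 0."""
    outside = np.maximum(np.abs(item - center) - offset, 0.0)
    return -float(np.linalg.norm(outside))

rng = np.random.default_rng(0)
center = rng.normal(size=8)          # hypothetical user-box center
offset = np.abs(rng.normal(size=8))  # hypothetical per-dimension extents
print(box_score(center + 0.5 * offset, center, offset))  # zero distance: inside
print(box_score(center + 5.0 * offset, center, offset))  # negative: outside
```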
arXiv Detail & Related papers (2020-11-11T12:50:00Z)
- Bayesian Attention Modules [65.52970388117923]
We propose a scalable version of attention that is easy to implement and optimize.
Our experiments show the proposed method brings consistent improvements over the corresponding baselines.
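A hedged sketch of the general idea, assuming Lognormal noise injected into the attention logits via the reparameterization trick (the paper's exact distributions and variational objective are not reproduced): attention weights become samples rather than deterministic values.

```python
import numpy as np

def stochastic_attention(q, k, v, sigma=0.5, seed=0):
    """Attention whose weights are sampled: unnormalized weights come from a
    reparameterized Lognormal centered on the usual scaled dot-product logits."""
    rng = np.random.default_rng(seed)
    logits = q @ k.T / np.sqrt(q.shape[-1])
    eps = rng.normal(size=logits.shape)  # reparameterization trick
    w = np.exp(logits + sigma * eps)     # Lognormal sample per weight
    w /= w.sum(axis=-1, keepdims=True)   # normalize rows to distributions
    return w @ v

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=s) for s in ((4, 8), (6, 8), (6, 8)))
print(stochastic_attention(q, k, v).shape)  # (4, 8); resample to gauge uncertainty
```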
arXiv Detail & Related papers (2020-10-20T20:30:55Z)
- Relating by Contrasting: A Data-efficient Framework for Multimodal
Generative Models [86.9292779620645]
We develop a contrastive framework for generative model learning, allowing us to train the model not just by the commonality between modalities, but by the distinction between "related" and "unrelated" multimodal data.
Under our proposed framework, the generative model can accurately identify related samples from unrelated ones, making it possible to make use of the plentiful unlabeled, unpaired multimodal data.
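A toy rendering of the related-versus-unrelated discrimination, assuming a binary contrastive loss in which batch-shuffled pairs act as the unrelated negatives; the real framework defines this contrast over a multimodal generative model rather than raw embeddings.

```python
import numpy as np

def relatedness_loss(img_emb, txt_emb):
    """Paired (related) image/text embeddings should score higher than
    shuffled (unrelated) pairings."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    pos = np.sum(img_emb * txt_emb, axis=1)                      # related pairs
    neg = np.sum(img_emb * np.roll(txt_emb, 1, axis=0), axis=1)  # unrelated pairs
    return -float(np.mean(np.log(sigmoid(pos)) + np.log(sigmoid(-neg))))

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 16))
txt = img + 0.1 * rng.normal(size=(8, 16))  # correlated, i.e., "related"
print(relatedness_loss(img, txt))
```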
arXiv Detail & Related papers (2020-07-02T15:08:11Z)
- Predicting the Popularity of Micro-videos with Multimodal Variational
Encoder-Decoder Framework [54.194340961353944]
We propose a multimodal variational encoder-decoder framework for micro-video popularity prediction.
MMVED learns a prediction embedding of a micro-video that is informative to its popularity level.
Experiments conducted on a public dataset and a dataset we collect from Xigua demonstrate the effectiveness of the proposed MMVED framework.
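A minimal sketch of the variational-embedding idea, assuming a linear Gaussian encoder over fused multimodal features and a linear decoder to a popularity score; MMVED's actual encoders, decoder, and training objective are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_z = 32, 8
W_mu = 0.1 * rng.normal(size=(d_in, d_z))  # hypothetical encoder weights (mean)
W_lv = 0.1 * rng.normal(size=(d_in, d_z))  # hypothetical encoder weights (log-var)
w_dec = 0.1 * rng.normal(size=d_z)         # hypothetical decoder weights

def predict_popularity(features):
    """Encode features into a stochastic embedding, then decode a popularity
    score; sampling injects uncertainty into the learned embedding."""
    mu, logvar = features @ W_mu, features @ W_lv
    z = mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)  # reparameterize
    return z @ w_dec  # one scalar popularity score per micro-video

x = rng.normal(size=(4, d_in))  # fused visual/acoustic/textual features (toy)
print(predict_popularity(x))
```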
arXiv Detail & Related papers (2020-03-28T06:08:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.