A Vlogger-augmented Graph Neural Network Model for Micro-video Recommendation
- URL: http://arxiv.org/abs/2405.18260v1
- Date: Tue, 28 May 2024 15:13:29 GMT
- Title: A Vlogger-augmented Graph Neural Network Model for Micro-video Recommendation
- Authors: Weijiang Lai, Beihong Jin, Beibei Li, Yiyuan Zheng, Rui Zhao,
- Abstract summary: We propose a vlogger-augmented graph neural network model VA-GNN, which takes the effect of vloggers into consideration.
Specifically, we construct a tripartite graph with users, micro-videos, and vloggers as nodes, capturing user preferences from different views.
When predicting the next user-video interaction, we adaptively combine the user preferences for a video itself and its vlogger.
- Score: 7.54949302096348
- Abstract: Existing micro-video recommendation models exploit the interactions between users and micro-videos and/or multi-modal information of micro-videos to predict the next micro-video a user will watch, ignoring the information related to vloggers, i.e., the producers of micro-videos. However, in micro-video scenarios, vloggers play a significant role in user-video interactions, since vloggers generally focus on specific topics and users tend to follow the vloggers they are interested in. Therefore, in this paper, we propose a vlogger-augmented graph neural network model, VA-GNN, which takes the effect of vloggers into consideration. Specifically, we construct a tripartite graph with users, micro-videos, and vloggers as nodes, capturing user preferences from different views, i.e., the video-view and the vlogger-view. Moreover, we conduct cross-view contrastive learning to keep the consistency between node embeddings from the two different views. In addition, when predicting the next user-video interaction, we adaptively combine the user preferences for a video itself and its vlogger. We conduct extensive experiments on two real-world datasets. The experimental results show that VA-GNN outperforms multiple existing GNN-based recommendation models.
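Below is a minimal numpy sketch of the two ideas the abstract highlights: one round of message passing over a user-video-vlogger tripartite graph, and an adaptive, gated combination of the video-view and vlogger-view preferences at prediction time. All sizes, the random interactions, and the sigmoid gate are illustrative assumptions, not the paper's learned architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration.
n_users, n_videos, n_vloggers, d = 4, 6, 3, 8

# Assumed binary interactions: user-video clicks; each video has one vlogger.
R_uv = rng.integers(0, 2, size=(n_users, n_videos)).astype(float)
A_vb = np.eye(n_vloggers)[rng.integers(0, n_vloggers, size=n_videos)]
R_ub = (R_uv @ A_vb > 0).astype(float)  # implied user-vlogger links

U = rng.normal(size=(n_users, d))      # user embeddings
V = rng.normal(size=(n_videos, d))     # video embeddings
B = rng.normal(size=(n_vloggers, d))   # vlogger embeddings

def mean_agg(M):
    """Row-normalized adjacency for simple mean aggregation."""
    return M / np.maximum(M.sum(axis=1, keepdims=True), 1.0)

# One round of message passing on the tripartite graph gives a user
# representation per view: video-view and vlogger-view.
U_video = U + mean_agg(R_uv) @ V
U_vlogger = U + mean_agg(R_ub) @ B

def score(u, v):
    """Adaptively combine the preference for the video and for its vlogger.
    The sigmoid gate is an assumption; the paper learns the combination."""
    b = A_vb[v].argmax()
    s_video = U_video[u] @ V[v]
    s_vlogger = U_vlogger[u] @ B[b]
    gate = 1.0 / (1.0 + np.exp(-(s_video - s_vlogger)))
    return gate * s_video + (1.0 - gate) * s_vlogger

print(round(score(0, 1), 3))
```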
Related papers
- Vlogger: Make Your Dream A Vlog [67.50445251570173]
Vlogger is a generic AI system for generating a minute-level video blog (i.e., vlog) of user descriptions.
We invoke various foundation models to play the critical roles of vlog professionals, including (1) Script, (2) Actor, (3) ShowMaker, and (4) Voicer.
Vlogger can generate over 5-minute vlogs from open-world descriptions, without loss of video coherence on script and actor.
arXiv Detail & Related papers (2024-01-17T18:55:12Z)
- Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation [56.23157334014773]
85.7% of micro-videos lack annotation.
Existing methods mostly focus on analyzing video content, neglecting users' social influence and tag relation.
We formulate micro-video tagging as a link prediction problem in a constructed heterogeneous network.
arXiv Detail & Related papers (2023-03-15T02:13:34Z)
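As a rough illustration of the tagging-as-link-prediction idea in the entry above, the sketch below scores candidate video-tag links by fusing a content affinity with a social prior from the uploader's followed users. Every quantity here is a random stand-in; the paper's heterogeneous network and training procedure are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)
n_videos, n_tags, n_users, d = 5, 7, 4, 8  # hypothetical sizes

# Random stand-ins for node features and edges of the heterogeneous network.
video_emb = rng.normal(size=(n_videos, d))
tag_emb = rng.normal(size=(n_tags, d))
follows = rng.integers(0, 2, size=(n_users, n_users)).astype(float)   # social edges
user_tags = rng.integers(0, 2, size=(n_users, n_tags)).astype(float)  # tag-usage history
uploader = rng.integers(0, n_users, size=n_videos)                    # video -> uploader

def tag_scores(v):
    """Score video-tag links: content affinity plus a social prior
    from the tag usage of users the uploader follows."""
    u = uploader[v]
    neigh = follows[u] / max(follows[u].sum(), 1.0)  # normalized neighborhood
    social_prior = neigh @ user_tags                 # (n_tags,)
    content = video_emb[v] @ tag_emb.T               # (n_tags,)
    return content + social_prior

print(tag_scores(0).argsort()[::-1][:3])  # indices of top-3 predicted tags
```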
- Towards Micro-video Thumbnail Selection via a Multi-label Visual-semantic Embedding Model [0.0]
The thumbnail, as the first sight of a micro-video, plays a pivotal role in attracting users to click and watch.
We present a multi-label visual-semantic embedding model to estimate the similarity between each frame and the popular topics that users are interested in.
We fuse the visual representation score and the popularity score of each frame to select the attractive thumbnail for the given micro-video.
arXiv Detail & Related papers (2022-02-07T04:15:26Z)
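The frame-scoring recipe in the entry above can be sketched as follows: each frame gets a visual-semantic score against popular topics plus a popularity score, and the frame with the best fused score becomes the thumbnail. The fusion weight alpha and all features are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_frames, n_topics, d = 10, 5, 8  # hypothetical sizes

# Frames and popular topics embedded in a shared visual-semantic space
# (random stand-ins for the learned embeddings).
frame_emb = rng.normal(size=(n_frames, d))
topic_emb = rng.normal(size=(n_topics, d))
topic_pop = rng.random(n_topics)  # popularity of each topic

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pick_thumbnail(alpha=0.5):
    """Fuse a visual-semantic score and a popularity score per frame;
    alpha is a hypothetical fusion weight."""
    scores = []
    for f in frame_emb:
        sims = np.array([cosine(f, t) for t in topic_emb])
        visual = sims.max()            # best topic match
        popularity = sims @ topic_pop  # popularity-weighted match
        scores.append(alpha * visual + (1 - alpha) * popularity)
    return int(np.argmax(scores))

print(pick_thumbnail())
```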
- Concept-Aware Denoising Graph Neural Network for Micro-Video Recommendation [30.67251766249372]
We propose a novel concept-aware denoising graph neural network (named CONDE) for micro-video recommendation.
The proposed CONDE achieves significantly better recommendation performance than the existing state-of-the-art solutions.
arXiv Detail & Related papers (2021-09-28T07:02:52Z)
- A Behavior-aware Graph Convolution Network Model for Video Recommendation [9.589431810005774]
We present a model named Sagittarius to capture the influence between users and videos.
Sagittarius differentiates among user behaviors by weighting them.
It then fuses the semantics of user behaviors into the embeddings of users and videos.
arXiv Detail & Related papers (2021-06-27T08:24:45Z)
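A toy version of the behavior-weighting idea in the entry above: each behavior type gets its own interaction matrix and weight, and the weighted interactions are fused into user embeddings. The behavior types and weights here are fixed by hand, whereas Sagittarius learns its weighting.

```python
import numpy as np

rng = np.random.default_rng(3)
n_users, n_videos, d = 4, 6, 8  # hypothetical sizes

# One interaction matrix per behavior type; the weights are illustrative.
behavior_weights = {"click": 1.0, "like": 2.0, "share": 3.0}
interactions = {b: rng.integers(0, 2, size=(n_users, n_videos)).astype(float)
                for b in behavior_weights}

video_emb = rng.normal(size=(n_videos, d))

# Fuse behavior semantics into user embeddings: videos touched through
# stronger behaviors contribute more to the aggregate.
weighted = sum(w * interactions[b] for b, w in behavior_weights.items())
deg = np.maximum(weighted.sum(axis=1, keepdims=True), 1.0)
user_emb = (weighted / deg) @ video_emb

print(user_emb.shape)  # (4, 8)
```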
- Multiview Pseudo-Labeling for Semi-supervised Learning from Video [102.36355560553402]
We present a novel framework that uses complementary views in the form of appearance and motion information for semi-supervised learning in video.
Our method capitalizes on multiple views, but it nonetheless trains a model that is shared across appearance and motion input.
On multiple video recognition datasets, our method substantially outperforms its supervised counterpart, and compares favorably to previous work on standard benchmarks in self-supervised video representation learning.
arXiv Detail & Related papers (2021-04-01T17:59:48Z)
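The multiview pseudo-labeling idea in the entry above, in miniature: a single classifier shared across appearance and motion features produces predictions for unlabeled clips, and the averaged, confident predictions become pseudo-labels. The confidence threshold and random features are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n_unlabeled, n_classes, d = 8, 5, 16  # hypothetical sizes

# Two complementary views of the same unlabeled clips (random stand-ins
# for appearance and motion features).
appearance = rng.normal(size=(n_unlabeled, d))
motion = rng.normal(size=(n_unlabeled, d))

W = rng.normal(size=(d, n_classes))  # one classifier shared across views

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Average the shared model's predictions over both views and keep only
# confident ones as pseudo-labels (threshold is an assumed hyperparameter).
probs = 0.5 * (softmax(appearance @ W) + softmax(motion @ W))
confident = probs.max(axis=1) > 0.4
pseudo_labels = probs.argmax(axis=1)[confident]
print(pseudo_labels)
```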
- Modeling High-order Interactions across Multi-interests for Micro-video Recommendation [65.16624625748068]
We propose a Self-over-Co Attention module to enhance the user's interest representation.
In particular, we first use co-attention to model correlation patterns across different levels and then use self-attention to model correlation patterns within a specific level.
arXiv Detail & Related papers (2021-04-01T07:20:15Z)
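A compact sketch of the Self-over-Co ordering described in the entry above: co-attention first correlates one interest level with another, then self-attention models correlations within the refined level. Both attention functions are plain scaled dot-product stand-ins for the paper's module, and the two levels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d = 6, 8  # items per interest level, embedding size (hypothetical)

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(X, Y):
    """Cross-level correlation: re-weight Y's rows by affinity with X."""
    affinity = softmax(X @ Y.T / np.sqrt(d))  # (n, n)
    return affinity @ Y

def self_attention(X):
    """Within-level correlation: scaled dot-product attention over X."""
    attn = softmax(X @ X.T / np.sqrt(d))
    return attn @ X

# Two hypothetical interest levels for one user.
level_a = rng.normal(size=(n, d))
level_b = rng.normal(size=(n, d))

# Per the summary: co-attention across levels first, then self-attention
# within the resulting level representation.
refined = self_attention(co_attention(level_a, level_b))
print(refined.shape)  # (6, 8)
```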
- Predicting the Popularity of Micro-videos with Multimodal Variational Encoder-Decoder Framework [54.194340961353944]
We propose a multimodal variational encoder-decoder (MMVED) framework for micro-video popularity prediction tasks.
MMVED learns a prediction embedding of a micro-video that is informative to its popularity level.
Experiments conducted on a public dataset and a dataset we collect from Xigua demonstrate the effectiveness of the proposed MMVED framework.
arXiv Detail & Related papers (2020-03-28T06:08:16Z)
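The variational encoder-decoder idea in the entry above can be miniaturized as follows: fused multimodal features are encoded to a Gaussian latent, the popularity embedding is sampled via the reparameterization trick, and a decoder head regresses popularity. Weights and feature sizes are random stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(6)
d_visual, d_text, d_latent = 10, 6, 4  # hypothetical sizes

# Per-modality features for one micro-video (random stand-ins).
x = np.concatenate([rng.normal(size=d_visual), rng.normal(size=d_text)])

# Variational encoder: map fused modalities to a Gaussian over the
# latent embedding (weights would normally be learned).
W_mu = rng.normal(size=(x.size, d_latent)) * 0.1
W_logvar = rng.normal(size=(x.size, d_latent)) * 0.1
mu, logvar = x @ W_mu, x @ W_logvar

# Reparameterization trick: sample the stochastic embedding.
z = mu + np.exp(0.5 * logvar) * rng.normal(size=d_latent)

# Decoder head regresses a popularity level from the sample.
w_dec = rng.normal(size=d_latent)
print(float(z @ w_dec))
```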
- Multimodal Matching Transformer for Live Commenting [97.06576354830736]
Automatic live commenting aims to provide real-time comments on videos for viewers.
Recent work on this task adopts encoder-decoder models to generate comments.
We propose a multimodal matching transformer to capture the relationships among comments, vision, and audio.
arXiv Detail & Related papers (2020-02-07T07:19:15Z)
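As a rough sketch of the multimodal matching idea in the entry above, the comment tokens attend over vision and audio features, and the pooled matched representation scores the (comment, video) pair. The attention here is generic scaled dot-product, not the paper's exact transformer, and all features are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(7)
n_tokens, n_frames, n_segments, d = 5, 7, 7, 8  # hypothetical sizes

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(Q, K):
    """One cross-modal matching step: queries attend over another modality."""
    return softmax(Q @ K.T / np.sqrt(d)) @ K

# Random stand-ins for one clip's comment tokens, video frames, and audio.
comment = rng.normal(size=(n_tokens, d))
vision = rng.normal(size=(n_frames, d))
audio = rng.normal(size=(n_segments, d))

# Match the candidate comment against both modalities, then score the
# (comment, video) pair with the pooled matched representation.
matched = cross_attend(comment, vision) + cross_attend(comment, audio)
print(float(matched.mean(axis=0) @ comment.mean(axis=0)))
```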
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.