Enhancing Live Broadcast Engagement: A Multi-modal Approach to Short Video Recommendations Using MMGCN and User Preferences
- URL: http://arxiv.org/abs/2506.23085v1
- Date: Sun, 29 Jun 2025 04:50:52 GMT
- Title: Enhancing Live Broadcast Engagement: A Multi-modal Approach to Short Video Recommendations Using MMGCN and User Preferences
- Authors: Saeid Aghasoleymani Najafabadi
- Abstract summary: This paper develops a short video recommendation system that incorporates Multi-modal Graph Convolutional Networks (MMGCN) with user preferences. To provide personalized recommendations tailored to individual interests, the proposed system takes into account user interaction data, video content features, and contextual information. Three datasets are used to evaluate the effectiveness of the system: Kwai, TikTok, and MovieLens.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The purpose of this paper is to explore a multi-modal approach to enhancing live broadcast engagement by developing a short video recommendation system that incorporates Multi-modal Graph Convolutional Networks (MMGCN) with user preferences. To provide personalized recommendations tailored to individual interests, the proposed system takes into account user interaction data, video content features, and contextual information. With the aid of a hybrid approach combining collaborative filtering and content-based filtering techniques, the system is able to capture nuanced relationships between users, video attributes, and engagement patterns. Three datasets are used to evaluate the effectiveness of the system: Kwai, TikTok, and MovieLens. Compared to baseline models, such as DeepFM, Wide & Deep, LightGBM, and XGBoost, the proposed MMGCN-based model shows superior performance. Notably, the proposed model outperforms all baseline methods in capturing diverse user preferences and making accurate, personalized recommendations, achieving an F1 score of 0.574 on Kwai, 0.506 on TikTok, and 0.197 on MovieLens. These results underscore the importance of multi-modal integration and user-centric approaches in advancing recommender systems, and the role they play in enhancing content discovery and audience interaction on live broadcast platforms.
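The paper itself includes no code, so the following is only a minimal sketch of the core idea the abstract describes: one light graph convolution per modality over the user-item graph, with the modality-specific embeddings fused afterwards. The class names, dimensions, and sum-fusion choice are illustrative assumptions, not the authors' implementation.

```python
# Minimal MMGCN-style sketch (assumed structure, not the authors' code):
# one graph convolution per modality over the user-item graph, then fusion.
import torch
import torch.nn as nn

class ModalityGCN(nn.Module):
    def __init__(self, feat_dim, embed_dim):
        super().__init__()
        self.proj = nn.Linear(feat_dim, embed_dim)

    def forward(self, norm_adj, feats):
        # norm_adj: (N, N) symmetrically normalized user-item adjacency
        # feats:    (N, feat_dim) modality features (zeros where unavailable)
        h = torch.relu(self.proj(feats))
        return norm_adj @ h          # one round of neighborhood aggregation

class MMGCNSketch(nn.Module):
    def __init__(self, modality_dims, embed_dim=64):
        super().__init__()
        self.branches = nn.ModuleList(
            ModalityGCN(d, embed_dim) for d in modality_dims)

    def forward(self, norm_adj, modality_feats):
        # Fuse modality-specific node embeddings by summation (one simple choice).
        return torch.stack([b(norm_adj, f) for b, f in
                            zip(self.branches, modality_feats)]).sum(dim=0)

# Usage: score user u against item i with a dot product of fused embeddings:
# emb = model(norm_adj, [visual_feats, acoustic_feats, textual_feats])
# score = emb[u] @ emb[i]
```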
Related papers
- CLaMR: Contextualized Late-Interaction for Multimodal Content Retrieval [70.9990850395981]
We introduce CLaMR, a multimodal, late-interaction retriever that jointly indexes 4 modalities: video frames, transcribed speech, on-screen text, and metadata. CLaMR is trained to enhance dynamic modality selection via two key innovations. A minimal sketch of late-interaction scoring follows below.
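"Late interaction" here presumably means ColBERT-style MaxSim scoring over per-modality token embeddings; the sketch below shows only that scoring rule, with all tensor shapes assumed.

```python
# Late-interaction (MaxSim) scoring across modality token embeddings,
# in the spirit of CLaMR; shapes and pooling choices are assumptions.
import torch

def late_interaction_score(query_tok, doc_tok):
    # query_tok: (Q, D) query token embeddings
    # doc_tok:   (T, D) concatenated tokens from video frames, speech,
    #            on-screen text, and metadata
    sim = query_tok @ doc_tok.T          # (Q, T) token-level similarities
    return sim.max(dim=1).values.sum()   # MaxSim per query token, then sum

q = torch.randn(8, 128)
d = torch.randn(200, 128)
print(late_interaction_score(q, d))
```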
arXiv Detail & Related papers (2025-06-06T15:02:30Z)
- Enhancing User Intent for Recommendation Systems via Large Language Models [0.0]
DUIP is a novel framework that combines LSTM networks with Large Language Models (LLMs) to dynamically capture user intent and generate personalized item recommendations. Our findings suggest that DUIP is a promising approach for next-generation recommendation systems, with potential for further improvements in cross-modal recommendations and scalability. A rough sketch of the LSTM-plus-LLM pipeline follows below.
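One plausible reading of the summary is: an LSTM summarizes the interaction sequence into an intent vector, which then drives candidate retrieval and an LLM re-ranking prompt. Everything here, including the `call_llm` stand-in, is a hypothetical illustration rather than the DUIP implementation.

```python
# Hypothetical LSTM-intent + LLM pipeline in the spirit of the DUIP summary.
import torch
import torch.nn as nn

class IntentEncoder(nn.Module):
    def __init__(self, n_items, embed_dim=64, hidden=128):
        super().__init__()
        self.item_emb = nn.Embedding(n_items, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, embed_dim)   # project back to item space

    def forward(self, item_ids):                   # item_ids: (B, seq_len)
        _, (h, _) = self.lstm(self.item_emb(item_ids))
        return self.head(h[-1])                    # (B, embed_dim) intent vectors

def call_llm(prompt: str) -> str:                  # hypothetical LLM endpoint
    raise NotImplementedError

def recommend(enc, history_ids, titles, k=5):
    intent = enc(history_ids)                          # (1, embed_dim)
    scores = intent @ enc.item_emb.weight.T            # similarity to all items
    top = scores.topk(k).indices[0].tolist()
    prompt = ("Candidates: " + ", ".join(titles[i] for i in top) +
              ". Re-rank these for the user and explain briefly.")
    return call_llm(prompt)
```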
arXiv Detail & Related papers (2025-01-18T20:35:03Z)
- RecLM: Recommendation Instruction Tuning [17.780484832381994]
We propose a model-agnostic recommendation instruction-tuning paradigm that seamlessly integrates large language models with collaborative filtering. Our proposed RecLM enhances the capture of user preference diversity through a carefully designed reinforcement learning reward function; a generic sketch of such a reward follows below.
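The abstract only names a "carefully designed reinforcement learning reward function" for preference diversity; a generic accuracy-plus-diversity reward might look like the following. The weighting and the mean-pairwise-distance diversity term are entirely assumptions, not the paper's actual reward.

```python
# Generic accuracy-plus-diversity slate reward of the kind the RecLM
# summary hints at (assumed form; requires k >= 2 recommended items).
import numpy as np

def reward(rec_embs, relevance, alpha=0.7):
    # rec_embs:  (k, d) embeddings of the recommended slate
    # relevance: (k,)   per-item relevance signals (e.g., click labels)
    acc = float(np.mean(relevance))
    normed = rec_embs / np.linalg.norm(rec_embs, axis=1, keepdims=True)
    sims = normed @ normed.T
    k = len(rec_embs)
    diversity = 1.0 - (sims.sum() - k) / (k * (k - 1))  # mean pairwise distance
    return alpha * acc + (1 - alpha) * diversity
```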
arXiv Detail & Related papers (2024-12-26T17:51:54Z)
- GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation [55.937636188796475]
GaVaMoE is a novel framework for explainable recommendation. It generates tailored explanations for specific user types and preferences, and exhibits robust performance in scenarios with sparse user-item interactions. A plain gated mixture-of-experts sketch follows below.
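For orientation, here is a plain gated mixture-of-experts layer; the Gaussian-variational user clustering that GaVaMoE adds on top is omitted, and all module shapes are assumptions.

```python
# Plain gated mixture-of-experts sketch (GaVaMoE's variational clustering
# component is not reproduced here).
import torch
import torch.nn as nn

class GatedMoE(nn.Module):
    def __init__(self, in_dim, out_dim, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))
            for _ in range(n_experts))
        self.gate = nn.Linear(in_dim, n_experts)

    def forward(self, x):                        # x: (B, in_dim) user-item features
        w = torch.softmax(self.gate(x), dim=-1)  # (B, E) soft expert assignment
        outs = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, out_dim)
        return (w.unsqueeze(-1) * outs).sum(dim=1)

moe = GatedMoE(32, 8)
print(moe(torch.randn(5, 32)).shape)  # torch.Size([5, 8])
```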
arXiv Detail & Related papers (2024-10-15T17:59:30Z)
- DiffMM: Multi-Modal Diffusion Model for Recommendation [19.43775593283657]
We propose a novel multi-modal graph diffusion model for recommendation called DiffMM.
Our framework integrates a modality-aware graph diffusion model with a cross-modal contrastive learning paradigm to improve modality-aware user representation learning.
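Cross-modal contrastive learning typically takes the symmetric InfoNCE form sketched below, where two modality views of the same user form a positive pair; the diffusion component of DiffMM is not reproduced here, and the temperature value is an assumption.

```python
# Symmetric InfoNCE between two modality views of the same users, the usual
# form of cross-modal contrastive learning.
import torch
import torch.nn.functional as F

def cross_modal_infonce(za, zb, tau=0.2):
    # za, zb: (B, D) user representations from two modality-specific views;
    # row i of za and row i of zb describe the same user (positive pair).
    za, zb = F.normalize(za, dim=1), F.normalize(zb, dim=1)
    logits = za @ zb.T / tau                  # (B, B) similarity matrix
    labels = torch.arange(za.size(0))
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.T, labels))
```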
arXiv Detail & Related papers (2024-06-17T17:35:54Z)
- A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation [77.42486522565295]
We propose a novel recommendation approach called LSVCR to jointly conduct personalized video and comment recommendation.
Our approach consists of two key components, namely a sequential recommendation (SR) model and a supplemental large language model (LLM) recommender; one simple way to combine their scores is sketched below.
In particular, we achieve a significant overall gain of 4.13% in comment watch time.
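The summary names two scoring components but not how they are combined; a simple normalize-and-blend fusion is one possibility. The min-max normalization and blend weight here are assumptions, not LSVCR's actual fusion.

```python
# Assumed score fusion between a sequential-recommendation model and a
# supplemental LLM recommender.
import numpy as np

def fuse_scores(sr_scores, llm_scores, beta=0.8):
    def minmax(x):
        x = np.asarray(x, dtype=float)
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)
    return beta * minmax(sr_scores) + (1 - beta) * minmax(llm_scores)

print(fuse_scores([2.0, 0.5, 1.2], [0.1, 0.9, 0.4]))
```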
arXiv Detail & Related papers (2024-03-20T13:14:29Z)
- Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima [54.06000767038741]
We analyze multimodal recommender systems from the novel perspective of flat local minima.
We propose a concise yet effective gradient strategy called Mirror Gradient (MG); a generic flat-minima training step is sketched below.
We find that the proposed MG can complement existing robust training methods and be easily extended to diverse advanced recommendation models.
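To illustrate what "exploring flat local minima" via a gradient strategy can look like, here is a generic sharpness-aware (SAM-style) two-step update. This is a plainly swapped-in stand-in, not necessarily the exact Mirror Gradient procedure from the paper.

```python
# SAM-style flat-minima training step (generic sketch, not the paper's MG).
import torch

def flat_minima_step(model, loss_fn, batch, opt, rho=0.05):
    loss = loss_fn(model, batch)
    loss.backward()
    # Step 1: perturb weights along the gradient (probe toward higher loss).
    eps = []
    with torch.no_grad():
        norm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()
                              if p.grad is not None))
        for p in model.parameters():
            if p.grad is None:
                eps.append(None); continue
            e = rho * p.grad / (norm + 1e-12)
            p.add_(e); eps.append(e)
    opt.zero_grad()
    # Step 2: gradient at the perturbed point, applied at the original point.
    loss_fn(model, batch).backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)   # restore original weights before stepping
    opt.step(); opt.zero_grad()
    return loss.item()
```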
arXiv Detail & Related papers (2024-02-17T12:27:30Z)
- Network-Based Video Recommendation Using Viewing Patterns and Modularity Analysis: An Integrated Framework [1.2289361708127877]
This research introduces a novel approach by combining implicit user data, such as viewing percentages, with social network analysis to enhance personalization in VOD platforms (see the sketch below). The system was evaluated on a documentary-focused VOD platform with 328 users over four months. Results showed significant improvements: a 63% increase in click-through rate (CTR), a 24% increase in view completion rate, and a 17% improvement in user satisfaction.
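Modularity analysis over a viewing-pattern graph can be done with NetworkX as sketched below; the graph construction, edge weights, and user labels are hypothetical, not the paper's data.

```python
# Modularity-based grouping over a co-viewing graph weighted by viewing
# percentages, matching the summary above (toy data, assumed construction).
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.Graph()
# Edge weight = how strongly two users' viewing patterns overlap, e.g. the
# mean of their shared-video viewing percentages (hypothetical values).
G.add_weighted_edges_from([
    ("u1", "u2", 0.9), ("u1", "u3", 0.2),
    ("u2", "u3", 0.1), ("u4", "u5", 0.8), ("u3", "u4", 0.7),
])
communities = greedy_modularity_communities(G, weight="weight")
for c in communities:
    print(sorted(c))  # recommend within-community popular videos to each group
```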
arXiv Detail & Related papers (2023-08-24T12:45:02Z)
- Ordinal Graph Gamma Belief Network for Social Recommender Systems [54.9487910312535]
We develop a hierarchical Bayesian model termed ordinal graph factor analysis (OGFA), which jointly models user-item and user-user interactions.
OGFA not only achieves good recommendation performance, but also extracts interpretable latent factors corresponding to representative user preferences.
We extend OGFA to ordinal graph gamma belief network, which is a multi-stochastic-layer deep probabilistic model.
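The coupling idea ("jointly models user-item and user-user interactions") can be shown with a drastically simplified, non-Bayesian analogue: one shared user factor matrix fit against both matrices. OGFA itself is a hierarchical Bayesian model, so this least-squares version is only a sketch of the coupling.

```python
# Non-Bayesian simplification of joint user-item / user-user factorization:
# the user factors U appear in both reconstruction terms.
import torch

n_users, n_items, k = 50, 80, 8
R = (torch.rand(n_users, n_items) > 0.9).float()   # toy user-item interactions
S = (torch.rand(n_users, n_users) > 0.95).float()  # toy user-user interactions
U = torch.randn(n_users, k, requires_grad=True)
V = torch.randn(n_items, k, requires_grad=True)
opt = torch.optim.Adam([U, V], lr=0.05)

for step in range(200):
    loss = ((U @ V.T - R) ** 2).mean() + 0.5 * ((U @ U.T - S) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```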
arXiv Detail & Related papers (2022-09-12T09:19:22Z)
- Modeling High-order Interactions across Multi-interests for Micro-video Recommendation [65.16624625748068]
We propose a Self-over-Co Attention module to enhance the user's interest representation.
In particular, we first use co-attention to model correlation patterns across different levels and then use self-attention to model correlation patterns within a specific level.
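The two-stage attention described above can be sketched with standard attention modules: cross-attention between two interest levels, then self-attention within the refined level. The shapes and single-head choice are assumptions.

```python
# Co-attention across interest levels, then self-attention within one level,
# following the summary above (assumed shapes, single head).
import torch
import torch.nn as nn

d = 64
co_attn = nn.MultiheadAttention(d, num_heads=1, batch_first=True)
self_attn = nn.MultiheadAttention(d, num_heads=1, batch_first=True)

level_a = torch.randn(2, 10, d)   # e.g., item-level interest tokens
level_b = torch.randn(2, 6, d)    # e.g., category-level interest tokens

# Co-attention: level-A tokens attend to level-B tokens (across levels).
a_cross, _ = co_attn(level_a, level_b, level_b)
# Self-attention: correlations within the refined level-A tokens.
a_refined, _ = self_attn(a_cross, a_cross, a_cross)
print(a_refined.shape)            # torch.Size([2, 10, 64])
```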
arXiv Detail & Related papers (2021-04-01T07:20:15Z)