Related papers: GIFT: Graph-guIded Feature Transfer for Cold-Start Video Click-Through Rate Prediction

URL: http://arxiv.org/abs/2202.11525v1
Date: Mon, 21 Feb 2022 09:31:35 GMT
Title: GIFT: Graph-guIded Feature Transfer for Cold-Start Video Click-Through Rate Prediction
Authors: Sihao Hu, Yi Cao, Yu Gong, Zhao Li, Yazheng Yang, Qingwen Liu, Wengwu Ou, Shouling Ji
Abstract summary: Short video has witnessed rapid growth in China and shows a promising market for promoting the sales of products on e-commerce platforms like Taobao. To ensure the freshness of the content, the platform needs to release a large number of new videos every day. We propose GIFT, an efficient Graph-guIded Feature Transfer system, to take advantage of the rich information of warmed-up videos that are related to the cold-start video.
Score: 47.06479882277151
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Short video has witnessed rapid growth in China and shows a promising market for promoting the sales of products on e-commerce platforms like Taobao. To ensure the freshness of the content, the platform needs to release a large number of new videos every day, which makes the conventional click-through rate (CTR) prediction model suffer from a severe item cold-start problem. In this paper, we propose GIFT, an efficient Graph-guIded Feature Transfer system, to take full advantage of the rich information of warmed-up videos that are related to the cold-start video. More specifically, we conduct feature transfer from warmed-up videos to cold-start ones by incorporating physical and semantic linkages into a heterogeneous graph. The former consist of explicit relationships (e.g., sharing the same category or the same author), while the latter measure the proximity of the multimodal representations of two videos. In practice, the style, content, and even the recommendation pattern are quite similar among physically or semantically related videos. In addition, to provide the robust id representations and historical statistics from warmed-up neighbors that cold-start videos need most, we design the transfer function to be aware of the different transferred features from different types of nodes and edges along the metapath on the graph. Extensive experiments on a large real-world dataset show that our GIFT system significantly outperforms SOTA methods and brings a 6.82% lift in click-through rate (CTR) on the homepage of the Taobao App.
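To make the feature-transfer idea in the abstract concrete, here is a minimal Python sketch. It is not the GIFT transfer function, which the abstract does not specify; it substitutes a plain softmax-weighted average over warmed-up neighbors. The `Neighbor` class, the `EDGE_WEIGHT` table, the edge-type names, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import exp
from typing import List, Sequence


@dataclass
class Neighbor:
    """A warmed-up video linked to the cold-start video (hypothetical structure)."""
    edge_type: str         # e.g. "same_author", "same_category", "semantic"
    similarity: float      # edge strength; assumed 1.0 for physical edges
    features: List[float]  # e.g. an id embedding or historical statistics


# Hypothetical per-edge-type weights; a real system would learn these.
EDGE_WEIGHT = {"same_author": 1.0, "same_category": 0.5, "semantic": 0.8}


def transfer_features(neighbors: Sequence[Neighbor]) -> List[float]:
    """Aggregate neighbor features into one vector for a cold-start video.

    Each neighbor is scored by (edge-type weight * similarity); the scores
    are normalized with a softmax and used for a weighted average.
    """
    if not neighbors:
        raise ValueError("at least one warmed-up neighbor is required")

    scores = [EDGE_WEIGHT[n.edge_type] * n.similarity for n in neighbors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(neighbors[0].features)
    return [
        sum(w * n.features[d] for w, n in zip(weights, neighbors))
        for d in range(dim)
    ]


if __name__ == "__main__":
    cold_start_vector = transfer_features([
        Neighbor("same_author", 1.0, [0.2, 0.9]),
        Neighbor("semantic", 0.7, [0.4, 0.1]),
    ])
    print(cold_start_vector)
```

The sketch keeps the edge type explicit so that physical and semantic links can contribute differently, which is the one property it shares with the type-aware transfer described in the abstract.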

Related papers

ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts [56.75723197779384]
ARC-Hunyuan-Video is a multimodal model that processes visual, audio, and textual signals end-to-end for structured comprehension. Our model is capable of multi-granularity timestamped video captioning and summarization, open-ended video question answering, temporal video grounding, and video reasoning.
arXiv Detail & Related papers (2025-07-28T15:52:36Z)
Short-video Propagation Influence Rating: A New Real-world Dataset and A New Large Graph Model [55.58701436630489]
The Cross-platform Short-Video dataset includes 117,720 videos, 381,926 samples, and 535 topics across the 5 biggest Chinese platforms. A Large Graph Model (LGM) named NetGPT can bridge heterogeneous graph-structured data with the powerful reasoning ability and knowledge of Large Language Models (LLMs). Our NetGPT can comprehend and analyze the short-video propagation graph, enabling it to predict the long-term propagation influence of short videos.
arXiv Detail & Related papers (2025-03-31T05:53:15Z)
Firzen: Firing Strict Cold-Start Items with Frozen Heterogeneous and Homogeneous Graphs for Recommendation [34.414081170244955]
We propose a unified framework incorporating multi-modal content of items and knowledge graphs (KGs) to solve both strict cold-start and warm-start recommendation. Our model yields significant improvements for strict cold-start recommendation and outperforms or matches the state-of-the-art performance in the warm-start scenario.
arXiv Detail & Related papers (2024-10-10T06:48:27Z)
Neural Graph Matching for Video Retrieval in Large-Scale Video-driven E-commerce [5.534002182451785]
Video-driven e-commerce has shown huge potential in stimulating consumer confidence and promoting sales. We propose a novel bi-level Graph Matching Network (GMN), which mainly consists of node- and preference-level graph matching. Comprehensive experiments show the superiority of the proposed GMN with significant improvements over state-of-the-art approaches.
arXiv Detail & Related papers (2024-08-01T07:31:23Z)
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction [93.26613503521664]
This paper presents a short-to-long video diffusion model, SEINE, that focuses on generative transition and prediction. We propose a random-mask video diffusion model to automatically generate transitions based on textual descriptions. Our model generates transition videos that ensure coherence and visual quality.
arXiv Detail & Related papers (2023-10-31T17:58:17Z)
EgoViT: Pyramid Video Transformer for Egocentric Action Recognition [18.05706639179499]
Capturing interaction of hands with objects is important to autonomously detect human actions from egocentric videos. We present a pyramid video transformer with a dynamic class token generator for egocentric action recognition.
arXiv Detail & Related papers (2023-03-15T20:33:50Z)
Privileged Graph Distillation for Cold Start Recommendation [57.918041397089254]
The cold start problem in recommender systems requires recommending to new users (items) based on attributes without any historical interaction records. We propose a privileged graph distillation model (PGD). Our proposed model is generally applicable to different cold start scenarios with new user, new item, or new user-new item.
arXiv Detail & Related papers (2021-05-31T14:05:27Z)
Pre-training Graph Transformer with Multimodal Side Information for Recommendation [82.4194024706817]
We propose a pre-training strategy to learn item representations by considering both item side information and their relationships. We develop a novel sampling algorithm named MCNSampling to select contextual neighbors for each item. The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction.
arXiv Detail & Related papers (2020-10-23T10:30:24Z)
Understanding Road Layout from Videos as a Whole [82.30800791500869]
We formulate it as a top-view road attributes prediction problem and our goal is to predict these attributes for each frame both accurately and consistently. We exploit the following three novel aspects: leveraging camera motions in videos, including context cues, and incorporating long-term video information.
arXiv Detail & Related papers (2020-07-02T00:59:15Z)
Comprehensive Information Integration Modeling Framework for Video Titling [124.11296128308396]
We integrate comprehensive sources of information, including the content of consumer-generated videos, the narrative comment sentences supplied by consumers, and the product attributes, in an end-to-end modeling framework. To tackle this issue, the proposed method consists of two processes, i.e., granular-level interaction modeling and abstraction-level story-line summarization. We collect a large-scale dataset accordingly from real-world data in Taobao, a world-leading e-commerce platform.
arXiv Detail & Related papers (2020-06-24T10:38:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.