GIFT: Graph-guIded Feature Transfer for Cold-Start Video Click-Through
Rate Prediction
- URL: http://arxiv.org/abs/2202.11525v1
- Date: Mon, 21 Feb 2022 09:31:35 GMT
- Title: GIFT: Graph-guIded Feature Transfer for Cold-Start Video Click-Through
Rate Prediction
- Authors: Sihao Hu, Yi Cao, Yu Gong, Zhao Li, Yazheng Yang, Qingwen Liu, Wengwu
Ou, Shouling Ji
- Abstract summary: Short video has witnessed rapid growth in China and shows a promising market for promoting the sales of products on e-commerce platforms like Taobao.
To ensure the freshness of the content, the platform needs to release a large number of new videos every day.
We propose GIFT, an efficient Graph-guIded Feature Transfer system, to take advantage of the rich information in warmed-up videos that are related to the cold-start video.
- Score: 47.06479882277151
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Short video has witnessed rapid growth in China and shows a promising market
for promoting the sales of products on e-commerce platforms like Taobao. To
ensure the freshness of the content, the platform needs to release a large
number of new videos every day, which makes the conventional click-through rate
(CTR) prediction model suffer from a severe item cold-start problem. In this
paper, we propose GIFT, an efficient Graph-guIded Feature Transfer system, to
take full advantage of the rich information in warmed-up videos that are related
to the cold-start video. More specifically, we conduct feature transfer from
warmed-up videos to cold-start ones by incorporating physical and semantic
linkages into a heterogeneous graph. The former consist of explicit
relationships (e.g., sharing the same category or the same author), while the
latter measure the proximity of the multimodal representations of two videos.
In practice, the style, content, and even the recommendation pattern are quite
similar among physically or semantically related videos. In addition, to
provide the robust id representations and historical statistics from warmed-up
neighbors that cold-start videos need most, we design the transfer function to
be aware of the different features transferred from different types of nodes
and edges along the metapath on the graph. Extensive experiments on a large
real-world dataset show that our GIFT system significantly outperforms SOTA
methods and brings a 6.82% lift in click-through rate (CTR) on the homepage of
the Taobao App.
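The abstract describes the idea only at a high level, so the following is a minimal Python sketch of it under stated assumptions, not the authors' implementation. It links a cold-start video to warmed-up videos through "physical" edges (same category, same author) and "semantic" edges (cosine similarity of embeddings above a threshold), then estimates a CTR for the cold-start video as a weighted average of its neighbors' historical CTR. The field names, the `sim_threshold` value, the fixed per-edge-type weights, and the toy data are all hypothetical; in particular, the fixed weights stand in for the learned, type-aware transfer function that the abstract mentions.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def build_edges(videos, sim_threshold=0.9):
    """Map each video id to a list of (warmed-up neighbor id, edge type)."""
    edges = defaultdict(list)
    for vid, v in videos.items():
        for nid, n in videos.items():
            if nid == vid or not n["warm"]:
                continue  # only warmed-up videos can donate features
            # "Physical" linkages: explicit shared attributes.
            if v["category"] == n["category"]:
                edges[vid].append((nid, "same_category"))
            if v["author"] == n["author"]:
                edges[vid].append((nid, "same_author"))
            # "Semantic" linkage: proximity of (multimodal) embeddings.
            if cosine(v["embedding"], n["embedding"]) >= sim_threshold:
                edges[vid].append((nid, "semantic"))
    return edges


def transfer_ctr(video_id, videos, edges, type_weights):
    """Weighted average of neighbors' historical CTR, one weight per edge type.

    Fixed weights are an illustrative stand-in for a learned transfer function.
    Returns None when the video has no warmed-up neighbors.
    """
    num = den = 0.0
    for nid, etype in edges.get(video_id, []):
        w = type_weights[etype]
        num += w * videos[nid]["ctr"]
        den += w
    return num / den if den else None


# Hypothetical toy data: one cold-start video and three warmed-up ones.
videos = {
    "new": {"category": "shoes", "author": "a1", "embedding": [1.0, 0.0], "warm": False, "ctr": None},
    "w1": {"category": "shoes", "author": "a2", "embedding": [0.0, 1.0], "warm": True, "ctr": 0.10},
    "w2": {"category": "bags", "author": "a1", "embedding": [0.0, 1.0], "warm": True, "ctr": 0.20},
    "w3": {"category": "bags", "author": "a3", "embedding": [1.0, 0.0], "warm": True, "ctr": 0.40},
}
type_weights = {"same_category": 1.0, "same_author": 1.0, "semantic": 2.0}

edges = build_edges(videos)
estimate = transfer_ctr("new", videos, edges, type_weights)
print(estimate)
```

In this toy setup the cold-start video reaches one neighbor per edge type, and the semantic neighbor counts twice as much as each physical neighbor. A real system would transfer id embeddings as well as statistics and would learn how much to trust each node and edge type.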
Related papers
- Neural Graph Matching for Video Retrieval in Large-Scale Video-driven E-commerce [5.534002182451785]
Video-driven e-commerce has shown huge potential in stimulating consumer confidence and promoting sales.
We propose a novel bi-level Graph Matching Network (GMN), which mainly consists of node- and preference-level graph matching.
Comprehensive experiments show the superiority of the proposed GMN with significant improvements over state-of-the-art approaches.
arXiv Detail & Related papers (2024-08-01T07:31:23Z)
- SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction [93.26613503521664]
This paper presents a short-to-long video diffusion model, SEINE, that focuses on generative transition and prediction.
We propose a random-mask video diffusion model to automatically generate transitions based on textual descriptions.
Our model generates transition videos that ensure coherence and visual quality.
arXiv Detail & Related papers (2023-10-31T17:58:17Z)
- ViGT: Proposal-free Video Grounding with Learnable Token in Transformer [28.227291816020646]
Video grounding task aims to locate queried action or event in an untrimmed video based on rich linguistic descriptions.
Existing proposal-free methods are trapped in complex interaction between video and query.
We propose a novel boundary regression paradigm that performs regression token learning in a transformer.
arXiv Detail & Related papers (2023-08-11T08:30:08Z)
- EgoViT: Pyramid Video Transformer for Egocentric Action Recognition [18.05706639179499]
Capturing interaction of hands with objects is important to autonomously detect human actions from egocentric videos.
We present a pyramid video transformer with a dynamic class token generator for egocentric action recognition.
arXiv Detail & Related papers (2023-03-15T20:33:50Z)
- Privileged Graph Distillation for Cold Start Recommendation [57.918041397089254]
The cold start problem in recommender systems requires recommending to new users (items) based on attributes without any historical interaction records.
We propose a privileged graph distillation model (PGD).
Our proposed model is generally applicable to different cold start scenarios with new user, new item, or new user-new item.
arXiv Detail & Related papers (2021-05-31T14:05:27Z)
- Pre-training Graph Transformer with Multimodal Side Information for Recommendation [82.4194024706817]
We propose a pre-training strategy to learn item representations by considering both item side information and their relationships.
We develop a novel sampling algorithm named MCNSampling to select contextual neighbors for each item.
The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction.
arXiv Detail & Related papers (2020-10-23T10:30:24Z)
- Understanding Road Layout from Videos as a Whole [82.30800791500869]
We formulate it as a top-view road attributes prediction problem and our goal is to predict these attributes for each frame both accurately and consistently.
We exploit the following three novel aspects: leveraging camera motions in videos, including context cues, and incorporating long-term video information.
arXiv Detail & Related papers (2020-07-02T00:59:15Z)
- Comprehensive Information Integration Modeling Framework for Video Titling [124.11296128308396]
We integrate comprehensive sources of information, including the content of consumer-generated videos, the narrative comment sentences supplied by consumers, and the product attributes, in an end-to-end modeling framework.
To tackle this issue, the proposed method consists of two processes, i.e., granular-level interaction modeling and abstraction-level story-line summarization.
We collect a large-scale dataset accordingly from real-world data in Taobao, a world-leading e-commerce platform.
arXiv Detail & Related papers (2020-06-24T10:38:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.