Beyond Relevance: Improving User Engagement by Personalization for Short-Video Search
- URL: http://arxiv.org/abs/2409.11281v1
- Date: Tue, 17 Sep 2024 15:37:51 GMT
- Title: Beyond Relevance: Improving User Engagement by Personalization for Short-Video Search
- Authors: Wentian Bao, Hu Liu, Kai Zheng, Chao Zhang, Shunyu Zhang, Enyun Yu, Wenwu Ou, Yang Song
- Abstract summary: We introduce $\text{PR}^2$, a novel and comprehensive solution for personalizing short-video search.
Specifically, $\text{PR}^2$ leverages query-relevant collaborative filtering and personalized dense retrieval.
We have achieved the most remarkable user engagement improvements in recent years.
- Score: 16.491313774639007
- License:
- Abstract: Personalized search has been extensively studied in various applications, including web search, e-commerce, social networks, etc. With the soaring popularity of short-video platforms, exemplified by TikTok and Kuaishou, the question arises: can personalization elevate the realm of short-video search, and if so, which techniques hold the key? In this work, we introduce $\text{PR}^2$, a novel and comprehensive solution for personalizing short-video search, where $\text{PR}^2$ stands for the Personalized Retrieval and Ranking augmented search system. Specifically, $\text{PR}^2$ leverages query-relevant collaborative filtering and personalized dense retrieval to extract relevant and individually tailored content from a large-scale video corpus. Furthermore, it utilizes the QIN (Query-Dominant User Interest Network) ranking model to effectively harness users' long-term preferences and real-time behaviors, and to efficiently learn from various forms of implicit user feedback through a multi-task learning framework. By deploying $\text{PR}^2$ in the production system, we have achieved the most remarkable user engagement improvements in recent years: a 10.2% increase in CTR@10, a notable 20% surge in video watch time, and a 1.6% uplift in search DAU. We believe the practical insights presented in this work are especially valuable for building and improving personalized search systems for short-video platforms.
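The retrieval side described in the abstract lends itself to a compact illustration. Below is a minimal PyTorch sketch of the personalized dense retrieval idea: a query tower fuses the encoded query with embeddings of the user's recently watched videos, and candidates are fetched by inner product against precomputed video embeddings. The module names, dimensions, and the attention-based fusion are illustrative assumptions, not details from the paper; the collaborative-filtering channel and the QIN ranking stage are omitted.

```python
# Minimal sketch (assumptions, not the authors' code) of personalized
# dense retrieval: fuse the search query with the user's watch history,
# then score the fused vector against a corpus of video embeddings.
import torch
import torch.nn as nn

class PersonalizedQueryTower(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        # Let the query attend over watched-video embeddings, so the
        # extracted interest is query-dependent ("query-relevant").
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, query_emb: torch.Tensor, history: torch.Tensor) -> torch.Tensor:
        # query_emb: (B, dim); history: (B, T, dim)
        interest, _ = self.attn(query_emb.unsqueeze(1), history, history)
        fused = torch.cat([query_emb, interest.squeeze(1)], dim=-1)
        return self.proj(fused)            # personalized query vector, (B, dim)

tower = PersonalizedQueryTower()
q = torch.randn(1, 64)                     # encoded search query (hypothetical encoder)
hist = torch.randn(1, 20, 64)              # 20 recently watched videos
corpus = torch.randn(10_000, 64)           # precomputed video embeddings
scores = tower(q, hist) @ corpus.T         # (1, 10000) inner-product relevance
candidates = scores.topk(10).indices       # candidate set passed on to ranking
```

In production, the top candidates from a channel like this and from collaborative filtering would be merged and re-scored by the QIN ranking model, whose multi-task heads learn from clicks, watch time, and other implicit feedback.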
Related papers
- A Flexible and Scalable Framework for Video Moment Search [51.47907684209207]
This paper introduces a flexible framework for retrieving a ranked list of moments from a collection of videos of any length to match a text query.
Our framework, called Segment-Proposal-Ranking (SPR), simplifies the search process into three independent stages: segment retrieval, proposal generation, and moment refinement with re-ranking.
Evaluations on the TVR-Ranking dataset demonstrate that our framework achieves state-of-the-art performance with significant reductions in computational cost and processing time.
arXiv Detail & Related papers (2025-01-09T08:54:19Z)
- Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning [56.873534081386]
A new task suite, HIREST, is presented, comprising video retrieval, moment retrieval, moment segmentation, and step-captioning.
We propose a query-centric audio-visual cognition network to construct a reliable multi-modal representation for three tasks.
This allows the model to recognize user-preferred content and thus attain a query-centric audio-visual representation for the three tasks.
arXiv Detail & Related papers (2024-12-18T06:43:06Z)
- Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark [5.76230561819199]
We propose Repurpose-10K, an extensive dataset comprising over 10,000 videos with more than 120,000 annotated clips.
We propose a two-stage solution to obtain annotations from real-world user-generated content.
We offer a baseline model to address this challenging task by integrating audio, visual, and caption aspects.
arXiv Detail & Related papers (2024-12-12T02:27:46Z)
- Improving Video Corpus Moment Retrieval with Partial Relevance Enhancement [72.7576395034068]
Video Corpus Moment Retrieval (VCMR) is a new video retrieval task aimed at retrieving a relevant moment from a large corpus of untrimmed videos using a text query.
We argue that effectively capturing the partial relevance between the query and video is essential for the VCMR task.
For video retrieval, we introduce a multi-modal collaborative video retriever, generating different query representations for the two modalities.
For moment localization, we propose the focus-then-fuse moment localizer, utilizing modality-specific gates to capture essential content.
arXiv Detail & Related papers (2024-02-21T07:16:06Z)
- Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data [102.0069667710562]
This paper presents Open-VCLIP++, a framework that adapts CLIP to a strong zero-shot video classifier.
We demonstrate that training Open-VCLIP++ is tantamount to continual learning with zero historical data.
Our approach is evaluated on three widely used action recognition datasets.
arXiv Detail & Related papers (2023-10-08T04:46:43Z)
- Interactive Video Corpus Moment Retrieval using Reinforcement Learning [35.38916770127218]
This paper tackles the problem with reinforcement learning, aiming to reach a search target within a few rounds of interaction by long-term learning from user feedback.
We conduct experiments for the challenging task of video corpus moment retrieval (VCMR) to localize moments from a large video corpus.
arXiv Detail & Related papers (2023-02-19T09:48:23Z)
- Tree-based Text-Vision BERT for Video Search in Baidu Video Advertising [58.09698019028931]
Pairing video ads with user search queries is the core task of Baidu video advertising.
Due to the modality gap, the query-to-video retrieval is much more challenging than traditional query-to-document retrieval.
We present a tree-based combo-attention network (TCAN) which has been recently launched in Baidu's dynamic video advertising platform.
arXiv Detail & Related papers (2022-09-19T04:49:51Z)
- CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval [24.649068267308913]
Video retrieval applications should enable users to retrieve a precise moment from a large video corpus.
We propose a novel model for effective moment localization and ranking.
We conduct studies on two datasets, TVR for closed-world TV episodes and DiDeMo for open-world user-generated videos.
arXiv Detail & Related papers (2021-09-21T08:07:27Z)
- APES: Audiovisual Person Search in Untrimmed Video [87.4124877066541]
We present the Audiovisual Person Search dataset (APES).
APES contains over 1.9K identities labeled across 36 hours of video.
A key property of APES is that it includes dense temporal annotations that link faces to speech segments of the same identity.
arXiv Detail & Related papers (2021-06-03T08:16:42Z)
- Adaptive Video Highlight Detection by Learning from User History [18.18119240674014]
We propose a framework that learns to adapt highlight detection to a user by exploiting the user's history in the form of highlights that the user has previously created.
Our framework consists of two sub-networks: a fully temporal convolutional highlight detection network $H$ that predicts highlights for an input video, and a history encoder network $M$ for user history; a minimal sketch of this layout follows this entry.
arXiv Detail & Related papers (2020-07-19T05:52:20Z)
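The last entry above gives enough architectural detail for a small illustration: a history encoder $M$ summarizes the user's previously created highlights into a preference vector, and a fully temporal-convolutional network $H$ scores each frame of a new video conditioned on that vector. All layer choices, dimensions, and the mean pooling in $M$ are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed layers, not the paper's code) of the H + M design
# for user-adaptive highlight detection.
import torch
import torch.nn as nn

class HistoryEncoderM(nn.Module):
    """Encodes the user's past highlight clips into one preference vector."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (B, K, dim), K past highlights; mean-pool after an MLP.
        return self.mlp(history).mean(dim=1)                 # (B, dim)

class HighlightNetH(nn.Module):
    """Fully temporal-convolutional detector, conditioned on the user vector."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(2 * dim, dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(dim, 1, kernel_size=1),
        )

    def forward(self, frames: torch.Tensor, user_vec: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, dim) per-frame features; broadcast the user vector.
        u = user_vec.unsqueeze(1).expand(-1, frames.size(1), -1)
        x = torch.cat([frames, u], dim=-1).transpose(1, 2)   # (B, 2*dim, T)
        return self.conv(x).squeeze(1)                       # (B, T) highlight scores

M, H = HistoryEncoderM(), HighlightNetH()
scores = H(torch.randn(2, 100, 64), M(torch.randn(2, 5, 64)))  # (2, 100)
```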
This list is automatically generated from the titles and abstracts of the papers on this site.