Short Video-based Advertisements Evaluation System: Self-Organizing
Learning Approach
- URL: http://arxiv.org/abs/2010.12662v1
- Date: Fri, 23 Oct 2020 20:52:24 GMT
- Title: Short Video-based Advertisements Evaluation System: Self-Organizing
Learning Approach
- Authors: Yunjie Zhang, Fei Tao, Xudong Liu, Runze Su, Xiaorong Mei, Weicong
Ding, Zhichen Zhao, Lei Yuan, Ji Liu
- Abstract summary: We propose a novel end-to-end self-organizing framework for user behavior prediction.
Our model is able to learn the optimal topology of neural network architecture, as well as optimal weights, through training data.
- Score: 22.2568038582329
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise of short video apps such as TikTok, Snapchat and Kwai,
advertisement in short-term user-generated videos (UGVs) has become a trending
form of advertising. Advertisers require prediction of user behavior without a
specific user profile, as they expect to gauge advertisement performance in
advance in the cold-start scenario. Current recommender systems do not take raw
videos as input; additionally, most previous work in Multi-Modal Machine
Learning cannot handle unconstrained videos like UGVs. In this paper, we
propose a novel end-to-end self-organizing framework for user behavior
prediction. Our model learns the optimal topology of its neural network
architecture, as well as the optimal weights, from training data. We evaluate
the proposed method on our in-house dataset. The experimental results show
that our model achieves the best performance across all our experiments.
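The abstract does not specify how the topology is learned, so the following is only a minimal sketch of the general idea it describes: attaching a learnable sigmoid gate to each candidate connection and optimizing the gate parameters (topology) jointly with the weights. The two-path network, gate formulation, and toy data here are illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, params):
    # Each candidate path is scaled by a learnable gate in (0, 1);
    # gates near 0 after training correspond to "pruned" connections.
    g = sigmoid(params["alpha"])
    h1 = np.tanh(x @ params["W1"])      # candidate path 1
    h2 = np.tanh(x @ params["W2"])      # candidate path 2
    return g[0] * h1.sum(axis=1) + g[1] * h2.sum(axis=1)

def loss(params, x, y):
    pred = forward(x, params)
    return float(np.mean((pred - y) ** 2))

def num_grad(params, x, y, key, eps=1e-5):
    # Finite-difference gradient: adequate for a toy demonstration.
    g = np.zeros_like(params[key])
    for idx in np.ndindex(params[key].shape):
        orig = params[key][idx]
        params[key][idx] = orig + eps
        lp = loss(params, x, y)
        params[key][idx] = orig - eps
        lm = loss(params, x, y)
        params[key][idx] = orig
        g[idx] = (lp - lm) / (2 * eps)
    return g

x = rng.normal(size=(64, 4))
y = np.tanh(x @ rng.normal(size=(4, 3))).sum(axis=1)   # toy target

params = {
    "W1": rng.normal(scale=0.1, size=(4, 3)),
    "W2": rng.normal(scale=0.1, size=(4, 3)),
    "alpha": np.zeros(2),   # topology parameters: one gate per path
}

before = loss(params, x, y)
for _ in range(200):
    for key in params:
        # Weights and topology parameters are updated jointly.
        params[key] -= 0.05 * num_grad(params, x, y, key)
after = loss(params, x, y)
```

This continuous-gate relaxation is one common way (as in differentiable architecture search) to make a discrete topology choice trainable by gradient descent; the actual paper may use a different mechanism.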
Related papers
- Predicting Long-horizon Futures by Conditioning on Geometry and Time [49.86180975196375]
We explore the task of generating future sensor observations conditioned on the past.
We leverage the large-scale pretraining of image diffusion models which can handle multi-modality.
We create a benchmark for video prediction on a diverse set of videos spanning indoor and outdoor scenes.
arXiv Detail & Related papers (2024-04-17T16:56:31Z)
- COURIER: Contrastive User Intention Reconstruction for Large-Scale Visual Recommendation [33.903096803803706]
We argue that a visual feature pre-training method tailored for recommendation is necessary for further improvements beyond existing modality features.
We propose an effective user intention reconstruction module to mine visual features related to user interests from behavior histories.
arXiv Detail & Related papers (2023-06-08T07:45:24Z)
- REST: REtrieve & Self-Train for generative action recognition [54.90704746573636]
We propose to adapt a pre-trained generative Vision & Language (V&L) Foundation Model for video/action recognition.
We show that direct fine-tuning of a generative model to produce action classes suffers from severe overfitting.
We introduce REST, a training framework consisting of two key components.
arXiv Detail & Related papers (2022-09-29T17:57:01Z)
- Revealing Single Frame Bias for Video-and-Language Learning [115.01000652123882]
We show that a single-frame trained model can achieve better performance than existing methods that use multiple frames for training.
This result reveals the existence of a strong "static appearance bias" in popular video-and-language datasets.
We propose two new retrieval tasks based on existing fine-grained action recognition datasets that encourage temporal modeling.
arXiv Detail & Related papers (2022-06-07T16:28:30Z)
- Reinforcement Learning with Action-Free Pre-Training from Videos [95.25074614579646]
We introduce a framework that learns representations useful for understanding the dynamics via generative pre-training on videos.
Our framework significantly improves both the final performance and sample efficiency of vision-based reinforcement learning.
arXiv Detail & Related papers (2022-03-25T19:44:09Z)
- CLUE: Contextualised Unified Explainable Learning of User Engagement in Video Lectures [6.25256391074865]
We propose a new unified model, CLUE, which learns from the features extracted from public online teaching videos.
Our model exploits various multi-modal features to model the complexity of language, context information, and textual emotion of the delivered content.
arXiv Detail & Related papers (2022-01-14T19:51:06Z)
- Auxiliary Learning for Self-Supervised Video Representation via Similarity-based Knowledge Distillation [2.6519061087638014]
We propose a novel approach to complement self-supervised pretraining via an auxiliary pretraining phase, based on knowledge similarity distillation, auxSKD.
Our method deploys a teacher network that iteratively distils its knowledge to the student model by capturing the similarity information between segments of unlabelled video data.
We also introduce a novel pretext task, Video Segment Pace Prediction or VSPP, which requires our model to predict the playback speed of a randomly selected segment of the input video to provide more reliable self-supervised representations.
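The VSPP pretext task described above can be sketched in a few lines: re-sample a random segment of a video at one of several playback speeds and use the chosen speed index as a free self-supervised label. The segment length and speed set below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(7)
SPEEDS = [1, 2, 4]   # playback-speed multipliers (assumed set)
SEG_LEN = 8          # frames kept per training clip (assumed)

def make_vspp_sample(video):
    """video: (T, H, W, C) array of frames -> (clip, speed_label)."""
    speed_idx = rng.integers(len(SPEEDS))
    stride = SPEEDS[speed_idx]
    span = SEG_LEN * stride                    # raw frames the clip covers
    start = rng.integers(0, video.shape[0] - span + 1)
    clip = video[start:start + span:stride]    # temporal subsampling = speed-up
    return clip, speed_idx                     # model learns to predict speed_idx

video = rng.normal(size=(64, 4, 4, 3))   # tiny fake video: 64 frames
clip, label = make_vspp_sample(video)
```

Because the label is derived from the sampling procedure itself, no human annotation is needed, which is what makes this usable as a self-supervised pretext task.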
arXiv Detail & Related papers (2021-12-07T21:50:40Z)
- Click-Through Rate Prediction Using Graph Neural Networks and Online Learning [0.0]
A small percentage improvement in CTR prediction accuracy is reported to add millions of dollars of revenue to the advertising industry.
This project builds a CTR predictor using Graph Neural Networks and an online learning algorithm.
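The cited project's GNN is not reproduced here; the sketch below shows only the online-learning half of the idea, using a logistic-regression CTR model (a stand-in for the GNN) updated one impression at a time. All names and parameters are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class OnlineCTR:
    """Logistic-regression CTR model updated per impression (SGD on log loss)."""

    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr

    def predict(self, x):
        # Predicted click probability; in the full system x would be a
        # node embedding produced by the GNN.
        return sigmoid(self.w @ x)

    def update(self, x, clicked):
        # One SGD step on the log loss, applied as each impression arrives.
        p = self.predict(x)
        self.w -= self.lr * (p - clicked) * x

rng = np.random.default_rng(1)
true_w = np.array([1.5, -2.0, 0.5])   # hidden weights generating clicks
model = OnlineCTR(dim=3)
for _ in range(2000):
    x = rng.normal(size=3)
    clicked = float(rng.random() < sigmoid(true_w @ x))
    model.update(x, clicked)
# After streaming updates, model.w points in roughly the same
# direction as the weights that generated the clicks.
```

The appeal of the online formulation is that the model adapts continuously to distribution shift in ad traffic without periodic batch retraining.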
arXiv Detail & Related papers (2021-05-09T01:35:49Z)
- Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling [98.41300980759577]
A canonical approach to video-and-language learning dictates a neural model to learn from offline-extracted dense video features.
We propose a generic framework ClipBERT that enables affordable end-to-end learning for video-and-language tasks.
Experiments on text-to-video retrieval and video question answering on six datasets demonstrate that ClipBERT outperforms existing methods.
arXiv Detail & Related papers (2021-02-11T18:50:16Z)
- Privileged Knowledge Distillation for Online Action Detection [114.5213840651675]
Online Action Detection (OAD) in videos is formulated as a per-frame labeling task to address real-time prediction tasks.
This paper presents a novel learning-with-privileged based framework for online action detection where the future frames only observable at the training stages are considered as a form of privileged information.
arXiv Detail & Related papers (2020-11-18T08:52:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.