Short Video-based Advertisements Evaluation System: Self-Organizing
Learning Approach
- URL: http://arxiv.org/abs/2010.12662v1
- Date: Fri, 23 Oct 2020 20:52:24 GMT
- Title: Short Video-based Advertisements Evaluation System: Self-Organizing
Learning Approach
- Authors: Yunjie Zhang, Fei Tao, Xudong Liu, Runze Su, Xiaorong Mei, Weicong
Ding, Zhichen Zhao, Lei Yuan, Ji Liu
- Abstract summary: We propose a novel end-to-end self-organizing framework for user behavior prediction.
Our model is able to learn the optimal topology of neural network architecture, as well as optimal weights, through training data.
- Score: 22.2568038582329
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise of short video apps such as TikTok, Snapchat and Kwai,
advertisement in short-term user-generated videos (UGVs) has become a trending
form of advertising. Advertisers require prediction of user behavior without a
specific user profile, as they expect to gauge advertisement performance in
advance in the cold-start scenario. Current recommender systems do not take raw
videos as input; additionally, most previous work in Multi-Modal Machine
Learning cannot handle unconstrained videos like UGVs. In this paper, we
propose a novel end-to-end self-organizing framework for user behavior
prediction. Our model learns the optimal topology of its neural network
architecture, as well as the optimal weights, from training data. We evaluate
the proposed method on our in-house dataset. The experimental results show
that our model achieves the best performance across all our experiments.
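The abstract does not specify how the topology is learned, so the following is only a minimal sketch of the general idea it describes: attaching a learnable sigmoid gate to each candidate connection and optimizing the gate parameters (topology) jointly with the weights. The two-path network, gate formulation, and toy data here are illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, params):
    # Each candidate path is scaled by a learnable gate in (0, 1);
    # gates near 0 after training correspond to "pruned" connections.
    g = sigmoid(params["alpha"])
    h1 = np.tanh(x @ params["W1"])      # candidate path 1
    h2 = np.tanh(x @ params["W2"])      # candidate path 2
    return g[0] * h1.sum(axis=1) + g[1] * h2.sum(axis=1)

def loss(params, x, y):
    pred = forward(x, params)
    return float(np.mean((pred - y) ** 2))

def num_grad(params, x, y, key, eps=1e-5):
    # Finite-difference gradient: adequate for a toy demonstration.
    g = np.zeros_like(params[key])
    for idx in np.ndindex(params[key].shape):
        orig = params[key][idx]
        params[key][idx] = orig + eps
        lp = loss(params, x, y)
        params[key][idx] = orig - eps
        lm = loss(params, x, y)
        params[key][idx] = orig
        g[idx] = (lp - lm) / (2 * eps)
    return g

x = rng.normal(size=(64, 4))
y = np.tanh(x @ rng.normal(size=(4, 3))).sum(axis=1)   # toy target

params = {
    "W1": rng.normal(scale=0.1, size=(4, 3)),
    "W2": rng.normal(scale=0.1, size=(4, 3)),
    "alpha": np.zeros(2),   # topology parameters: one gate per path
}

before = loss(params, x, y)
for _ in range(200):
    for key in params:
        # Weights and topology parameters are updated jointly.
        params[key] -= 0.05 * num_grad(params, x, y, key)
after = loss(params, x, y)
```

This continuous-gate relaxation is one common way (as in differentiable architecture search) to make a discrete topology choice trainable by gradient descent; the actual paper may use a different mechanism.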
Related papers
- Predicting Long-horizon Futures by Conditioning on Geometry and Time [49.86180975196375]
We explore the task of generating future sensor observations conditioned on the past.
We leverage the large-scale pretraining of image diffusion models which can handle multi-modality.
We create a benchmark for video prediction on a diverse set of videos spanning indoor and outdoor scenes.
arXiv Detail & Related papers (2024-04-17T16:56:31Z)
- COURIER: Contrastive User Intention Reconstruction for Large-Scale Visual Recommendation [33.903096803803706]
We argue that a visual feature pre-training method tailored for recommendation is necessary for further improvements beyond existing modality features.
We propose an effective user intention reconstruction module to mine visual features related to user interests from behavior histories.
arXiv Detail & Related papers (2023-06-08T07:45:24Z)
- REST: REtrieve & Self-Train for generative action recognition [54.90704746573636]
We propose to adapt a pre-trained generative Vision & Language (V&L) Foundation Model for video/action recognition.
We show that direct fine-tuning of a generative model to produce action classes suffers from severe overfitting.
We introduce REST, a training framework consisting of two key components.
arXiv Detail & Related papers (2022-09-29T17:57:01Z)
- Revealing Single Frame Bias for Video-and-Language Learning [115.01000652123882]
We show that a single-frame trained model can achieve better performance than existing methods that use multiple frames for training.
This result reveals the existence of a strong "static appearance bias" in popular video-and-language datasets.
We propose two new retrieval tasks based on existing fine-grained action recognition datasets that encourage temporal modeling.
arXiv Detail & Related papers (2022-06-07T16:28:30Z)
- Reinforcement Learning with Action-Free Pre-Training from Videos [95.25074614579646]
We introduce a framework that learns representations useful for understanding the dynamics via generative pre-training on videos.
Our framework significantly improves both the final performance and sample efficiency of vision-based reinforcement learning.
arXiv Detail & Related papers (2022-03-25T19:44:09Z)
- CLUE: Contextualised Unified Explainable Learning of User Engagement in Video Lectures [6.25256391074865]
We propose a new unified model, CLUE, which learns from the features extracted from public online teaching videos.
Our model exploits various multi-modal features to model the complexity of language, context information, and textual emotion of the delivered content.
arXiv Detail & Related papers (2022-01-14T19:51:06Z)
- Auxiliary Learning for Self-Supervised Video Representation via Similarity-based Knowledge Distillation [2.6519061087638014]
We propose a novel approach to complement self-supervised pretraining via an auxiliary pretraining phase, based on knowledge similarity distillation, auxSKD.
Our method deploys a teacher network that iteratively distils its knowledge to the student model by capturing the similarity information between segments of unlabelled video data.
We also introduce a novel pretext task, Video Segment Pace Prediction or VSPP, which requires our model to predict the playback speed of a randomly selected segment of the input video to provide more reliable self-supervised representations.
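The VSPP pretext task described above can be sketched in a few lines: re-sample a random segment of a video at one of several playback speeds and use the chosen speed index as a free self-supervised label. The segment length and speed set below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(7)
SPEEDS = [1, 2, 4]   # playback-speed multipliers (assumed set)
SEG_LEN = 8          # frames kept per training clip (assumed)

def make_vspp_sample(video):
    """video: (T, H, W, C) array of frames -> (clip, speed_label)."""
    speed_idx = rng.integers(len(SPEEDS))
    stride = SPEEDS[speed_idx]
    span = SEG_LEN * stride                    # raw frames the clip covers
    start = rng.integers(0, video.shape[0] - span + 1)
    clip = video[start:start + span:stride]    # temporal subsampling = speed-up
    return clip, speed_idx                     # model learns to predict speed_idx

video = rng.normal(size=(64, 4, 4, 3))   # tiny fake video: 64 frames
clip, label = make_vspp_sample(video)
```

Because the label is derived from the sampling procedure itself, no human annotation is needed, which is what makes this usable as a self-supervised pretext task.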
arXiv Detail & Related papers (2021-12-07T21:50:40Z)
- Click-Through Rate Prediction Using Graph Neural Networks and Online Learning [0.0]
A small percentage improvement in CTR prediction accuracy is reported to add millions of dollars of revenue to the advertising industry.
This project builds a CTR predictor using Graph Neural Networks and an online learning algorithm.
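The cited project's GNN is not reproduced here; the sketch below shows only the online-learning half of the idea, using a logistic-regression CTR model (a stand-in for the GNN) updated one impression at a time. All names and parameters are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class OnlineCTR:
    """Logistic-regression CTR model updated per impression (SGD on log loss)."""

    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr

    def predict(self, x):
        # Predicted click probability; in the full system x would be a
        # node embedding produced by the GNN.
        return sigmoid(self.w @ x)

    def update(self, x, clicked):
        # One SGD step on the log loss, applied as each impression arrives.
        p = self.predict(x)
        self.w -= self.lr * (p - clicked) * x

rng = np.random.default_rng(1)
true_w = np.array([1.5, -2.0, 0.5])   # hidden weights generating clicks
model = OnlineCTR(dim=3)
for _ in range(2000):
    x = rng.normal(size=3)
    clicked = float(rng.random() < sigmoid(true_w @ x))
    model.update(x, clicked)
# After streaming updates, model.w points in roughly the same
# direction as the weights that generated the clicks.
```

The appeal of the online formulation is that the model adapts continuously to distribution shift in ad traffic without periodic batch retraining.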
arXiv Detail & Related papers (2021-05-09T01:35:49Z)
- Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling [98.41300980759577]
A canonical approach to video-and-language learning dictates a neural model to learn from offline-extracted dense video features.
We propose a generic framework ClipBERT that enables affordable end-to-end learning for video-and-language tasks.
Experiments on text-to-video retrieval and video question answering on six datasets demonstrate that ClipBERT outperforms existing methods.
arXiv Detail & Related papers (2021-02-11T18:50:16Z)
- Privileged Knowledge Distillation for Online Action Detection [114.5213840651675]
Online Action Detection (OAD) in videos is formulated as a per-frame labeling task to address real-time prediction tasks.
This paper presents a novel learning-with-privileged based framework for online action detection where the future frames only observable at the training stages are considered as a form of privileged information.
arXiv Detail & Related papers (2020-11-18T08:52:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.