Adversarial Gradient Driven Exploration for Deep Click-Through Rate
Prediction
- URL: http://arxiv.org/abs/2112.11136v1
- Date: Tue, 21 Dec 2021 12:13:07 GMT
- Title: Adversarial Gradient Driven Exploration for Deep Click-Through Rate
Prediction
- Authors: Kailun Wu, Weijie Bian, Zhangming Chan, Lejian Ren, Shiming Xiang,
Shuguang Han, Hongbo Deng, Bo Zheng
- Abstract summary: We propose a novel exploration method called Adversarial Gradient Driven Exploration (AGE).
AGE simulates the gradient updating process to approximate the influence that samples of to-be-explored items would have on the model.
The effectiveness of our approach was demonstrated on an open-access academic dataset.
- Score: 39.61776002290324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, data-driven deep neural models have already shown remarkable
progress on Click-through Rate (CTR) prediction. Unfortunately, such models
may become ineffective when there are insufficient data. To
handle this issue, researchers often adopt exploration strategies to examine
items based on the estimated reward, e.g., UCB or Thompson Sampling. In the
context of Exploitation-and-Exploration for CTR prediction, recent studies have
attempted to utilize the prediction uncertainty along with model prediction as
the reward score. However, we argue that such an approach may make the final
ranking score deviate from the original distribution, and thereby affect model
performance in the online system. In this paper, we propose a novel exploration
method called Adversarial Gradient Driven
Exploration (AGE). Specifically, we propose a Pseudo-Exploration
Module to simulate the gradient updating process, which can approximate the
influence of the samples of to-be-explored items for the model. In addition,
for better exploration efficiency, we propose a Dynamic Threshold Unit to
eliminate the effects of those samples with low potential CTR. The
effectiveness of our approach was demonstrated on an open-access academic
dataset. Meanwhile, AGE has also been deployed in a real-world display
advertising platform and all online metrics have been significantly improved.
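
The contrast drawn in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the formulation, so the logistic model, the function names (`ucb_score`, `pseudo_explore_score`), the `step` and `threshold` parameters, and the choice to apply the simulated update to model weights are all assumptions made for illustration. The first function adds uncertainty directly to the prediction, as in UCB-style scoring. The second simulates one gradient step on a hypothetical clicked sample and re-predicts, skipping items whose predicted CTR falls below a threshold.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_ctr(weights, features):
    """Toy logistic CTR model (an assumption, not the paper's network)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def ucb_score(p, uncertainty, alpha=1.0):
    """UCB-style reward: prediction plus scaled uncertainty.

    The result is no longer a probability and can exceed 1.
    """
    return p + alpha * uncertainty


def pseudo_explore_score(weights, features, uncertainty, threshold, step=1.0):
    """Hypothetical pseudo-exploration score.

    Simulates one gradient-descent step on the log-loss of an imagined
    click (label 1) for this item, then re-predicts with the updated
    weights. Items below `threshold` keep their original prediction.
    """
    p = predict_ctr(weights, features)
    if p < threshold:
        # Threshold gate: low-potential items are not explored.
        return p
    # d(log-loss)/dw for label 1 is (p - 1) * x, so descent adds (1 - p) * x.
    lr = step * uncertainty
    updated = [w + lr * (1.0 - p) * x for w, x in zip(weights, features)]
    return predict_ctr(updated, features)


weights = [0.5, -0.2]
features = [1.0, 2.0]
base = predict_ctr(weights, features)
explored = pseudo_explore_score(weights, features, uncertainty=0.5, threshold=0.3)
gated = pseudo_explore_score(weights, features, uncertainty=0.5, threshold=0.9)
```

In this sketch the explored score stays within (0, 1) because it is still the output of the model, whereas the additive score can leave that range. Whether this matches the deviation the authors have in mind is an interpretation, not something the abstract states.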
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z) - Distribution Learning for Molecular Regression [10.96062816455682]
Distributional Mixture of Experts (DMoE) is a model-independent and data-independent method for regression.
We evaluate the performance of DMoE on different molecular property prediction datasets.
arXiv Detail & Related papers (2024-07-30T00:21:51Z) - Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration [15.463313629574111]
This paper investigates how to achieve sample-efficient exploration in continuous control tasks.
We introduce an RL algorithm that incorporates a predictive model and off-policy learning elements.
We derive an intrinsic reward without incurring parameter overhead.
arXiv Detail & Related papers (2024-03-31T11:39:11Z) - Conditional Unscented Autoencoders for Trajectory Prediction [13.121738145903532]
The CVAE is one of the most widely-used models in trajectory prediction for AD.
We leverage recent advances in the space of the VAE, the foundation of the CVAE, which show that a simple change in the sampling procedure can greatly benefit performance.
We show wide applicability of our models by evaluating them on the INTERACTION prediction dataset, outperforming the state of the art, as well as at the task of image modeling on the CelebA dataset.
arXiv Detail & Related papers (2023-10-30T18:59:32Z) - A positive feedback method based on F-measure value for Salient Object
Detection [1.9249287163937976]
This paper proposes a positive feedback method based on F-measure value for salient object detection (SOD).
Our proposed method takes an image to be detected and inputs it into several existing models to obtain their respective prediction maps.
Experimental results on five publicly available datasets show that our proposed positive feedback method outperforms the latest 12 methods in five evaluation metrics for saliency map prediction.
arXiv Detail & Related papers (2023-04-28T04:05:13Z) - Generative Causal Representation Learning for Out-of-Distribution Motion
Forecasting [13.99348653165494]
We propose Generative Causal Representation Learning (GCRL) to facilitate knowledge transfer under distribution shifts.
While we evaluate the effectiveness of our proposed method in human trajectory prediction models, GCRL can be applied to other domains as well.
arXiv Detail & Related papers (2023-02-17T00:30:44Z) - When Demonstrations Meet Generative World Models: A Maximum Likelihood
Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z) - A Graph-Enhanced Click Model for Web Search [67.27218481132185]
We propose a novel graph-enhanced click model (GraphCM) for web search.
We exploit both intra-session and inter-session information for the sparsity and cold-start problems.
arXiv Detail & Related papers (2022-06-17T08:32:43Z) - Incorporating Causal Graphical Prior Knowledge into Predictive Modeling
via Simple Data Augmentation [92.96204497841032]
Causal graphs (CGs) are compact representations of the knowledge of the data generating processes behind the data distributions.
We propose a model-agnostic data augmentation method that allows us to exploit the prior knowledge of the conditional independence (CI) relations.
We experimentally show that the proposed method is effective in improving the prediction accuracy, especially in the small-data regime.
arXiv Detail & Related papers (2021-02-27T06:13:59Z) - Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual
Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.