Counterfactual Evaluation of Ads Ranking Models through Domain Adaptation
- URL: http://arxiv.org/abs/2409.19824v1
- Date: Sun, 29 Sep 2024 23:12:04 GMT
- Title: Counterfactual Evaluation of Ads Ranking Models through Domain Adaptation
- Authors: Mohamed A. Radwan, Himaghna Bhattacharjee, Quinn Lanners, Jiasheng Zhang, Serkan Karakulak, Houssam Nassif, Murat Ali Bayir
- Abstract summary: This approach measures the reward of ranking-model changes in large-scale Ads recommender systems.
Our experiments demonstrate that the proposed technique outperforms both the vanilla IPS method and approaches using non-generalized reward models.
- Score: 4.488611783089895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a domain-adapted reward model that works alongside an Offline A/B testing system for evaluating ranking models. This approach effectively measures reward for ranking model changes in large-scale Ads recommender systems, where model-free methods like IPS are not feasible. Our experiments demonstrate that the proposed technique outperforms both the vanilla IPS method and approaches using non-generalized reward models.
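For context, the two baselines named in the abstract can be sketched in a few lines. The snippet below is an illustrative sketch only, not the authors' implementation: it contrasts a vanilla IPS estimate (model-free, reweighting logged rewards by the ratio of new-policy to logging-policy action probabilities) with a reward-model ("direct method") estimate that averages model-predicted rewards over the new policy's actions. All names (ips_value, direct_method_value, reward_model) are hypothetical placeholders.

```python
# Minimal sketch of two standard offline-evaluation baselines, assuming logged
# bandit feedback (context, action, reward, logging probability). Not the
# domain-adapted reward model from the paper.
import numpy as np

def ips_value(rewards, logging_probs, new_policy_probs):
    """Vanilla inverse propensity scoring (IPS): reweight each logged reward
    by the ratio of new-policy to logging-policy action probabilities."""
    weights = np.asarray(new_policy_probs) / np.asarray(logging_probs)
    return float(np.mean(weights * np.asarray(rewards)))

def direct_method_value(contexts, candidate_actions, new_policy_probs_per_action, reward_model):
    """Reward-model ("direct method") estimate: average the model-predicted
    reward of the actions the new policy would take, weighted by its
    action probabilities. `reward_model(x, a)` is any learned reward predictor."""
    values = []
    for x, actions, probs in zip(contexts, candidate_actions, new_policy_probs_per_action):
        predicted = np.array([reward_model(x, a) for a in actions])
        values.append(float(np.dot(np.asarray(probs), predicted)))
    return float(np.mean(values))
```

IPS is unbiased but becomes infeasible when the logging policy rarely explores the actions a new ranking model would choose (the setting the abstract describes); the reward-model route trades that variance for the bias of the learned reward predictor.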
Related papers
- Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models [54.132297393662654]
We introduce a hybrid method that fine-tunes cutting-edge diffusion models by optimizing reward models through RL.
We demonstrate the capability of our approach to outperform the best designs in offline data, leveraging the extrapolation capabilities of reward models.
arXiv Detail & Related papers (2024-05-30T03:57:29Z) - Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly [79.07074710460012]
The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention.
An increasing number of transfer-based methods have been developed to fool black-box DNN models.
We establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods.
arXiv Detail & Related papers (2023-11-02T15:35:58Z) - Reject option models comprising out-of-distribution detection [6.746400031322727]
The optimal prediction strategy for out-of-distribution setups is a fundamental question in machine learning.
We propose three reject option models for OOD setups.
We establish that all the proposed models, despite their different formulations, share a common class of optimal strategies.
arXiv Detail & Related papers (2023-07-11T12:09:14Z) - Exploring validation metrics for offline model-based optimisation with diffusion models [50.404829846182764]
In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle.
While an approximation to the ground-truth oracle can be trained and used in its place during model validation to measure the mean reward over generated candidates, the evaluation is approximate and vulnerable to adversarial examples.
This is encapsulated under our proposed evaluation framework which is also designed to measure extrapolation.
arXiv Detail & Related papers (2022-11-19T16:57:37Z) - Off-policy evaluation for learning-to-rank via interpolating the item-position model and the position-based model [83.83064559894989]
A critical need for industrial recommender systems is the ability to evaluate recommendation policies offline, before deploying them to production.
We develop a new estimator that mitigates the problems of the two most popular off-policy estimators for rankings.
In particular, the new estimator, called INTERPOL, addresses the bias of a potentially misspecified position-based model.
arXiv Detail & Related papers (2022-10-15T17:22:30Z) - A Recommendation Approach based on Similarity-Popularity Models of Complex Networks [1.385805101975528]
This work proposes a novel recommendation method based on complex networks generated by a similarity-popularity model to predict unseen ratings.
We first construct a model of a network having users and items as nodes from observed ratings and then use it to predict unseen ratings.
The proposed approach is implemented and experimentally compared against baseline and state-of-the-art recommendation methods on 21 datasets from various domains.
arXiv Detail & Related papers (2022-09-29T11:00:06Z) - On the model-based stochastic value gradient for continuous reinforcement learning [50.085645237597056]
We show that simple model-based agents can outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward.
Our findings suggest that model-based policy evaluation deserves closer attention.
arXiv Detail & Related papers (2020-08-28T17:58:29Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z) - A Zero-Shot based Fingerprint Presentation Attack Detection System [8.676298469169174]
We propose a novel Zero-Shot Presentation Attack Detection Model to guarantee the generalization of the PAD model.
The proposed ZSPAD-Model is based on a generative model and does not use any negative samples during training.
In order to improve the performance of the proposed model, 9 confidence scores are discussed in this article.
arXiv Detail & Related papers (2020-02-12T10:52:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.