Learning List-wise Representation in Reinforcement Learning for Ads Allocation with Multiple Auxiliary Tasks
- URL: http://arxiv.org/abs/2204.00888v1
- Date: Sat, 2 Apr 2022 15:53:37 GMT
- Title: Learning List-wise Representation in Reinforcement Learning for Ads Allocation with Multiple Auxiliary Tasks
- Authors: Guogang Liao, Ze Wang, Xiaowen Shi, Xiaoxu Wu, Chuheng Zhang, Yongkang Wang, Xingxing Wang, Dong Wang
- Abstract summary: We propose a novel algorithm to learn a better representation by leveraging task-specific signals on the Meituan food delivery platform.
Specifically, we propose three different types of auxiliary tasks that are based on reconstruction, prediction, and contrastive learning respectively.
- Score: 14.9065245548275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the recent prevalence of reinforcement learning (RL), there has been
tremendous interest in utilizing RL for ads allocation on recommendation
platforms (e.g., e-commerce and news feed sites). For better performance,
recent RL-based ads allocation agents make decisions based on representations
of list-wise item arrangements. This results in a high-dimensional state-action
space, which makes it difficult to learn an efficient and generalizable
list-wise representation. To address this problem, we propose a novel algorithm
that learns a better representation by leveraging task-specific signals on the
Meituan food delivery platform. Specifically, we propose three types of
auxiliary tasks, based on reconstruction, prediction, and contrastive learning
respectively. We conduct extensive offline experiments on the effectiveness of
these auxiliary tasks and test our method on a real-world food delivery
platform. The experimental results show that our method learns better list-wise
representations and achieves higher revenue for the platform.
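The paper itself publishes no code; as a minimal sketch of the contrastive-learning auxiliary task described in the abstract, the snippet below computes an InfoNCE-style loss between two augmented views of a batch of list-wise representations (for instance, views produced by dropping or reordering items within each list). The function name, the augmentation scheme, and the plain-NumPy implementation are our assumptions for illustration, not the authors' actual method.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss: each anchor's positive is the matching
    row in `positives`; all other rows in the batch act as negatives.
    (Illustrative sketch only -- not the paper's implementation.)"""
    # L2-normalize so dot products become cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # the correct (anchor, positive) pairings lie on the diagonal
    return -np.mean(np.diag(log_probs))

# Toy example: two lightly perturbed "views" of the same batch of
# list representations (hypothetical stand-ins for list augmentations).
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 16))
view_a = base + 0.01 * rng.normal(size=base.shape)
view_b = base + 0.01 * rng.normal(size=base.shape)
loss = info_nce_loss(view_a, view_b)
```

In such a setup the contrastive loss is typically added, with a weighting coefficient, to the RL objective, so that the encoder is pushed to keep representations of the same list consistent across augmentations while separating different lists.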
Related papers
- Offline Multitask Representation Learning for Reinforcement Learning [86.26066704016056]
We study offline multitask representation learning in reinforcement learning (RL).
We propose a new algorithm called MORL for offline multitask representation learning.
Our theoretical results demonstrate the benefits of using the learned representation from the upstream offline task instead of directly learning the representation of the low-rank model.
arXiv Detail & Related papers (2024-03-18T08:50:30Z)
- Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
- e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce [9.46186546774799]
We propose a contrastive learning framework that aligns language and visual models using unlabeled raw product text and images.
We present techniques we used to train large-scale representation learning models and share solutions that address domain-specific challenges.
arXiv Detail & Related papers (2022-07-01T05:16:47Z)
- Contrastive Learning as Goal-Conditioned Reinforcement Learning [147.28638631734486]
In reinforcement learning (RL), it is easier to solve a task if given a good representation.
While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable.
We show (contrastive) representation learning methods can be cast as RL algorithms in their own right.
arXiv Detail & Related papers (2022-06-15T14:34:15Z)
- Towards Universal Sequence Representation Learning for Recommender Systems [98.02154164251846]
We present a novel universal sequence representation learning approach, named UniSRec.
The proposed approach utilizes the associated description text of items to learn transferable representations across different recommendation scenarios.
Our approach can be effectively transferred to new recommendation domains or platforms in a parameter-efficient way.
arXiv Detail & Related papers (2022-06-13T07:21:56Z)
- Contrastive Learning from Demonstrations [0.0]
We show that these representations are applicable for imitating several robotic tasks, including pick and place.
We optimize a recently proposed self-supervised learning algorithm by applying contrastive learning to enhance task-relevant information.
arXiv Detail & Related papers (2022-01-30T13:36:07Z)
- Learning Temporally-Consistent Representations for Data-Efficient Reinforcement Learning [3.308743964406687]
$k$-Step Latent (KSL) is a representation learning method that enforces temporal consistency of representations.
KSL produces encoders that generalize better to new tasks unseen during training.
arXiv Detail & Related papers (2021-10-11T00:16:43Z)
- Techniques Toward Optimizing Viewability in RTB Ad Campaigns Using Reinforcement Learning [0.0]
Reinforcement learning (RL) is an effective technique for training decision-making agents through interactions with their environment.
In digital advertising, real-time bidding (RTB) is a common method of allocating advertising inventory through real-time auctions.
arXiv Detail & Related papers (2021-05-21T21:56:12Z)
- Reinforcement Learning with Prototypical Representations [114.35801511501639]
Proto-RL is a self-supervised framework that ties representation learning with exploration through prototypical representations.
These prototypes simultaneously serve as a summarization of the exploratory experience of an agent as well as a basis for representing observations.
This enables state-of-the-art downstream policy learning on a set of difficult continuous control tasks.
arXiv Detail & Related papers (2021-02-22T18:56:34Z)
- Self-supervised Learning for Large-scale Item Recommendations [18.19202958502061]
Large scale recommender models find most relevant items from huge catalogs.
With millions to billions of items in the corpus, users tend to provide feedback for a very small set of them.
We propose a multi-task self-supervised learning framework for large-scale item recommendations.
arXiv Detail & Related papers (2020-07-25T06:21:43Z)
- Privileged Information Dropout in Reinforcement Learning [56.82218103971113]
Using privileged information during training can improve the sample efficiency and performance of machine learning systems.
In this work, we investigate Privileged Information Dropout (PID) for achieving the latter, which can be applied equally to value-based and policy-based reinforcement learning algorithms.
arXiv Detail & Related papers (2020-05-19T05:32:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.