Deep Reinforcement Learning-Based Product Recommender for Online
Advertising
- URL: http://arxiv.org/abs/2102.00333v1
- Date: Sat, 30 Jan 2021 23:05:04 GMT
- Title: Deep Reinforcement Learning-Based Product Recommender for Online
Advertising
- Authors: Milad Vaali Esfahaani, Yanbo Xue, and Peyman Setoodeh
- Abstract summary: This paper compares value-based and policy-based deep RL algorithms for designing recommender systems for online advertising.
The designed recommender systems aim at maximising the click-through rate (CTR) for the recommended items.
- Score: 1.7778609937758327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In online advertising, recommender systems try to propose items from a list
of products to potential customers according to their interests. Such systems
have been increasingly deployed in E-commerce due to the rapid growth of
information technology and availability of large datasets. The ever-increasing
progress in the field of artificial intelligence has provided powerful tools
for dealing with such real-life problems. Deep reinforcement learning (RL), which
deploys deep neural networks as universal function approximators, can be viewed
as a valid approach for the design and implementation of recommender systems. This
paper provides a comparative study between value-based and policy-based deep RL
algorithms for designing recommender systems for online advertising. The
RecoGym environment is adopted for training these RL-based recommender systems,
where the long short-term memory (LSTM) is deployed to build value and policy
networks in these two approaches, respectively. LSTM is used to take account of
the key role that order plays in the sequence of item observations by users.
The designed recommender systems aim at maximising the click-through rate (CTR)
for the recommended items. Finally, guidelines are provided for choosing proper
RL algorithms for different scenarios that the recommender system is expected
to handle.
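As a rough illustration of the distinction the abstract draws, the sketch below contrasts how a value-based agent and a policy-based agent might pick the next product to recommend, and how CTR is computed. It is a minimal sketch under stated assumptions, not the authors' implementation: in the paper the scores would come from LSTM value and policy networks trained in RecoGym, whereas here they are plain lists supplied by hand. All function names (`greedy_value_action`, `sample_policy_action`, `click_through_rate`) are hypothetical.

```python
import math
import random


def greedy_value_action(q_values, epsilon, rng):
    """Value-based choice: epsilon-greedy over estimated click values.

    q_values is assumed to hold one estimated value per product
    (hypothetical stand-in for the output of a value network).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: random product
    # exploit: product with the highest estimated value
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_policy_action(logits, rng):
    """Policy-based choice: sample a product from a softmax policy.

    logits is assumed to hold one preference score per product
    (hypothetical stand-in for the output of a policy network).
    """
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def click_through_rate(clicks, impressions):
    """CTR = clicks / impressions, defined as 0.0 when nothing was shown."""
    return clicks / impressions if impressions else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [0.1, 0.7, 0.2]  # made-up scores for three products
    print("value-based pick:", greedy_value_action(scores, 0.1, rng))
    print("policy-based pick:", sample_policy_action(scores, rng))
    print("CTR:", click_through_rate(3, 200))
```

The value-based rule commits to the top-scoring product except for occasional random exploration, while the policy-based rule explores by sampling in proportion to the policy's probabilities.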
Related papers
- Large Language Model Empowered Embedding Generator for Sequential Recommendation [57.49045064294086]
Large Language Models (LLMs) have the potential to understand the semantic connections between items, regardless of their popularity.
We present LLMEmb, an innovative technique that harnesses LLM to create item embeddings that bolster the performance of Sequential Recommender Systems.
arXiv Detail & Related papers (2024-09-30T03:59:06Z)
- LANE: Logic Alignment of Non-tuning Large Language Models and Online Recommendation Systems for Explainable Reason Generation [15.972926854420619]
Leveraging large language models (LLMs) offers new opportunities for comprehensive recommendation logic generation.
Fine-tuning LLMs for recommendation tasks incurs high computational costs and alignment issues with existing systems.
The proposed strategy, LANE, aligns LLMs with online recommendation systems without additional LLM tuning.
arXiv Detail & Related papers (2024-07-03T06:20:31Z)
- EASRec: Elastic Architecture Search for Efficient Long-term Sequential
Recommender Systems [82.76483989905961]
Current Sequential Recommender Systems (SRSs) suffer from computational and resource inefficiencies.
We develop the Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems (EASRec).
EASRec introduces data-aware gates that leverage historical information from the input data batch to improve the performance of the recommendation network.
arXiv Detail & Related papers (2024-02-01T07:22:52Z)
- Recommender Systems in the Era of Large Language Models (LLMs) [62.0129013439038]
Large Language Models (LLMs) have revolutionized the fields of Natural Language Processing (NLP) and Artificial Intelligence (AI).
We conduct a comprehensive review of LLM-empowered recommender systems from various aspects including Pre-training, Fine-tuning, and Prompting.
arXiv Detail & Related papers (2023-07-05T06:03:40Z)
- A Survey on Large Language Models for Recommendation [77.91673633328148]
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP).
This survey presents a taxonomy that categorizes these models into two major paradigms: Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec).
arXiv Detail & Related papers (2023-05-31T13:51:26Z)
- Towards High-Order Complementary Recommendation via Logical Reasoning
Network [19.232457960085625]
We propose a logical reasoning network, LOGIREC, to learn embeddings of products.
LOGIREC is capable of capturing the asymmetric complementary relationship between products.
We also propose a hybrid network that is jointly optimized for learning a more generic product representation.
arXiv Detail & Related papers (2022-12-09T16:27:03Z)
- Improving Long-Term Metrics in Recommendation Systems using
Short-Horizon Offline RL [56.20835219296896]
We study session-based recommendation scenarios where we want to recommend items to users during sequential interactions to improve their long-term utility.
We develop a new batch RL algorithm called Short Horizon Policy Improvement (SHPI) that approximates policy-induced distribution shifts across sessions.
arXiv Detail & Related papers (2021-06-01T15:58:05Z)
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on this approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
- Stacked Auto Encoder Based Deep Reinforcement Learning for Online
Resource Scheduling in Large-Scale MEC Networks [44.40722828581203]
An online resource scheduling framework is proposed for minimizing the sum of weighted task latency for all the Internet of things (IoT) users.
A deep reinforcement learning (DRL) based solution is proposed, which includes the following components.
A preserved and prioritized experience replay (2p-ER) is introduced to assist the DRL in training the policy network and finding the optimal offloading policy.
arXiv Detail & Related papers (2020-01-24T23:01:15Z)