Deep Reinforcement Learning-Based Product Recommender for Online Advertising
- URL: http://arxiv.org/abs/2102.00333v1
- Date: Sat, 30 Jan 2021 23:05:04 GMT
- Title: Deep Reinforcement Learning-Based Product Recommender for Online Advertising
- Authors: Milad Vaali Esfahaani, Yanbo Xue, and Peyman Setoodeh
- Abstract summary: This paper compares value-based and policy-based deep RL algorithms for designing recommender systems for online advertising.
The designed recommender systems aim at maximising the click-through rate (CTR) for the recommended items.
- Score: 1.7778609937758327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In online advertising, recommender systems try to propose items from a list
of products to potential customers according to their interests. Such systems
have been increasingly deployed in E-commerce due to the rapid growth of
information technology and availability of large datasets. The ever-increasing
progress in the field of artificial intelligence has provided powerful tools
for dealing with such real-life problems. Deep reinforcement learning (RL), which
deploys deep neural networks as universal function approximators, can be viewed
as a valid approach for the design and implementation of recommender systems. This
paper provides a comparative study between value-based and policy-based deep RL
algorithms for designing recommender systems for online advertising. The
RecoGym environment is adopted for training these RL-based recommender systems,
where long short-term memory (LSTM) is deployed to build value and policy
networks in these two approaches, respectively. LSTM is used to take account of
the key role that order plays in the sequence of item observations by users.
The designed recommender systems aim at maximising the click-through rate (CTR)
for the recommended items. Finally, guidelines are provided for choosing proper
RL algorithms for different scenarios that the recommender system is expected
to handle.
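The contrast between value-based and policy-based agents described in the abstract can be illustrated with a deliberately simplified sketch. This is not the paper's method: it uses neither RecoGym nor LSTM networks, and it ignores the order of item observations. It assumes a hypothetical toy setting in which each item has a fixed click probability, and every class and function name (`ValueAgent`, `PolicyAgent`, `click_through_rate`) is invented for illustration. The value-based agent keeps a running click estimate per item and picks greedily with occasional exploration; the policy-based agent keeps softmax preferences and adjusts them with a policy-gradient step.

```python
import math
import random


class ValueAgent:
    """Value-based: estimate a click value per item, act epsilon-greedily."""

    def __init__(self, n_items, epsilon=0.1):
        self.q = [0.0] * n_items   # running mean click estimate per item
        self.n = [0] * n_items     # how often each item was recommended
        self.epsilon = epsilon

    def act(self, rng):
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.q))                   # explore
        return max(range(len(self.q)), key=self.q.__getitem__)  # exploit

    def learn(self, item, reward):
        self.n[item] += 1
        self.q[item] += (reward - self.q[item]) / self.n[item]


class PolicyAgent:
    """Policy-based: softmax over preferences, updated by policy gradient."""

    def __init__(self, n_items, lr=0.1):
        self.h = [0.0] * n_items   # preference (logit) per item
        self.lr = lr
        self.baseline = 0.0        # running mean reward, reduces variance
        self.t = 0

    def probs(self):
        m = max(self.h)
        exps = [math.exp(x - m) for x in self.h]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, rng):
        return rng.choices(range(len(self.h)), weights=self.probs())[0]

    def learn(self, item, reward):
        self.t += 1
        self.baseline += (reward - self.baseline) / self.t
        advantage = reward - self.baseline
        p = self.probs()
        for i in range(len(self.h)):
            indicator = 1.0 if i == item else 0.0
            self.h[i] += self.lr * advantage * (indicator - p[i])


def click_through_rate(agent, click_probs, n_steps, rng):
    """Recommend n_steps items and return clicks / recommendations (the CTR)."""
    clicks = 0
    for _ in range(n_steps):
        item = agent.act(rng)
        reward = 1 if rng.random() < click_probs[item] else 0
        agent.learn(item, reward)
        clicks += reward
    return clicks / n_steps


if __name__ == "__main__":
    probs = [0.02, 0.10, 0.05]  # assumed click probabilities, purely illustrative
    for agent in (ValueAgent(len(probs)), PolicyAgent(len(probs))):
        ctr = click_through_rate(agent, probs, 5000, random.Random(0))
        print(type(agent).__name__, round(ctr, 4))
```

In the sequential setting the paper studies, the per-item tables above would be replaced by networks that read the user's observation history, which is where the LSTM comes in.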
Related papers
- Large Language Model driven Policy Exploration for Recommender Systems [50.70228564385797]
Offline RL policies trained on static user data are vulnerable to distribution shift when deployed in dynamic online environments.
Online RL-based RS also face challenges in production deployment due to the risks of exposing users to untrained or unstable policies.
Large Language Models (LLMs) offer a promising solution to mimic user objectives and preferences for pre-training policies offline.
We propose an Interaction-Augmented Learned Policy (iALP) that utilizes user preferences distilled from an LLM.
arXiv Detail & Related papers (2025-01-23T16:37:44Z)
- Efficient and Deployable Knowledge Infusion for Open-World Recommendations via Large Language Models [53.547190001324665]
We propose REKI to acquire two types of external knowledge about users and items from large language models (LLMs).
We develop individual knowledge extraction and collective knowledge extraction tailored for different scales of scenarios, effectively reducing offline resource consumption.
Experiments demonstrate that REKI outperforms state-of-the-art baselines and is compatible with many recommendation algorithms and tasks.
arXiv Detail & Related papers (2024-08-20T03:45:24Z)
- LANE: Logic Alignment of Non-tuning Large Language Models and Online Recommendation Systems for Explainable Reason Generation [15.972926854420619]
Leveraging large language models (LLMs) offers new opportunities for comprehensive recommendation logic generation.
Fine-tuning LLM models for recommendation tasks incurs high computational costs and alignment issues with existing systems.
Our proposed strategy, LANE, aligns LLMs with online recommendation systems without additional LLM tuning.
arXiv Detail & Related papers (2024-07-03T06:20:31Z)
- SUBER: An RL Environment with Simulated Human Behavior for Recommender Systems [18.716102193517315]
Reinforcement learning (RL) has gained popularity in the realm of recommender systems.
This work introduces a modular and novel framework to train RL-based recommender systems.
The software, including the RL environment, is publicly available on GitHub.
arXiv Detail & Related papers (2024-06-01T11:56:08Z)
- Recommender Systems in the Era of Large Language Models (LLMs) [62.0129013439038]
Large Language Models (LLMs) have revolutionized the fields of Natural Language Processing (NLP) and Artificial Intelligence (AI).
We conduct a comprehensive review of LLM-empowered recommender systems from various aspects including Pre-training, Fine-tuning, and Prompting.
arXiv Detail & Related papers (2023-07-05T06:03:40Z)
- A Survey on Large Language Models for Recommendation [77.91673633328148]
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP).
This survey presents a taxonomy that categorizes these models into two major paradigms: Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec).
arXiv Detail & Related papers (2023-05-31T13:51:26Z)
- Towards High-Order Complementary Recommendation via Logical Reasoning Network [19.232457960085625]
We propose a logical reasoning network, LOGIREC, to learn embeddings of products.
LOGIREC is capable of capturing the asymmetric complementary relationship between products.
We also propose a hybrid network that is jointly optimized for learning a more generic product representation.
arXiv Detail & Related papers (2022-12-09T16:27:03Z)
- Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL [56.20835219296896]
We study session-based recommendation scenarios where we want to recommend items to users during sequential interactions to improve their long-term utility.
We develop a new batch RL algorithm called Short Horizon Policy Improvement (SHPI) that approximates policy-induced distribution shifts across sessions.
arXiv Detail & Related papers (2021-06-01T15:58:05Z)
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on this approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
- Stacked Auto Encoder Based Deep Reinforcement Learning for Online Resource Scheduling in Large-Scale MEC Networks [44.40722828581203]
An online resource scheduling framework is proposed for minimizing the sum of weighted task latency for all the Internet of things (IoT) users.
A deep reinforcement learning (DRL) based solution is proposed, which includes the following components.
A preserved and prioritized experience replay (2p-ER) is introduced to assist the DRL to train the policy network and find the optimal offloading policy.
arXiv Detail & Related papers (2020-01-24T23:01:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.