Multi-objective Optimization of Notifications Using Offline
Reinforcement Learning
- URL: http://arxiv.org/abs/2207.03029v1
- Date: Thu, 7 Jul 2022 00:53:08 GMT
- Title: Multi-objective Optimization of Notifications Using Offline
Reinforcement Learning
- Authors: Prakruthi Prabhakar, Yiping Yuan, Guangyu Yang, Wensheng Sun, Ajith
Muralidharan
- Abstract summary: We formulate the near-real-time notification decision problem as a Markov Decision Process.
We propose an end-to-end offline reinforcement learning framework to optimize sequential notification decisions.
- Score: 1.2303635283131926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mobile notification systems play a major role in a variety of applications to
communicate, send alerts and reminders to the users to inform them about news,
events or messages. In this paper, we formulate the near-real-time notification
decision problem as a Markov Decision Process where we optimize for multiple
objectives in the rewards. We propose an end-to-end offline reinforcement
learning framework to optimize sequential notification decisions. We address
the challenge of offline learning using a Double Deep Q-network method based on
Conservative Q-learning that mitigates the distributional shift problem and
Q-value overestimation. We illustrate our fully-deployed system and demonstrate
the performance and benefits of the proposed approach through both offline and
online experiments.
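The abstract names two ingredients: a Double Deep Q-network target (the online network selects the next action, a slower target network evaluates it) and a Conservative Q-learning (CQL) regularizer that pushes down Q-values for actions unseen in the offline log. A minimal tabular sketch of one such update is below; the function and variable names are illustrative, not taken from the paper, and the real system uses neural function approximation rather than a table.

```python
import math

def cql_double_q_update(q_online, q_target, batch,
                        gamma=0.99, lr=0.1, alpha_cql=1.0):
    """One conservative Double-Q update on a tabular Q function.

    q_online, q_target: nested lists, q[state][action] -> value.
    batch: list of (s, a, r, s_next, done) transitions drawn from a
    fixed offline log (no further environment interaction).
    """
    for s, a, r, s_next, done in batch:
        # Double DQN target: the online table selects the next action,
        # the target table evaluates it (mitigates overestimation).
        a_star = max(range(len(q_online[s_next])),
                     key=lambda i: q_online[s_next][i])
        td_target = r + (0.0 if done else gamma * q_target[s_next][a_star])
        td_error = td_target - q_online[s][a]

        # CQL regularizer: gradient descent on
        # logsumexp_a Q(s, a) - Q(s, a_data), which lowers Q-values of
        # actions absent from the dataset and counters distributional shift.
        m = max(q_online[s])
        lse = m + math.log(sum(math.exp(q - m) for q in q_online[s]))
        for i in range(len(q_online[s])):
            softmax_i = math.exp(q_online[s][i] - lse)
            grad = alpha_cql * softmax_i - (alpha_cql if i == a else 0.0)
            q_online[s][i] -= lr * grad

        # Standard TD step on the logged (s, a) pair.
        q_online[s][a] += lr * td_error
    return q_online
```

On a single logged transition, the logged action's value rises toward the TD target while the unlogged action's value is pushed below zero, which is exactly the conservative behavior the paper relies on for offline learning.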
Related papers
- Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance [59.71186244597394]
We introduce an effective approach to stabilize the proposal-target matching in point-based methods.
We propose Auxiliary Point Guidance (APG) to provide clear and effective guidance for proposal selection and optimization.
We also develop Implicit Feature Interpolation (IFI) to enable adaptive feature extraction in diverse crowd scenarios.
arXiv Detail & Related papers (2024-05-17T07:23:27Z) - A Semantic-Aware Multiple Access Scheme for Distributed, Dynamic 6G-Based Applications [14.51946231794179]
This paper introduces a novel formulation for the problem of multiple access to the wireless spectrum.
It aims to optimize the utilization-fairness trade-off, using the $\alpha$-fairness metric.
A Semantic-Aware Multi-Agent Double and Dueling Deep Q-Learning (SAMA-D3QL) technique is proposed.
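The $\alpha$-fairness metric mentioned in this entry is a standard family of utility functions that interpolates between objectives as $\alpha$ varies. A short sketch of the usual definition (not code from the paper itself):

```python
import math

def alpha_fair_utility(x, alpha):
    """Alpha-fairness utility of a positive allocation x.

    alpha = 0 recovers total utilization (utilitarian),
    alpha = 1 recovers proportional fairness (log utility),
    alpha -> infinity approaches max-min fairness.
    """
    if x <= 0:
        raise ValueError("allocation must be positive")
    if alpha == 1:
        return math.log(x)
    return x ** (1 - alpha) / (1 - alpha)
```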
arXiv Detail & Related papers (2024-01-12T00:32:38Z) - Query-Dependent Prompt Evaluation and Optimization with Offline Inverse
RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z) - Efficient Communication via Self-supervised Information Aggregation for
Online and Offline Multi-agent Reinforcement Learning [12.334522644561591]
We argue that efficient message aggregation is essential for good coordination in cooperative Multi-Agent Reinforcement Learning (MARL).
We propose Multi-Agent communication via Self-supervised Information Aggregation (MASIA), where agents can aggregate the received messages into compact representations with high relevance to augment the local policy.
We build offline benchmarks for multi-agent communication, which, to our knowledge, are the first of their kind.
arXiv Detail & Related papers (2023-02-19T16:02:16Z) - Age of Semantics in Cooperative Communications: To Expedite Simulation
Towards Real via Offline Reinforcement Learning [53.18060442931179]
We propose the age of semantics (AoS) for measuring semantics freshness of status updates in a cooperative relay communication system.
We derive an online deep actor-critic (DAC) learning scheme under the on-policy temporal difference learning framework.
We then put forward a novel offline DAC scheme, which estimates the optimal control policy from a previously collected dataset.
arXiv Detail & Related papers (2022-09-19T11:55:28Z) - A State Transition Model for Mobile Notifications via Survival Analysis [10.638942431625381]
We propose a state transition framework to quantitatively evaluate the effectiveness of notifications.
We develop a survival model for badging notifications assuming a log-linear structure and a Weibull distribution.
Our results show that this model is more flexible in applications and achieves superior prediction accuracy compared with a logistic regression model.
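This entry combines a log-linear structure with a Weibull distribution, which corresponds to a Weibull accelerated-failure-time survival model: the log of the Weibull scale is a linear function of the features. A minimal sketch of the implied survival function follows; the feature and coefficient names are hypothetical, not from the paper.

```python
import math

def weibull_survival(t, x, beta, shape_k):
    """P(T > t) under a Weibull survival model with a log-linear scale.

    log(scale) = dot(x, beta), so each coefficient multiplicatively
    stretches or shrinks the expected time-to-event.
    x, beta: feature vector and (hypothetical) fitted coefficients.
    shape_k: Weibull shape parameter (k > 1 means rising hazard).
    """
    scale = math.exp(sum(xi * bi for xi, bi in zip(x, beta)))
    return math.exp(-((t / scale) ** shape_k))
```

With all coefficients at zero the scale is 1, so survival decays from 1 at t = 0 to exp(-1) at t = 1, and is monotonically decreasing in t.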
arXiv Detail & Related papers (2022-07-07T05:38:39Z) - Large-Scale Sequential Learning for Recommender and Engineering Systems [91.3755431537592]
In this thesis, we focus on the design of automatic algorithms that provide personalized ranking by adapting to the current conditions.
For the former, we propose a novel algorithm called SAROS that takes both kinds of feedback into account for learning over the sequence of interactions.
The proposed idea of taking neighbouring lines into account shows statistically significant improvements over the initial approach for fault detection in power grids.
arXiv Detail & Related papers (2022-05-13T21:09:41Z) - Offline Reinforcement Learning for Mobile Notifications [1.965345368500676]
Mobile notification systems have taken a major role in driving and maintaining user engagement for online platforms.
Most machine learning applications in notification systems are built around response-prediction models.
We argue that reinforcement learning is a better framework for notification systems in terms of performance and iteration speed.
arXiv Detail & Related papers (2022-02-04T22:22:22Z) - Cellular traffic offloading via Opportunistic Networking with
Reinforcement Learning [0.5758073912084364]
We propose an adaptive offloading solution based on the Reinforcement Learning framework.
We evaluate and compare the performance of two well-known learning algorithms: Actor-Critic and Q-Learning.
Our solution achieves a higher level of offloading compared with other state-of-the-art approaches.
arXiv Detail & Related papers (2021-10-01T13:34:12Z) - A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching.
We conduct large scale online A/B tests on DiDi's ride-dispatching platform.
Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z) - Learning to Recover Reasoning Chains for Multi-Hop Question Answering
via Cooperative Games [66.98855910291292]
We propose a new problem of learning to recover reasoning chains from weakly supervised signals.
Two separate models handle how the evidence passages are selected and how the selected passages are connected.
For evaluation, we created benchmarks based on two multi-hop QA datasets.
arXiv Detail & Related papers (2020-04-06T03:54:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.