Enhancing Prediction Models with Reinforcement Learning
- URL: http://arxiv.org/abs/2412.06791v1
- Date: Thu, 21 Nov 2024 12:24:11 GMT
- Title: Enhancing Prediction Models with Reinforcement Learning
- Authors: Karol Radziszewski, Piotr Ociepka,
- Abstract summary: We present a large-scale news recommendation system implemented at Ringier Axel Springer Polska.
The system, named Aureus, integrates a variety of algorithms, including multi-armed bandit methods and deep learning models based on large language models.
- Score: 0.0
- License:
- Abstract: We present a large-scale news recommendation system implemented at Ringier Axel Springer Polska, focusing on enhancing prediction models with reinforcement learning techniques. The system, named Aureus, integrates a variety of algorithms, including multi-armed bandit methods and deep learning models based on large language models (LLMs). We detail the architecture and implementation of Aureus, emphasizing the significant improvements in online metrics achieved by combining ranking prediction models with reinforcement learning. The paper further explores the impact of different models mixing on key business performance indicators. Our approach effectively balances the need for personalized recommendations with the ability to adapt to rapidly changing news content, addressing common challenges such as the cold start problem and content freshness. The results of online evaluation demonstrate the effectiveness of the proposed system in a real-world production environment.
Related papers
- Online-BLS: An Accurate and Efficient Online Broad Learning System for Data Stream Classification [52.251569042852815]
We introduce an online broad learning system framework with closed-form solutions for each online update.
We design an effective weight estimation algorithm and an efficient online updating strategy.
Our framework is naturally extended to data stream scenarios with concept drift and exceeds state-of-the-art baselines.
arXiv Detail & Related papers (2025-01-28T13:21:59Z) - Boosting Alignment for Post-Unlearning Text-to-Image Generative Models [55.82190434534429]
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data.
This often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns.
We propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives.
arXiv Detail & Related papers (2024-12-09T21:36:10Z) - A Collaborative Ensemble Framework for CTR Prediction [73.59868761656317]
We propose a novel framework, Collaborative Ensemble Training Network (CETNet), to leverage multiple distinct models.
Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning.
We validate our framework on three public datasets and a large-scale industrial dataset from Meta.
arXiv Detail & Related papers (2024-11-20T20:38:56Z) - Analyzing Persuasive Strategies in Meme Texts: A Fusion of Language Models with Paraphrase Enrichment [0.23020018305241333]
This paper describes our approach to hierarchical multi-label detection of persuasion techniques in meme texts.
The scope of the study encompasses enhancing model performance through innovative training techniques and data augmentation strategies.
arXiv Detail & Related papers (2024-07-01T20:25:20Z) - Retrieval-based Knowledge Transfer: An Effective Approach for Extreme
Large Language Model Compression [64.07696663255155]
Large-scale pre-trained language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks.
However, the massive size of these models poses huge challenges for their deployment in real-world applications.
We introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT) which effectively transfers the knowledge of LLMs to extremely small-scale models.
arXiv Detail & Related papers (2023-10-24T07:58:20Z) - Reinforcement Learning for Topic Models [3.42658286826597]
We apply reinforcement learning techniques to topic modeling by replacing the variational autoencoder in ProdLDA with a continuous action space reinforcement learning policy.
We introduce several modifications: modernize the neural network architecture, weight the ELBO loss, use contextual embeddings, and monitor the learning process via computing topic diversity and coherence.
arXiv Detail & Related papers (2023-05-08T16:41:08Z) - Weighted Ensemble Self-Supervised Learning [67.24482854208783]
Ensembling has proven to be a powerful technique for boosting model performance.
We develop a framework that permits data-dependent weighted cross-entropy losses.
Our method outperforms both in multiple evaluation metrics on ImageNet-1K.
arXiv Detail & Related papers (2022-11-18T02:00:17Z) - Improving Sample Efficiency of Deep Learning Models in Electricity
Market [0.41998444721319217]
We propose a general framework, namely Knowledge-Augmented Training (KAT), to improve the sample efficiency.
We propose a novel data augmentation technique to generate some synthetic data, which are later processed by an improved training strategy.
Modern learning theories demonstrate the effectiveness of our method in terms of effective prediction error feedbacks, a reliable loss function, and rich gradient noises.
arXiv Detail & Related papers (2022-10-11T16:35:13Z) - Learning New Skills after Deployment: Improving open-domain
internet-driven dialogue with human feedback [22.92577324751342]
We study how to improve internet-driven conversational skills in a learning framework.
We collect deployment data, and collect various types of human feedback.
We find the recently introduced Director model shows significant improvements over other existing approaches.
arXiv Detail & Related papers (2022-08-05T16:41:46Z) - Hybrid Model with Time Modeling for Sequential Recommender Systems [0.15229257192293202]
Booking.com organized the WSDM WebTour 2021 Challenge, which aims to benchmark models to recommend the final city in a trip.
We conducted several experiments to test different state-of-the-art deep learning architectures for recommender systems.
Our experimental result shows that the improved NARM outperforms all other state-of-the-art benchmark methods.
arXiv Detail & Related papers (2021-03-07T19:28:22Z) - Enhancing Dialogue Generation via Multi-Level Contrastive Learning [57.005432249952406]
We propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query.
A Rank-aware (RC) network is designed to construct the multi-level contrastive optimization objectives.
We build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words.
arXiv Detail & Related papers (2020-09-19T02:41:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.