DynamiX: Dynamic Resource eXploration for Personalized Ad-Recommendations
- URL: http://arxiv.org/abs/2511.18331v1
- Date: Sun, 23 Nov 2025 08:10:33 GMT
- Title: DynamiX: Dynamic Resource eXploration for Personalized Ad-Recommendations
- Authors: Sohini Roychowdhury, Adam Holeman, Mohammad Amin, Feng Wei, Bhaskar Mehta, Srihari Reddy
- Abstract summary: We introduce Dynamix, a scalable, personalized sequence exploration framework for online ad-recommendation systems. Dynamix categorizes user engagements at the session and surface levels by leveraging correlations between dwell times and ad-conversion events. We show that Dynamix achieves significant cost-efficiency and performance improvements in online user-sequence-based recommendation models.
- Score: 5.168870928194366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For online ad-recommendation systems, processing complete user-ad-engagement histories is both computationally intensive and noise-prone. We introduce Dynamix, a scalable, personalized sequence exploration framework that optimizes event history processing using maximum relevance principles and self-supervised learning through Event Based Features (EBFs). Dynamix categorizes user engagements at the session and surface levels by leveraging correlations between dwell times and ad-conversion events. This enables targeted, event-level feature removal and selective feature boosting for certain user segments, thereby yielding training and inference efficiency wins without sacrificing ad-engagement prediction accuracy. Dynamic resource removal increases training and inference throughput by 1.15% and 1.8%, respectively, while dynamic feature boosting provides 0.033 NE gains and boosts inference QPS by 4.2% over baseline models. These results demonstrate that Dynamix achieves significant cost-efficiency and performance improvements in online user-sequence-based recommendation models. Self-supervised user segmentation and resource exploration can further boost complex feature selection strategies while optimizing for workflow and compute resources.
Related papers
- AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering [52.67783579040657]
AceGRPO is a machine learning system that prioritizes tasks at the agent's learning frontier to maximize learning efficiency. Our trained Ace-30B model achieves a 100% valid submission rate on MLE-Bench-Lite, approaches the performance of proprietary frontier models, and outperforms larger open-source baselines.
arXiv Detail & Related papers (2026-02-08T10:55:03Z) - Bridging VLMs and Embodied Intelligence with Deliberate Practice Policy Optimization [72.20212909644017]
Deliberate Practice Policy Optimization (DPPO) is a metacognitive "Metaloop" training framework. DPPO alternates between supervised fine-tuning (competence expansion) and reinforcement learning (skill refinement). Empirically, training a vision-language embodied model with DPPO, referred to as Pelican-VL 1.0, yields a 20.3% performance improvement over the base model. We are open-sourcing both the models and code, providing the first systematic framework that alleviates the data and resource bottleneck.
arXiv Detail & Related papers (2025-11-20T17:58:04Z) - Deep Reinforcement Learning for Ranking Utility Tuning in the Ad Recommender System at Pinterest [10.816672840498079]
The ranking utility function in an ad recommender system plays a central role in balancing values across the platform, advertisers, and users. Traditional manual tuning, while offering simplicity and interpretability, often yields suboptimal results. We propose a general Deep Reinforcement Learning framework for personalized utility tuning.
arXiv Detail & Related papers (2025-09-05T17:57:45Z) - Similarity-Based Supervised User Session Segmentation Method for Behavior Logs [0.6524460254566904]
We propose a supervised session segmentation method based on similarity features derived from action embeddings and attributes. We construct a manually annotated dataset from real browsing histories and evaluate the segmentation performance using F1-score, PR-AUC, and ROC-AUC.
arXiv Detail & Related papers (2025-08-22T05:47:42Z) - Enhancing Serendipity Recommendation System by Constructing Dynamic User Knowledge Graphs with Large Language Models [0.9262403397108375]
Large language models (LLMs) have demonstrated potential serendipity in recommendation, thanks to their extensive world knowledge and superior reasoning capabilities. We propose a method that leverages LLMs to dynamically construct user knowledge graphs, thereby enhancing the serendipity of recommendation systems.
arXiv Detail & Related papers (2025-08-06T02:52:09Z) - On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows [71.92083784393418]
Agentic AI systems, which autonomously plan and act, are becoming widespread, yet their success rate on complex tasks remains low. Inference-time alignment relies on three components: sampling, evaluation, and feedback. We introduce Iterative Agent Decoding (IAD), a procedure that repeatedly inserts feedback extracted from different forms of critiques.
arXiv Detail & Related papers (2025-04-02T17:40:47Z) - DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal [55.13854171147104]
Large Language Models (LLMs) have revolutionized various domains, including natural language processing, data analysis, and software development. We present Dynamic Action Re-Sampling (DARS), a novel inference-time compute scaling approach for coding agents. We evaluate our approach on the SWE-Bench Lite benchmark, demonstrating that this scaling strategy achieves a pass@k score of 55% with Claude 3.5 Sonnet V2.
arXiv Detail & Related papers (2025-03-18T14:02:59Z) - PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection [68.8373788348678]
Visual instruction tuning adapts pre-trained Multimodal Large Language Models to follow human instructions. PRISM is the first training-free framework for efficient visual instruction selection. It reduces the end-to-end time for data selection and model tuning to just 30% of conventional pipelines.
arXiv Detail & Related papers (2025-02-17T18:43:41Z) - Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision mechanism to accelerate inference by dynamically assigning resources to each data instance.
Our method reduces inference cost while maintaining the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z) - Augmenting Unsupervised Reinforcement Learning with Self-Reference [63.68018737038331]
Humans possess the ability to draw on past experiences explicitly when learning new tasks.
We propose the Self-Reference (SR) approach, an add-on module explicitly designed to leverage historical information.
Our approach achieves state-of-the-art results in terms of Interquartile Mean (IQM) performance and Optimality Gap reduction on the Unsupervised Reinforcement Learning Benchmark.
arXiv Detail & Related papers (2023-11-16T09:07:34Z) - Automatic tuning of hyper-parameters of reinforcement learning algorithms using Bayesian optimization with behavioral cloning [0.0]
In reinforcement learning (RL), the information content of data gathered by the learning agent is dependent on the setting of many hyper-parameters.
In this work, a novel approach for autonomous hyper-parameter setting using Bayesian optimization is proposed.
Experiments reveal promising results compared to other manual tweaking and optimization-based approaches.
arXiv Detail & Related papers (2021-12-15T13:10:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.