Constraint-Aware Generative Re-ranking for Multi-Objective Optimization in Advertising Feeds
- URL: http://arxiv.org/abs/2603.04227v1
- Date: Wed, 04 Mar 2026 16:09:36 GMT
- Title: Constraint-Aware Generative Re-ranking for Multi-Objective Optimization in Advertising Feeds
- Authors: Chenfei Li, Hantao Zhao, Weixi Yao, Ruiming Huang, Rongrong Lu, Geng Tian, Dongying Kong,
- Abstract summary: We propose a constraint-aware generative reranking framework that transforms constrained optimization into bounded neural decoding.<n>Our framework unifies sequence generation and reward estimation into a single network.<n> Experiments on large-scale industrial feeds and online A/B tests show that our method improves revenue and user engagement while meeting strict latency requirements.
- Score: 3.791872390898376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimizing reranking in advertising feeds is a constrained combinatorial problem, requiring simultaneous maximization of platform revenue and preservation of user experience. Recent generative ranking methods enable listwise optimization via autoregressive decoding, but their deployment is hindered by high inference latency and limited constraint handling. We propose a constraint-aware generative reranking framework that transforms constrained optimization into bounded neural decoding. Unlike prior approaches that separate generator and evaluator models, our framework unifies sequence generation and reward estimation into a single network. We further introduce constraint-aware reward pruning, integrating constraint satisfaction directly into decoding to efficiently generate optimal sequences. Experiments on large-scale industrial feeds and online A/B tests show that our method improves revenue and user engagement while meeting strict latency requirements, providing an efficient neural solution for constrained listwise optimization.
Related papers
- CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems [62.24576366776727]
We propose a latency-aware scheduling framework to minimize total inference latency.<n>We show that the proposed method significantly reduces cold-start latency compared to baseline strategies.
arXiv Detail & Related papers (2025-08-15T07:49:22Z) - LLM-guided Chemical Process Optimization with a Multi-Agent Approach [8.714038047141202]
We present a multi-agent LLM framework that autonomously infers operating constraints from minimal process descriptions.<n>Our AutoGen-based framework employs OpenAI's o3 model with specialized agents for constraint generation, parameter validation, simulation, and optimization guidance.
arXiv Detail & Related papers (2025-06-26T01:03:44Z) - The Larger the Merrier? Efficient Large AI Model Inference in Wireless Edge Networks [56.37880529653111]
The demand for large computation model (LAIM) services is driving a paradigm shift from traditional cloud-based inference to edge-based inference for low-latency, privacy-preserving applications.<n>In this paper, we investigate the LAIM-inference scheme, where a pre-trained LAIM is pruned and partitioned into on-device and on-server sub-models for deployment.
arXiv Detail & Related papers (2025-05-14T08:18:55Z) - Efficient Split Federated Learning for Large Language Models over Communication Networks [45.02252893286613]
Fine-tuning pre-trained large language models (LLMs) in a distributed manner poses significant challenges on resource-constrained edge networks.<n>We propose SflLLM, a novel framework that integrates split federated learning with parameter-efficient fine-tuning techniques.<n>By leveraging model splitting and low-rank adaptation (LoRA), SflLLM reduces the computational burden on edge devices.
arXiv Detail & Related papers (2025-04-20T16:16:54Z) - Privacy-Aware Joint DNN Model Deployment and Partitioning Optimization for Collaborative Edge Inference Services [14.408050197587654]
Edge inference (EI) has emerged as a promising paradigm to address the growing limitations of cloud-based Deep Neural Network (DNN) inference services.<n> deploying DNN models on resource-constrained edge devices introduces additional challenges, including limited/storage resources, dynamic service demands, and heightened privacy risks.<n>This paper presents a novel privacy-aware optimization framework that jointly addresses DNN model deployment, user-server association, and model partitioning.
arXiv Detail & Related papers (2025-02-22T05:27:24Z) - Neural Horizon Model Predictive Control -- Increasing Computational Efficiency with Neural Networks [0.0]
We propose a proposed machine-learning supported approach to model predictive control.
We propose approximating part of the problem horizon, while maintaining safety guarantees.
The proposed MPC scheme can be applied to a wide range of applications, including those requiring a rapid control response.
arXiv Detail & Related papers (2024-08-19T08:13:37Z) - Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs)
GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z) - Symmetric Tensor Networks for Generative Modeling and Constrained
Combinatorial Optimization [72.41480594026815]
Constrained optimization problems abound in industry, from portfolio optimization to logistics.
One of the major roadblocks in solving these problems is the presence of non-trivial hard constraints which limit the valid search space.
In this work, we encode arbitrary integer-valued equality constraints of the form Ax=b, directly into U(1) symmetric networks (TNs) and leverage their applicability as quantum-inspired generative models.
arXiv Detail & Related papers (2022-11-16T18:59:54Z) - Algorithm for Constrained Markov Decision Process with Linear
Convergence [55.41644538483948]
An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs.
A new dual approach is proposed with the integration of two ingredients: entropy regularized policy and Vaidya's dual.
The proposed approach is shown to converge (with linear rate) to the global optimum.
arXiv Detail & Related papers (2022-06-03T16:26:38Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - FISAR: Forward Invariant Safe Reinforcement Learning with a Deep Neural
Network-Based Optimize [44.65622657676026]
We take constraints as Lyapunov functions and impose new linear constraints on the policy parameters' updating dynamics.
Because the new guaranteed-feasible constraints are imposed on the updating dynamics instead of the original policy parameters, classic optimization algorithms are no longer applicable.
arXiv Detail & Related papers (2020-06-19T21:58:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.