Related papers: Learning Cascade Ranking as One Network

Learning Cascade Ranking as One Network

URL: http://arxiv.org/abs/2503.09492v1
Date: Wed, 12 Mar 2025 15:52:51 GMT
Title: Learning Cascade Ranking as One Network
Authors: Yunli Wang, Zhen Zhang, Zhiqiang Wang, Zixuan Yang, Yu Li, Jian Yang, Shiyang Wen, Peng Jiang, Kun Gai,
Abstract summary: Cascade Ranking is a prevalent architecture in large-scale top-k selection systems like recommendation and advertising platforms.<n>We propose LCRON, which introduces a novel surrogate loss function derived from the lower bound probability that ground truth items are selected by cascade ranking.<n> LCRON achieves significant improvement over existing methods on public benchmarks and industrial applications.
Score: 34.530252769521624
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Cascade Ranking is a prevalent architecture in large-scale top-k selection systems like recommendation and advertising platforms. Traditional training methods focus on single-stage optimization, neglecting interactions between stages. Recent advances such as RankFlow and FS-LTR have introduced interaction-aware training paradigms but still struggle to 1) align training objectives with the goal of the entire cascade ranking (i.e., end-to-end recall) and 2) learn effective collaboration patterns for different stages. To address these challenges, we propose LCRON, which introduces a novel surrogate loss function derived from the lower bound probability that ground truth items are selected by cascade ranking, ensuring alignment with the overall objective of the system. According to the properties of the derived bound, we further design an auxiliary loss for each stage to drive the reduction of this bound, leading to a more robust and effective top-k selection. LCRON enables end-to-end training of the entire cascade ranking system as a unified network. Experimental results demonstrate that LCRON achieves significant improvement over existing methods on public benchmarks and industrial applications, addressing key limitations in cascade ranking training and significantly enhancing system performance.

Related papers

Disentangling Locality and Entropy in Ranking Distillation [24.600506147325717]
We conduct a broad ablation of sampling and distillation processes in neural ranking.<n>We frame and theoretically derive the nature of model geometry affected by example selection.<n>We establish conditions in which data augmentation can effectively improve bias in a ranking model.
arXiv Detail & Related papers (2025-05-27T11:46:37Z)
From Pairwise to Ranking: Climbing the Ladder to Ideal Collaborative Filtering with Pseudo-Ranking [13.01752267289297]
An ideal collaborative filtering model should learn from users' full rankings over all items to make optimal top-K recommendations.<n>Most CF models rely on pairwise loss functions to approximate full rankings, resulting in an immense performance gap.<n>We propose a pseudo-ranking paradigm (PRP) that addresses the lack of ranking information by introducing pseudo-rankings supervised by an original noise injection mechanism.
arXiv Detail & Related papers (2024-12-24T05:01:16Z)
RankTower: A Synergistic Framework for Enhancing Two-Tower Pre-Ranking Model [0.0]
In large-scale ranking systems, cascading architectures have been widely adopted to achieve a balance between efficiency and effectiveness. It is crucial for the pre-ranking model to maintain a balance between efficiency and accuracy to adhere to online latency constraints. We propose a novel neural network architecture called RankTower, which is designed to efficiently capture user-item interactions.
arXiv Detail & Related papers (2024-07-17T08:07:37Z)
A Self-boosted Framework for Calibrated Ranking [7.4291851609176645]
Calibrated Ranking is a scale-calibrated ranking system that pursues accurate ranking quality and calibrated probabilistic predictions simultaneously. Previous methods need to aggregate the full candidate list within a single mini-batch to compute the ranking loss. We propose a Self-Boosted framework for Calibrated Ranking (SBCR)
arXiv Detail & Related papers (2024-06-12T09:00:49Z)
Full Stage Learning to Rank: A Unified Framework for Multi-Stage Systems [40.199257203898846]
We propose an improved ranking principle for multi-stage systems, namely the Generalized Probability Ranking Principle (GPRP) GPRP emphasizes both the selection bias in each stage of the system pipeline as well as the underlying interest of users. Our core idea is to first estimate the selection bias in the subsequent stages and then learn a ranking model that best complies with the downstream modules' selection bias.
arXiv Detail & Related papers (2024-05-08T06:35:04Z)
Rethinking Resource Management in Edge Learning: A Joint Pre-training and Fine-tuning Design Paradigm [87.47506806135746]
In some applications, edge learning is experiencing a shift in focusing from conventional learning from scratch to new two-stage learning. This paper considers the problem of joint communication and computation resource management in a two-stage edge learning system. It is shown that the proposed joint resource management over the pre-training and fine-tuning stages well balances the system performance trade-off.
arXiv Detail & Related papers (2024-04-01T00:21:11Z)
Efficient Stagewise Pretraining via Progressive Subnetworks [53.00045381931778]
The prevailing view suggests that stagewise dropping strategies, such as layer dropping, are ineffective when compared to stacking-based approaches. This paper challenges this notion by demonstrating that, with proper design, dropping strategies can be competitive, if not better, than stacking methods. We propose an instantiation of this framework - Random Part Training (RAPTR) - that selects and trains only a random subnetwork at each step, progressively increasing the size in stages.
arXiv Detail & Related papers (2024-02-08T18:49:09Z)
Learning Fair Ranking Policies via Differentiable Optimization of Ordered Weighted Averages [55.04219793298687]
This paper shows how efficiently-solvable fair ranking models can be integrated into the training loop of Learning to Rank. In particular, this paper is the first to show how to backpropagate through constrained optimizations of OWA objectives, enabling their use in integrated prediction and decision models.
arXiv Detail & Related papers (2024-02-07T20:53:53Z)
PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation [65.268245109828]
We introduce PRILoRA, which linearly allocates a different rank for each layer, in an increasing manner, and performs pruning throughout the training process. We validate the effectiveness of PRILoRA through extensive experiments on eight GLUE benchmarks, setting a new state of the art.
arXiv Detail & Related papers (2024-01-20T20:25:17Z)
Adaptive Neural Ranking Framework: Toward Maximized Business Goal for Cascade Ranking Systems [33.46891569350896]
Cascade ranking is widely used for large-scale top-k selection problems in online advertising and recommendation systems. Previous works on learning-to-rank usually focus on letting the model learn the complete order or top-k order. We name this method as Adaptive Neural Ranking Framework (abbreviated as ARF)
arXiv Detail & Related papers (2023-10-16T14:43:02Z)
Large-Scale Sequential Learning for Recommender and Engineering Systems [91.3755431537592]
In this thesis, we focus on the design of an automatic algorithms that provide personalized ranking by adapting to the current conditions. For the former, we propose novel algorithm called SAROS that take into account both kinds of feedback for learning over the sequence of interactions. The proposed idea of taking into account the neighbour lines shows statistically significant results in comparison with the initial approach for faults detection in power grid.
arXiv Detail & Related papers (2022-05-13T21:09:41Z)
Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search [60.965024145243596]
One-shot weight sharing methods have recently drawn great attention in neural architecture search due to high efficiency and competitive performance. To alleviate this problem, we present a simple yet effective architecture distillation method. We introduce the concept of prioritized path, which refers to the architecture candidates exhibiting superior performance during training. Since the prioritized paths are changed on the fly depending on their performance and complexity, the final obtained paths are the cream of the crop.
arXiv Detail & Related papers (2020-10-29T17:55:05Z)
Generalized Zero-Shot Learning Via Over-Complete Distribution [79.5140590952889]
We propose to generate an Over-Complete Distribution (OCD) using Conditional Variational Autoencoder (CVAE) of both seen and unseen classes. The effectiveness of the framework is evaluated using both Zero-Shot Learning and Generalized Zero-Shot Learning protocols.
arXiv Detail & Related papers (2020-04-01T19:05:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.