Adaptive Neural Ranking Framework: Toward Maximized Business Goal for
Cascade Ranking Systems
- URL: http://arxiv.org/abs/2310.10462v2
- Date: Wed, 21 Feb 2024 14:48:16 GMT
- Title: Adaptive Neural Ranking Framework: Toward Maximized Business Goal for
Cascade Ranking Systems
- Authors: Yunli Wang, Zhiqiang Wang, Jian Yang, Shiyang Wen, Dongying Kong, Han
Li, Kun Gai
- Abstract summary: Cascade ranking is widely used for large-scale top-k selection problems in online advertising and recommendation systems.
Previous works on learning-to-rank usually focus on letting the model learn the complete order or top-k order.
We name this method the Adaptive Neural Ranking Framework (abbreviated as ARF).
- Score: 33.46891569350896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cascade ranking is widely used for large-scale top-k selection problems in
online advertising and recommendation systems, and learning-to-rank is an
important way to optimize the models in cascade ranking. Previous works on
learning-to-rank usually focus on letting the model learn the complete order or
top-k order, and adopt the corresponding rank metrics (e.g., OPA and NDCG@k) as
optimization targets. However, these targets cannot adapt to the various cascade
ranking scenarios with varying data complexities and model capabilities, and
the existing metric-driven methods such as the Lambda framework can only
optimize a rough upper bound of a limited set of metrics, potentially resulting in
sub-optimality and performance misalignment. To address these issues, we propose a
novel perspective on optimizing cascade ranking systems by highlighting the
adaptability of optimization targets to data complexities and model
capabilities. Concretely, we employ multi-task learning to adaptively combine
the optimization of relaxed and full targets, which correspond to the metrics
Recall@m@k and OPA, respectively. We also introduce permutation matrices to
represent the rank metrics and employ differentiable sorting techniques to
relax the hard permutation matrix with a controllable approximation error bound. This
enables us to optimize both the relaxed and full targets directly and more
appropriately. We name this method the Adaptive Neural Ranking Framework
(abbreviated as ARF). Furthermore, we give a concrete instantiation of ARF: we
use NeuralSort to obtain the relaxed permutation matrix and draw on a
variant of the uncertainty weighting method from multi-task learning to optimize the
proposed losses jointly. Experiments on a total of 4 public and industrial
benchmarks show the effectiveness and generalization of our method, and an online
experiment shows that our method has significant application value.
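As a rough illustration of the ingredients the abstract describes, the sketch below (not the authors' code) builds a NeuralSort relaxed permutation matrix, derives a soft Recall@m@k surrogate from it, and combines the relaxed and full targets with learned uncertainty weights. The function names, the listwise negative log-likelihood used as a stand-in for the full-order (OPA) target, and the exact loss forms are assumptions for illustration, not the paper's definitions.

```python
import torch
import torch.nn.functional as F


def neural_sort(scores: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Relaxed permutation matrix from NeuralSort (Grover et al., 2019).

    scores: (batch, n) model scores; higher means ranked earlier.
    Returns P_hat of shape (batch, n, n); row i is a softmax approximation of
    the one-hot row selecting the item with the i-th largest score, and P_hat
    approaches a hard permutation matrix as tau -> 0.
    """
    s = scores.unsqueeze(-1)                               # (batch, n, 1)
    n = s.size(1)
    A = (s - s.transpose(1, 2)).abs()                      # pairwise |s_j - s_k|
    B = A.sum(dim=-1, keepdim=True)                        # (batch, n, 1) row sums
    ranks = torch.arange(1, n + 1, dtype=s.dtype, device=s.device)
    C = s * (n + 1 - 2 * ranks)                            # (batch, n, n) via broadcast
    return F.softmax((C - B).transpose(1, 2) / tau, dim=-1)


def soft_recall_at_m_k(P_hat: torch.Tensor, labels: torch.Tensor,
                       m: int, k: int) -> torch.Tensor:
    """An assumed differentiable surrogate of Recall@m@k: softly counts how
    many of the ground-truth top-k items the model places in its top-m."""
    in_top_m = P_hat[:, :m, :].sum(dim=1)                  # (batch, n) soft membership
    topk_idx = labels.topk(k, dim=-1).indices              # hard ground-truth top-k
    return in_top_m.gather(1, topk_idx).sum(dim=-1) / k    # (batch,)


class UncertaintyWeightedLoss(torch.nn.Module):
    """Combines task losses with learned log-variance weights, in the spirit
    of the uncertainty weighting variant the abstract mentions."""

    def __init__(self, num_tasks: int = 2):
        super().__init__()
        self.log_vars = torch.nn.Parameter(torch.zeros(num_tasks))

    def forward(self, losses):
        return sum(torch.exp(-lv) * loss + lv
                   for lv, loss in zip(self.log_vars, losses))


# Toy usage: 8 lists of 50 candidates with binary relevance labels.
scores = torch.randn(8, 50, requires_grad=True)
labels = torch.randint(0, 2, (8, 50)).float()
P_hat = neural_sort(scores, tau=0.5)
relaxed_loss = 1.0 - soft_recall_at_m_k(P_hat, labels, m=10, k=5).mean()
# Listwise stand-in for the full-order target: negative log-likelihood of the
# ground-truth ordering under the relaxed permutation matrix.
true_perm = labels.argsort(dim=-1, descending=True)        # (8, 50)
full_loss = F.nll_loss(P_hat.reshape(-1, 50).clamp_min(1e-9).log(),
                       true_perm.reshape(-1))
combiner = UncertaintyWeightedLoss(num_tasks=2)
total_loss = combiner([relaxed_loss, full_loss])
total_loss.backward()
```

In the paper itself, the relaxed and full losses are the ones ARF defines for Recall@m@k and OPA; the sketch only shows how a relaxed permutation matrix can plug into differentiable listwise objectives and a jointly weighted multi-task loss.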
Related papers
- Ordinal Preference Optimization: Aligning Human Preferences via NDCG [28.745322441961438]
We develop an end-to-end preference optimization algorithm by approximating NDCG with a differentiable surrogate loss.
OPO outperforms existing pairwise and listwise approaches on evaluation sets and general benchmarks like AlpacaEval.
arXiv Detail & Related papers (2024-10-06T03:49:28Z) - An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to represent potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z) - Decoding-Time Language Model Alignment with Multiple Objectives [116.42095026960598]
Existing methods primarily focus on optimizing LMs for a single reward function, limiting their adaptability to varied objectives.
Here, we propose multi-objective decoding (MOD), a decoding-time algorithm that outputs the next token from a linear combination of predictions.
We show why existing approaches can be sub-optimal even in natural settings and obtain optimality guarantees for our method.
arXiv Detail & Related papers (2024-06-27T02:46:30Z) - Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO).
Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z) - Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Learning Regions of Interest for Bayesian Optimization with Adaptive
Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
arXiv Detail & Related papers (2023-07-25T09:45:47Z) - Agent-based Collaborative Random Search for Hyper-parameter Tuning and
Global Function Optimization [0.0]
This paper proposes an agent-based collaborative technique for finding near-optimal values for an arbitrary set of hyper-parameters in a machine learning model.
The behavior of the presented model, specifically against the changes in its design parameters, is investigated in both machine learning and global function optimization applications.
arXiv Detail & Related papers (2023-03-03T21:10:17Z) - Amortized Proximal Optimization [11.441395750267052]
Amortized Proximal Optimization (APO) is a framework for online meta-optimization of parameters that govern optimization.
We show how APO can be used to adapt a learning rate or a structured preconditioning matrix.
We empirically test APO for online adaptation of learning rates and structured preconditioning for regression, image reconstruction, image classification, and natural language translation tasks.
arXiv Detail & Related papers (2022-02-28T20:50:48Z) - Stochastic batch size for adaptive regularization in deep network
optimization [63.68104397173262]
We propose a first-order optimization algorithm incorporating adaptive regularization applicable to machine learning problems in deep learning framework.
We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
arXiv Detail & Related papers (2020-04-14T07:54:53Z)