Multi-Label Learning to Rank through Multi-Objective Optimization
- URL: http://arxiv.org/abs/2207.03060v2
- Date: Fri, 8 Jul 2022 16:30:43 GMT
- Title: Multi-Label Learning to Rank through Multi-Objective Optimization
- Authors: Debabrata Mahapatra, Chaosheng Dong, Yetian Chen, Deqiang Meng,
Michinari Momma
- Abstract summary: The Learning to Rank technique is ubiquitous in modern Information Retrieval systems.
To resolve ranking ambiguity, it is desirable to train a model using multiple relevance criteria.
We propose a general framework in which the information from labels can be combined in a variety of ways to characterize the trade-offs among the goals.
- Score: 9.099663022952496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Learning to Rank (LTR) technique is ubiquitous in modern Information
Retrieval systems, especially in Search Ranking applications. The query-item
relevance labels typically used to train the ranking model are often noisy
measurements of human behavior, e.g., product ratings for product search. Such
coarse measurements make the ground-truth ranking non-unique with respect to a
single relevance criterion. To resolve this ambiguity, it is desirable to train a
model using multiple relevance criteria, giving rise to Multi-Label LTR (MLLTR).
Moreover, MLLTR formulates multiple goals that may be conflicting yet important to
optimize simultaneously, e.g., in product search, a ranking model can be
trained on both product quality and purchase likelihood to increase revenue.
In this research, we leverage the Multi-Objective Optimization (MOO) aspect of
the MLLTR problem and employ recently developed MOO algorithms to solve it.
Specifically, we propose a general framework in which the information from labels
can be combined in a variety of ways to meaningfully characterize the trade-offs
among the goals. Our framework allows any gradient-based MOO algorithm to
be used for solving the MLLTR problem. We test the proposed framework on two
publicly available LTR datasets and one e-commerce dataset to demonstrate its
efficacy.
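To make the framework concrete, here is a minimal, hypothetical sketch of the core idea: compute one ranking-loss gradient per relevance label and combine the gradients with a gradient-based MOO rule. The linear scoring model, the pairwise logistic ranking loss, and the linear-scalarization weights below are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of MLLTR as multi-objective optimization (MOO).
# Assumptions (not from the paper): a linear scoring model, a pairwise
# logistic ranking loss per label, and linear scalarization as the MOO rule.
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_feats, n_labels = 50, 10, 2      # e.g., quality and purchase likelihood
X = rng.normal(size=(n_docs, n_feats))     # query-item features
Y = rng.normal(size=(n_docs, n_labels))    # one relevance label per criterion
w = np.zeros(n_feats)                      # linear ranker: scores = X @ w

def pairwise_grad(X, y, w):
    """Gradient of the pairwise logistic ranking loss for one label."""
    s = X @ w
    g = np.zeros_like(w)
    n_pairs = 0
    for i in range(len(y)):
        for j in range(len(y)):
            if y[i] > y[j]:                # document i should rank above j
                g -= (X[i] - X[j]) / (1.0 + np.exp(s[i] - s[j]))
                n_pairs += 1
    return g / max(n_pairs, 1)

lambdas = np.array([0.5, 0.5])  # assumed trade-off weights between the goals
for step in range(100):
    # One gradient per label/goal: this matrix is where any MOO rule plugs in.
    grads = np.stack([pairwise_grad(X, Y[:, k], w) for k in range(n_labels)])
    g = lambdas @ grads         # linear scalarization, one possible MOO rule
    w -= 0.1 * g
```

The `grads` matrix is the interface point: replacing the scalarization line with, say, a min-norm combination of the per-label gradients gives an MGDA-style update, which is the sense in which any gradient-based MOO algorithm can be dropped into the framework.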
Related papers
- Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning [71.2981957820888]
We propose a novel Star-Agents framework, which automates the enhancement of data quality across datasets.
The framework initially generates diverse instruction data with multiple LLM agents through a bespoke sampling method.
The generated data undergo a rigorous evaluation using a dual-model method that assesses both difficulty and quality.
arXiv Detail & Related papers (2024-11-21T02:30:53Z)
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
- CROSS-JEM: Accurate and Efficient Cross-encoders for Short-text Ranking Tasks [12.045202648316678]
Transformer-based ranking models are the state-of-the-art approaches for such tasks.
We propose Cross-encoders with Joint Efficient Modeling (CROSS-JEM)
CROSS-JEM enables transformer-based models to jointly score multiple items for a query.
It achieves state-of-the-art accuracy with more than 4x lower ranking latency than standard cross-encoders.
arXiv Detail & Related papers (2024-09-15T17:05:35Z)
- Large Language Model-guided Document Selection [23.673690115025913]
Large Language Model (LLM) pre-training exhausts an ever-growing compute budget.
Recent research has demonstrated that careful document selection enables comparable model quality with only a fraction of the FLOPs.
We explore a promising direction for scalable general-domain document selection.
arXiv Detail & Related papers (2024-06-07T04:52:46Z)
- Large Language Models for Relevance Judgment in Product Search [48.56992980315751]
High relevance of retrieved and re-ranked items to the search query is the cornerstone of successful product search.
We present an array of techniques for leveraging Large Language Models (LLMs) for automating the relevance judgment of query-item pairs (QIPs) at scale.
Our findings have immediate implications for the growing field of relevance judgment automation in product search.
arXiv Detail & Related papers (2024-06-01T00:52:41Z)
- Large Language Models are Zero-Shot Rankers for Recommender Systems [76.02500186203929]
This work aims to investigate the capacity of large language models (LLMs) to act as the ranking model for recommender systems.
We show that LLMs have promising zero-shot ranking abilities but struggle to perceive the order of historical interactions.
We demonstrate that these issues can be alleviated using specially designed prompting and bootstrapping strategies.
arXiv Detail & Related papers (2023-05-15T17:57:39Z)
- Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents [56.104476412839944]
Large Language Models (LLMs) have demonstrated remarkable zero-shot generalization across various language-related tasks.
This paper investigates generative LLMs for relevance ranking in Information Retrieval (IR).
To address concerns about data contamination of LLMs, we collect a new test set called NovelEval.
To improve efficiency in real-world applications, we delve into the potential for distilling the ranking capabilities of ChatGPT into small specialized models.
arXiv Detail & Related papers (2023-04-19T10:16:03Z)
- Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research area.
Existing MAML algorithms rely on the 'episode' idea by sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z)
- Sample-Rank: Weak Multi-Objective Recommendations Using Rejection Sampling [0.5156484100374059]
We introduce a method involving multi-goal sampling followed by ranking for user-relevance (Sample-Rank) to nudge recommendations towards multi-objective goals of the marketplace.
The proposed method's novelty is that it reduces the MO recommendation problem to sampling from a desired multi-goal distribution and then using the samples to build a production-friendly learning-to-rank model (see the first sketch after this list).
arXiv Detail & Related papers (2020-08-24T09:17:18Z)
- Analysis of Multivariate Scoring Functions for Automatic Unbiased Learning to Rank [14.827143632277274]
AutoULTR algorithms that jointly learn user bias models (i.e., propensity models) with unbiased rankers have received a lot of attention due to their superior performance and low deployment cost in practice.
Recent advances in context-aware learning-to-rank models have shown that multivariate scoring functions, which read multiple documents together and predict their ranking scores jointly, are more powerful than uni-variate ranking functions in ranking tasks with human-annotated relevance labels.
Our experiments with synthetic clicks on two large-scale benchmark datasets show that AutoULTR models with permutation-invariant multivariate scoring functions significantly outperform their univariate counterparts (see the second sketch after this list).
arXiv Detail & Related papers (2020-08-20T16:31:59Z)
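Two of the related entries above describe concrete mechanisms. First, for the Sample-Rank entry, here is a rough, hypothetical sketch of multi-goal rejection sampling followed by relevance ranking; the goal fields (`margin`, `freshness`) and the acceptance rule are illustrative assumptions, not the paper's method.

```python
# Hedged sketch of multi-goal rejection sampling followed by ranking,
# loosely following the Sample-Rank entry above; all fields and the
# acceptance rule are illustrative stand-ins, not the paper's design.
import random

random.seed(0)

# Each candidate carries a relevance estimate plus marketplace goal scores.
candidates = [
    {"id": i,
     "relevance": random.random(),
     "margin": random.random(),      # assumed marketplace objective in [0, 1]
     "freshness": random.random()}   # assumed marketplace objective in [0, 1]
    for i in range(1000)
]

def acceptance_prob(c, weights):
    """Desired multi-goal distribution, here a simple weighted product."""
    p = 1.0
    for goal, w in weights.items():
        p *= c[goal] ** w            # higher goal scores -> higher acceptance
    return p

weights = {"margin": 1.0, "freshness": 0.5}   # assumed trade-off knobs

# Rejection step: keep each candidate with probability given by the goals.
accepted = [c for c in candidates if random.random() < acceptance_prob(c, weights)]

# Ranking step: order the accepted pool by user relevance.
recommendations = sorted(accepted, key=lambda c: c["relevance"], reverse=True)[:10]
```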
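Second, for the unbiased learning-to-rank entry, a minimal sketch of the inverse-propensity-weighting idea that underlies such methods, assuming a position-based examination model with known propensities. Note that AutoULTR learns the propensity model jointly with the ranker, which this sketch does not attempt.

```python
# Minimal inverse-propensity-weighted (IPW) loss sketch for unbiased LTR
# under a position-based examination model; the propensities and the loss
# form are assumptions, not AutoULTR's joint learning procedure.
import numpy as np

# Assumed probability that a user examines each rank position 1..5.
propensity = np.array([1.0, 0.7, 0.5, 0.35, 0.25])

clicks = np.array([1, 0, 1, 0, 0])            # observed clicks at those positions
scores = np.array([2.1, 1.3, 0.8, 0.5, 0.2])  # model scores for the shown docs

def ipw_loss(scores, clicks, propensity):
    """Listwise softmax loss where each click is up-weighted by 1/propensity."""
    log_softmax = scores - np.log(np.exp(scores).sum())
    return -np.sum((clicks / propensity) * log_softmax)

print(ipw_loss(scores, clicks, propensity))
```

Up-weighting each click by the inverse of its examination probability is what makes the expected loss match the loss one would compute from true relevance, removing position bias in expectation.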