Learning Fair Ranking Policies via Differentiable Optimization of
Ordered Weighted Averages
- URL: http://arxiv.org/abs/2402.05252v1
- Date: Wed, 7 Feb 2024 20:53:53 GMT
- Title: Learning Fair Ranking Policies via Differentiable Optimization of
Ordered Weighted Averages
- Authors: My H. Dinh, James Kotary, Ferdinando Fioretto
- Abstract summary: This paper shows how efficiently-solvable fair ranking models can be integrated into the training loop of Learning to Rank.
In particular, this paper is the first to show how to backpropagate through constrained optimizations of OWA objectives, enabling their use in integrated prediction and decision models.
- Score: 55.04219793298687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning to Rank (LTR) is one of the most widely used machine learning
applications. It is a key component in platforms with profound societal
impacts, including job search, healthcare information retrieval, and social
media content feeds. Conventional LTR models have been shown to produce biases
results, stimulating a discourse on how to address the disparities introduced
by ranking systems that solely prioritize user relevance. However, while
several models of fair learning to rank have been proposed, they suffer from
deficiencies either in accuracy or efficiency, thus limiting their
applicability to real-world ranking platforms. This paper shows how
efficiently-solvable fair ranking models, based on the optimization of Ordered
Weighted Average (OWA) functions, can be integrated into the training loop of
an LTR model to achieve favorable balances between fairness, user utility, and
runtime efficiency. In particular, this paper is the first to show how to
backpropagate through constrained optimizations of OWA objectives, enabling
their use in integrated prediction and decision models.
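To make the OWA mechanism concrete, below is a minimal sketch, assuming PyTorch and toy values; it is not the authors' implementation, and it illustrates only the OWA aggregation itself, not the constrained optimization the paper backpropagates through. An OWA applies a fixed weight vector to utilities after sorting them; with non-increasing weights, the worst-off groups receive the largest weights, which encourages fairness.

```python
# A minimal sketch, assuming PyTorch; not the authors' implementation.
import torch

def owa(utilities: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """Ordered Weighted Average: weights are applied to the utilities
    after sorting them in ascending order. Non-increasing weights put
    the most mass on the worst-off groups, encouraging fairness."""
    sorted_utilities, _ = torch.sort(utilities)  # worst group first
    return torch.dot(weights, sorted_utilities)

# Toy example: per-group exposures from some upstream ranking policy.
group_utilities = torch.tensor([0.2, 0.5, 0.9], requires_grad=True)
owa_weights = torch.tensor([0.5, 0.3, 0.2])  # non-increasing
loss = -owa(group_utilities, owa_weights)    # maximize OWA
loss.backward()
print(group_utilities.grad)  # gradients flow through the sort
```

Because torch.sort is differentiable with respect to the sorted values, gradients flow from the OWA value back to the group utilities, which is what makes end-to-end training of such objectives feasible in principle.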
Related papers
- Optimizing LLMs with Direct Preferences: A Data Efficiency Perspective [4.548047308860141]
This study investigates the impact of different types of preference data on model performance.
It aims to reduce models' dependency on extensive amounts of preference data, which is expensive to collect.
arXiv Detail & Related papers (2024-10-22T00:11:41Z)
- Preference Learning Algorithms Do Not Learn Preference Rankings [62.335733662381884]
We study the conventional wisdom that preference learning trains models to assign higher likelihoods to more preferred outputs than less preferred outputs.
We find that most state-of-the-art preference-tuned models achieve a ranking accuracy of less than 60% on common preference datasets.
arXiv Detail & Related papers (2024-05-29T21:29:44Z)
- Prior Constraints-based Reward Model Training for Aligning Large Language Models [58.33118716810208]
This paper proposes a Prior Constraints-based Reward Model (PCRM) training method to mitigate uncontrolled scaling of reward scores.
PCRM incorporates prior constraints, specifically the length ratio and cosine similarity between the outputs of each comparison pair, during reward model training to regulate optimization magnitude and control score margins.
Experimental results demonstrate that PCRM significantly improves alignment performance by effectively constraining reward score scaling.
arXiv Detail & Related papers (2024-04-01T07:49:11Z)
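To make the margin-control idea in the PCRM entry above concrete, here is a hypothetical sketch; the function name, signature, and the prior-to-margin mapping are assumptions, not the paper's method. It shows a pairwise reward loss in which the required score margin between the chosen and rejected outputs is set by a per-pair prior.

```python
# Hypothetical sketch of a prior-constrained pairwise reward loss.
# The chosen output must beat the rejected one by a prior-dependent
# margin (e.g., derived from a length ratio), which bounds how far
# reward scores can drift apart.
import torch
import torch.nn.functional as F

def prior_margin_loss(r_chosen, r_rejected, prior_margin):
    # Bradley-Terry pairwise loss with a per-pair margin term.
    return -F.logsigmoid(r_chosen - r_rejected - prior_margin).mean()

# Toy usage with assumed reward scores and margins.
r_c = torch.tensor([2.0, 1.5])
r_r = torch.tensor([0.5, 1.0])
margin = torch.tensor([0.3, 0.3])
print(prior_margin_loss(r_c, r_r, margin))
```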
- ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs [65.9625653425636]
Large language models (LLMs) exhibit harmful social biases.
This work introduces a novel approach utilizing ChatGPT to generate synthetic training data.
arXiv Detail & Related papers (2024-02-19T01:28:48Z)
- Optimizing Group-Fair Plackett-Luce Ranking Models for Relevance and Ex-Post Fairness [5.349671569838342]
In learning-to-rank, optimizing only for relevance can cause representational harm to certain categories of items.
In this paper, we propose a novel algorithm that maximizes expected relevance over those rankings that satisfy given representation constraints.
arXiv Detail & Related papers (2023-08-25T08:27:43Z)
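For readers unfamiliar with the Plackett-Luce model referenced in the entry above, here is a minimal sketch, assuming PyTorch; it is illustrative only and not the paper's algorithm. A Plackett-Luce ranking is sampled by drawing items one at a time, without replacement, with probability proportional to the exponential of their scores.

```python
# A minimal sketch of Plackett-Luce sampling; not the paper's algorithm.
import torch

def sample_plackett_luce(scores: torch.Tensor) -> list:
    remaining = list(range(len(scores)))
    ranking = []
    while remaining:
        # Renormalize over the items not yet placed.
        probs = torch.softmax(scores[remaining], dim=0)
        choice = torch.multinomial(probs, num_samples=1).item()
        ranking.append(remaining.pop(choice))
    return ranking

print(sample_plackett_luce(torch.tensor([2.0, 0.5, 1.0])))
```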
- Confidence Ranking for CTR Prediction [11.071444869776725]
We propose a novel framework, named Confidence Ranking, which formulates the optimization objective as a ranking function.
Our experiments show that the confidence ranking loss outperforms all baselines on CTR prediction tasks over public and industrial datasets.
This framework has been deployed in the advertisement system of JD.com to serve the main traffic in the fine-rank stage.
arXiv Detail & Related papers (2023-06-28T07:31:00Z)
- CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but further adaptation of CLIP on downstream tasks undesirably degrades OOD performance.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z)
- Learning-To-Ensemble by Contextual Rank Aggregation in E-Commerce [8.067201256886733]
We propose a new Learning-To-Ensemble framework, RA-EGO, which replaces the ensemble model with a contextual Rank Aggregator.
RA-EGO has been deployed in our online system and has significantly improved revenue.
arXiv Detail & Related papers (2021-07-19T03:24:06Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns about whether the learned scoring function can cause systematic disparities across different protected groups.
We propose a model post-processing framework for balancing ranking fairness and algorithm utility in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
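As an illustration of the bipartite ranking objective described in the last entry, here is a generic pairwise logistic surrogate for AUC, assumed for exposition; it is not the paper's post-hoc adjustment method. It penalizes every (positive, negative) pair whose scores are ordered incorrectly.

```python
# A generic pairwise bipartite-ranking loss (logistic surrogate for AUC);
# illustrative only, not the paper's post-processing framework.
import torch
import torch.nn.functional as F

def pairwise_bipartite_loss(pos_scores: torch.Tensor,
                            neg_scores: torch.Tensor) -> torch.Tensor:
    # All pairwise score differences, shape (num_pos, num_neg).
    diff = pos_scores.unsqueeze(1) - neg_scores.unsqueeze(0)
    return -F.logsigmoid(diff).mean()

print(pairwise_bipartite_loss(torch.tensor([1.2, 0.8]),
                              torch.tensor([0.1, 0.9])))
```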