Linear-PAL: A Lightweight Ranker for Mitigating Shortcut Learning in Personalized, High-Bias Tabular Ranking
- URL: http://arxiv.org/abs/2602.00013v1
- Date: Mon, 15 Dec 2025 12:06:04 GMT
- Title: Linear-PAL: A Lightweight Ranker for Mitigating Shortcut Learning in Personalized, High-Bias Tabular Ranking
- Authors: Vipul Dinesh Pawar
- Abstract summary: In e-commerce ranking, implicit user feedback is systematically confounded by Position Bias. We propose a lightweight framework that enforces de-biasing through structural constraints. We show that Linear-PAL achieves robust, personalized ranking in near real-time.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In e-commerce ranking, implicit user feedback is systematically confounded by Position Bias -- the strong propensity of users to interact with top-ranked items regardless of relevance. While Deep Learning architectures (e.g., Two-Tower Networks) are the standard solution for de-biasing, we demonstrate that in High-Bias Regimes, state-of-the-art Deep Ensembles suffer from Shortcut Learning: they minimize training loss by overfitting to the rank signal, leading to degraded ranking quality despite high prediction accuracy. We propose Linear Position-bias Aware Learning (Linear-PAL), a lightweight framework that enforces de-biasing through structural constraints: explicit feature conjunctions and aggressive regularization. We further introduce a Vectorized Integer Hashing technique for feature generation, replacing string-based operations with $O(N)$ vectorized arithmetic. Evaluating on a large-scale dataset (4.2M samples), Linear-PAL achieves Pareto Dominance: it outperforms Deep Ensembles in de-biased ranking quality (Relevance AUC: 0.7626 vs. 0.6736) while reducing training latency by 43x (40s vs 1762s). This computational efficiency enables high-frequency retraining, allowing the system to capture user-specific emerging market trends and deliver robust, personalized ranking in near real-time.
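The "Vectorized Integer Hashing" idea from the abstract — replacing string concatenation and string hashing for feature conjunctions with O(N) integer arithmetic — can be sketched as below. The function name, the mixing constant, and the hash-space size are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

HASH_DIM = 2**20     # size of the hashed feature space (assumed)
PRIME = 1_000_003    # mixing multiplier (assumed)

def hash_conjunction(a: np.ndarray, b: np.ndarray, dim: int = HASH_DIM) -> np.ndarray:
    """Hash the conjunction of two integer feature columns in one vectorized pass.

    Instead of building strings like f"{user_id}_{item_id}" and hashing them,
    combine the integer IDs arithmetically; uint64 overflow wraps around,
    acting as a cheap mixing step.
    """
    mixed = a.astype(np.uint64) * np.uint64(PRIME) + b.astype(np.uint64)
    return (mixed % np.uint64(dim)).astype(np.int64)

user_ids = np.array([101, 102, 101])
item_ids = np.array([7, 7, 9])
buckets = hash_conjunction(user_ids, item_ids)  # one bucket per (user, item) pair
```

The resulting bucket indices can feed directly into a sparse linear model, which is what makes the conjunction features compatible with the aggressively regularized linear ranker the paper describes.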
Related papers
- Dynamic Rank Reinforcement Learning for Adaptive Low-Rank Multi-Head Self Attention in Large Language Models [0.0]
We propose Dynamic Rank Reinforcement Learning (DR-RL), a novel framework that adaptively optimizes the low-rank factorization of Multi-Head Self-Attention (MHSA) in Large Language Models (LLMs). DR-RL maintains downstream accuracy statistically equivalent to full-rank attention while significantly reducing Floating Point Operations (FLOPs). This work bridges the gap between adaptive efficiency and theoretical rigor in MHSA, offering a principled, mathematically grounded alternative to rank reduction techniques in resource-constrained deep learning.
arXiv Detail & Related papers (2025-12-17T21:09:19Z)
- Feedback Alignment Meets Low-Rank Manifolds: A Structured Recipe for Local Learning [7.034739490820967]
Training deep neural networks (DNNs) with backpropagation (BP) achieves state-of-the-art accuracy but requires global error propagation and full parameterization. Direct Feedback Alignment (DFA) enables local, parallelizable updates with lower memory requirements. We propose a structured local learning framework that operates directly on low-rank manifolds.
arXiv Detail & Related papers (2025-10-29T15:03:46Z)
- LLMs for estimating positional bias in logged interaction data [44.839172857330674]
We propose a novel method for estimating position bias using Large Language Models (LLMs). Our experiments show that propensities estimated with our LLM-as-a-judge approach are stable across score buckets. An IPS-weighted reranker trained with these propensities matches the production model on standard NDCG@10 while improving weighted NDCG@10 by roughly 2%.
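Inverse-propensity-scored (IPS) weighting, which this entry's reranker relies on, can be sketched as follows. The propensity values are made-up placeholders; in the paper they would come from the LLM-as-a-judge estimator.

```python
import numpy as np

# P(examined | rank) for ranks 0..4 — illustrative values, not estimated ones.
propensity_by_rank = np.array([1.0, 0.6, 0.4, 0.25, 0.15])

def ips_weights(ranks: np.ndarray, clip: float = 10.0) -> np.ndarray:
    """IPS weight 1/p_rank per clicked item, clipped to cap the variance
    that tiny propensities would otherwise introduce."""
    w = 1.0 / propensity_by_rank[ranks]
    return np.minimum(w, clip)

clicked_ranks = np.array([0, 2, 4])   # positions where clicks were logged
w = ips_weights(clicked_ranks)
# A click at rank 4 (propensity 0.15) counts roughly 6.7x a click at rank 0,
# compensating for how rarely users examine low positions.
```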
arXiv Detail & Related papers (2025-09-03T20:26:06Z)
- RewardRank: Optimizing True Learning-to-Rank Utility [28.662272762911325]
We introduce RewardRank, a data-driven learning-to-rank framework for counterfactual utility. Our results show that learning-to-rank can be reformulated as direct optimization of counterfactual utility.
arXiv Detail & Related papers (2025-08-19T18:08:35Z)
- Optimizing Preference Alignment with Differentiable NDCG Ranking [9.594183083553245]
Recent studies have uncovered a substantial discrepancy between the theoretical aspirations of preference learning and its real-world results.
This paper introduces Direct Ranking Preference Optimization (DRPO), a novel method that views human preference alignment as a Learning-to-Rank task.
arXiv Detail & Related papers (2024-10-17T08:54:57Z)
- Adaptive Shortcut Debiasing for Online Continual Learning [37.18171171654679]
We propose a novel framework that suppresses shortcut bias in online continual learning (OCL). Based on the observed high-attention property of shortcut bias, highly activated features are considered candidates for debiasing.
Experiments on five benchmark datasets demonstrate that, when combined with various OCL algorithms, DropTop increases the average accuracy by up to 10.4% and decreases the forgetting by up to 63.2%.
arXiv Detail & Related papers (2023-12-14T06:27:31Z)
- Meta-Learning Adversarial Bandit Algorithms [55.72892209124227]
We study online meta-learning with bandit feedback.
We learn to tune a generalization of online mirror descent (OMD) with self-concordant barrier regularizers.
arXiv Detail & Related papers (2023-07-05T13:52:10Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements over the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z)
- Accelerated Convergence for Counterfactual Learning to Rank [65.63997193915257]
We show that the convergence rate of SGD approaches with IPS-weighted gradients suffers from the large variance introduced by the IPS weights.
We propose a novel learning algorithm, called CounterSample, that has provably better convergence than standard IPS-weighted gradient descent methods.
We prove that CounterSample converges faster and complement our theoretical findings with empirical results.
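The variance problem this entry targets can be demonstrated with a toy simulation: IPS weights keep the gradient estimator unbiased, but their variance blows up as the smallest propensity shrinks. The numbers below are synthetic and do not reproduce any experiment from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-position examination probabilities — illustrative values only.
propensities = np.array([0.9, 0.5, 0.1, 0.02])

# Simulate logged clicks by sampling positions proportionally to propensity,
# then weight each sampled example by 1/p as an IPS estimator would.
n = 100_000
pos = rng.choice(4, size=n, p=propensities / propensities.sum())
weights = 1.0 / propensities[pos]

# weights.mean() stays moderate, but weights.var() is dominated by the rare
# 1/0.02 = 50x weights — exactly the variance that slows IPS-weighted SGD.
```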
arXiv Detail & Related papers (2020-05-21T12:53:36Z)
- Learning to Hash with Graph Neural Networks for Recommender Systems [103.82479899868191]
Graph representation learning has attracted much attention in supporting high quality candidate search at scale.
Despite its effectiveness in learning embedding vectors for objects in the user-item interaction network, the computational costs to infer users' preferences in continuous embedding space are tremendous.
We propose a simple yet effective discrete representation learning framework to jointly learn continuous and discrete codes.
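A minimal sketch of the kind of discrete representation such frameworks target: binarizing continuous embeddings into codes so that cheap Hamming distances replace dot products at candidate-search time. This is generic sign-based hashing for illustration, not the paper's joint learning method.

```python
import numpy as np

def to_codes(emb: np.ndarray) -> np.ndarray:
    """Binarize an embedding: 1 where the coordinate is positive, else 0."""
    return (emb > 0).astype(np.uint8)

def hamming(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Hamming distance between one code and a matrix of codes."""
    return np.count_nonzero(a != b, axis=-1)

items = np.array([[ 0.3, -1.2,  0.8],
                  [ 0.1,  0.4, -0.5],
                  [-0.9, -0.2,  0.7]])
user  = np.array([ 0.2, -0.8,  0.6])

# The candidate with the smallest Hamming distance is the nearest item.
dists = hamming(to_codes(user), to_codes(items))
```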
arXiv Detail & Related papers (2020-03-04T06:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.