Addressing Class-Imbalance Problem in Personalized Ranking
- URL: http://arxiv.org/abs/2005.09272v2
- Date: Tue, 8 Sep 2020 08:47:20 GMT
- Title: Addressing Class-Imbalance Problem in Personalized Ranking
- Authors: Lu Yu, Shichao Pei, Chuxu Zhang, Shangsong Liang, Xiao Bai, Nitesh
Chawla, Xiangliang Zhang
- Abstract summary: We propose an efficient Vital Negative Sampler (VINS) to alleviate the class-imbalance issue for pairwise ranking models.
VINS is a biased sampler with a rejection probability that tends to accept a negative candidate with a larger degree weight than the given positive item.
- Score: 47.11372043636176
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pairwise ranking models have been widely used to address recommendation
problems. The basic idea is to learn the rank of users' preferred items through
separating items into \emph{positive} samples if user-item interactions exist,
and \emph{negative} samples otherwise. Due to the limited number of observable
interactions, pairwise ranking models face serious \emph{class-imbalance}
issues. Our theoretical analysis shows that current sampling-based methods
cause a vertex-level imbalance problem, which drives the norm of the learned
item embeddings toward infinity after a certain number of training iterations
and consequently produces vanishing gradients that degrade model inference
results. We thus propose an efficient \emph{\underline{Vi}tal
\underline{N}egative \underline{S}ampler} (VINS) to alleviate the
class-imbalance issue for pairwise ranking models, in particular for deep
learning models optimized by gradient methods. The core of VINS is a biased
sampler with a rejection probability that tends to accept a negative candidate
with a larger degree weight than the given positive item. Evaluation results on
several real datasets demonstrate that the proposed sampling method speeds up
the training procedure by 30\% to 50\% for ranking models ranging from shallow to
deep, while maintaining and even improving the quality of ranking results in
top-N item recommendation.
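The exact acceptance rule in VINS is defined in the paper; a minimal sketch of a degree-weighted rejection sampler in that spirit (the function name, the uniform proposal, and the specific acceptance ratio are illustrative assumptions, not the authors' implementation):

```python
import random

def vins_negative_sample(pos_item, degrees, candidates, max_tries=100):
    """Rejection-sample one negative item: propose uniformly from the
    candidates, then accept with a probability that grows with the
    candidate's degree relative to the positive item's degree (capped at 1),
    so higher-degree negatives are favored."""
    d_pos = max(degrees[pos_item], 1)
    choice = random.choice(candidates)
    for _ in range(max_tries):
        choice = random.choice(candidates)  # uniform proposal
        accept_prob = min(1.0, degrees[choice] / d_pos)
        if random.random() < accept_prob:
            return choice
    return choice  # budget exhausted: fall back to the last proposal
```

Biasing acceptance toward candidates at least as popular as the positive item is what counteracts the vertex-level imbalance described above: rarely-sampled vertices no longer accumulate unbounded embedding norms.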
Related papers
- Evaluating Performance and Bias of Negative Sampling in Large-Scale Sequential Recommendation Models [0.0]
Large-scale industrial recommendation models predict the most relevant items from catalogs containing millions or billions of options.
To train these models efficiently, a small set of irrelevant items (negative samples) is selected from the vast catalog for each relevant item.
Our study serves as a practical guide to the trade-offs in selecting a negative sampling method for large-scale sequential recommendation models.
arXiv Detail & Related papers (2024-10-08T00:23:17Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Learning Explicit User Interest Boundary for Recommendation [5.715918678913698]
We introduce an auxiliary score $b_u$ for each user to represent the User Interest Boundary.
We show that our approach can provide a personalized decision boundary and significantly improve the training efficiency without any special sampling strategy.
arXiv Detail & Related papers (2021-11-22T07:26:51Z)
- Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function.
arXiv Detail & Related papers (2021-05-27T08:38:29Z)
- Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
- Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, the two contrastive losses successfully constrain the clustering results of mini-batch samples at both the sample and class levels.
arXiv Detail & Related papers (2021-03-09T15:15:32Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
- Extreme Classification via Adversarial Softmax Approximation [23.943134990807756]
We propose a simple training method for drastically enhancing the gradient signal by drawing negative samples from an adversarial model.
Our contributions are three-fold: (i) an adversarial sampling mechanism that produces negative samples at a cost only logarithmic in $C$, thus still resulting in cheap gradient updates; (ii) a mathematical proof that this adversarial sampling minimizes the gradient variance while any bias due to non-uniform sampling can be removed; (iii) experimental results on large scale data sets that show a reduction of the training time by an order of magnitude relative to several competitive baselines.
arXiv Detail & Related papers (2020-02-15T01:42:52Z)
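Several of the papers above study how many negatives to sample for a contrastive objective. As a concrete reference point for the InfoNCE entry, a minimal InfoNCE loss over one positive score and K sampled negative scores (function name, scalar-score interface, and temperature value are illustrative assumptions) can be sketched as:

```python
import math

def info_nce_loss(pos_score, neg_scores, temperature=0.1):
    """InfoNCE loss for one positive and K negative similarity scores:
    -log(exp(pos/T) / sum over all K+1 scores of exp(s/T)).
    Lower loss means the positive outscores the sampled negatives."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - pos_score / temperature
```

Increasing K tightens the denominator's estimate of the full partition function, which is why the optimal negative-sampling ratio studied in that paper trades training effectiveness against sampling cost.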
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.