Are We Really Achieving Better Beyond-Accuracy Performance in Next Basket Recommendation?
- URL: http://arxiv.org/abs/2405.01143v1
- Date: Thu, 02 May 2024 09:59:35 GMT
- Title: Are We Really Achieving Better Beyond-Accuracy Performance in Next Basket Recommendation?
- Authors: Ming Li, Yuanna Liu, Sami Jullien, Mozhdeh Ariannezhad, Mohammad Aliannejadi, Andrew Yates, Maarten de Rijke
- Abstract summary: Next basket recommendation (NBR) is a special type of sequential recommendation that is increasingly receiving attention.
Recent studies into NBR have found a substantial performance difference between recommending repeat items and explore items.
We propose a plug-and-play two-step repetition-exploration framework that treats repeat items and explore items separately.
- Score: 57.91114305844153
- License:
- Abstract: Next basket recommendation (NBR) is a special type of sequential recommendation that is increasingly receiving attention. So far, most NBR studies have focused on optimizing the accuracy of the recommendation, whereas optimizing for beyond-accuracy metrics, e.g., item fairness and diversity, remains largely unexplored. Recent studies into NBR have found a substantial performance difference between recommending repeat items and explore items. Repeat items contribute most of the users' perceived accuracy compared with explore items. Informed by these findings, we identify a potential "short-cut" to optimize for beyond-accuracy metrics while maintaining high accuracy. To leverage and verify the existence of such short-cuts, we propose a plug-and-play two-step repetition-exploration (TREx) framework that treats repeat items and explore items separately, where we design a simple yet highly effective repetition module to ensure high accuracy, while two exploration modules target optimizing only beyond-accuracy metrics. Experiments are performed on two widely-used datasets w.r.t. a range of beyond-accuracy metrics, viz. five fairness metrics and three diversity metrics. Our experimental results verify the effectiveness of TREx. Prima facie, this appears to be good news: we can achieve high accuracy and improved beyond-accuracy metrics at the same time. However, we argue that the real-world value of our algorithmic solution, TREx, is likely to be limited, and we reflect on the reasonableness of the evaluation setup. We end up challenging existing evaluation paradigms, particularly in the context of beyond-accuracy metrics, and provide insights for researchers to navigate potential pitfalls and determine reasonable metrics to consider when optimizing for accuracy and beyond-accuracy metrics.
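For concreteness, the two-step idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the frequency-based repetition scorer, the popularity-based item-fairness proxy used for exploration, and the `rep_threshold` parameter are all assumptions made for the sake of the example.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Minimal sketch of a plug-and-play two-step repetition-exploration pipeline
# in the spirit of TREx. The frequency-based repetition scorer and the
# popularity-based fairness proxy are illustrative assumptions, not the
# paper's actual modules.

def repetition_scores(history: Sequence[Sequence[str]]) -> Dict[str, float]:
    """Score repeat candidates by how often the user bought them before."""
    counts = Counter(item for basket in history for item in basket)
    total = sum(counts.values()) or 1
    return {item: c / total for item, c in counts.items()}

def explore_ranking(catalog: Sequence[str],
                    history: Sequence[Sequence[str]],
                    popularity: Dict[str, int]) -> List[str]:
    """Rank never-bought items, least popular first, as a crude
    item-fairness proxy (gives long-tail items more exposure)."""
    seen = {item for basket in history for item in basket}
    return sorted((i for i in catalog if i not in seen),
                  key=lambda i: popularity.get(i, 0))

def recommend_basket(history: Sequence[Sequence[str]],
                     catalog: Sequence[str],
                     popularity: Dict[str, int],
                     k: int = 10,
                     rep_threshold: float = 0.05) -> List[str]:
    # Step 1 (accuracy): keep repeat items whose score clears the threshold.
    rep = repetition_scores(history)
    basket = [i for i, s in sorted(rep.items(), key=lambda x: -x[1])
              if s >= rep_threshold][:k]
    # Step 2 (beyond-accuracy): fill the remaining slots with explore items
    # chosen only to improve the fairness objective.
    for item in explore_ranking(catalog, history, popularity):
        if len(basket) == k:
            break
        basket.append(item)
    return basket

# Example: a user with two past baskets and a five-item catalog.
history = [["milk", "bread"], ["milk", "eggs"]]
catalog = ["milk", "bread", "eggs", "tofu", "soap"]
popularity = {"milk": 900, "bread": 700, "eggs": 650, "tofu": 40, "soap": 120}
print(recommend_basket(history, catalog, popularity, k=4))
# -> ['milk', 'bread', 'eggs', 'tofu']  (repeats first, then a long-tail item)
```

The division of labor is the point: the repetition step is tuned only for accuracy, while the exploration step can be swapped out to target whatever fairness or diversity metric is being evaluated; the threshold here merely stands in for that separation.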
Related papers
- Contextual Distillation Model for Diversified Recommendation [19.136439564988834]
Contextual Distillation Model (CDM) is an efficient recommendation model that addresses diversification.
We propose a contrastive context encoder that employs attention mechanisms to model both positive and negative contexts.
During inference, ranking is performed through a linear combination of the recommendation and student model scores.
arXiv Detail & Related papers (2024-06-13T11:55:40Z)
- Preference Learning Algorithms Do Not Learn Preference Rankings [62.335733662381884]
We study the conventional wisdom that preference learning trains models to assign higher likelihoods to more preferred outputs than less preferred outputs.
We find that most state-of-the-art preference-tuned models achieve a ranking accuracy of less than 60% on common preference datasets.
arXiv Detail & Related papers (2024-05-29T21:29:44Z)
- Aligning GPTRec with Beyond-Accuracy Goals with Reinforcement Learning [67.71952251641545]
GPTRec is an alternative to the Top-K model for item-by-item recommendations.
Our experiments on two datasets show that GPTRec's Next-K generation approach offers a better tradeoff between accuracy and secondary metrics than classic greedy re-ranking techniques.
arXiv Detail & Related papers (2024-03-07T19:47:48Z)
- Lower-Left Partial AUC: An Effective and Efficient Optimization Metric for Recommendation [52.45394284415614]
We propose a new optimization metric, Lower-Left Partial AUC (LLPAUC), which is computationally efficient like AUC but strongly correlates with Top-K ranking metrics.
LLPAUC considers only the partial area under the ROC curve in the Lower-Left corner to push the optimization focus on Top-K.
arXiv Detail & Related papers (2024-02-29T13:58:33Z)
- Recommendation Systems with Distribution-Free Reliability Guarantees [83.80644194980042]
We show how to return a set of items rigorously guaranteed to contain mostly good items.
Our procedure endows any ranking model with rigorous finite-sample control of the false discovery rate.
We evaluate our methods on the Yahoo! Learning to Rank and MSMarco datasets.
arXiv Detail & Related papers (2022-07-04T17:49:25Z)
- Determinantal Point Process Likelihoods for Sequential Recommendation [12.206748373325972]
We propose two new loss functions based on the Determinantal Point Process (DPP) likelihood, that can be adaptively applied to estimate the subsequent item or items.
Experimental results using the proposed loss functions on three real-world datasets show marked improvements over state-of-the-art sequential recommendation methods in both quality and diversity metrics.
arXiv Detail & Related papers (2022-04-25T11:20:10Z)
- Efficient Exploration in Binary and Preferential Bayesian Optimization [0.5076419064097732]
We show that it is important for BO algorithms to distinguish between different types of uncertainty.
We propose several new acquisition functions that outperform state-of-the-art BO functions.
arXiv Detail & Related papers (2021-10-18T14:44:34Z)
- Reenvisioning Collaborative Filtering vs Matrix Factorization [65.74881520196762]
Collaborative filtering models based on matrix factorization and learned similarities using Artificial Neural Networks (ANNs) have gained significant attention in recent years.
The adoption of ANNs within the recommendation ecosystem has recently been questioned, prompting several comparisons in terms of efficiency and effectiveness.
We show the potential these techniques may have for beyond-accuracy evaluation while analyzing their effect on complementary evaluation dimensions.
arXiv Detail & Related papers (2021-07-28T16:29:38Z)
- SQE: a Self Quality Evaluation Metric for Parameters Optimization in Multi-Object Tracking [25.723436561224297]
We present a novel self quality evaluation metric SQE for parameters optimization in the challenging yet critical multi-object tracking task.
In contrast to metrics that require ground truth, our metric reflects the internal characteristics of trajectory hypotheses and measures tracking performance without ground-truth annotations.
arXiv Detail & Related papers (2020-04-16T06:07:29Z)